WorldWideScience

Sample records for survey items measuring

  1. Reduced-Item Food Audits Based on the Nutrition Environment Measures Surveys.

    Science.gov (United States)

    Partington, Susan N; Menzies, Tim J; Colburn, Trina A; Saelens, Brian E; Glanz, Karen

    2015-10-01

    The community food environment may contribute to obesity by influencing food choice. Store and restaurant audits are increasingly common methods for assessing food environments, but are time consuming and costly. A valid, reliable brief measurement tool is needed. The purpose of this study was to develop and validate reduced-item food environment audit tools for stores and restaurants. Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed in 820 stores and 1,795 restaurants in West Virginia, San Diego, and Seattle. Data mining techniques (correlation-based feature selection and linear regression) were used to identify survey items highly correlated with total survey scores and produce reduced-item audit tools that were subsequently validated against full NEMS surveys. Regression coefficients were used as weights that were applied to reduced-item tool items to generate comparable scores to full NEMS surveys. Data were collected and analyzed in 2008-2013. The reduced-item tools included eight items for grocery, ten for convenience, seven for variety, and five for other stores; and 16 items for sit-down, 14 for fast casual, 19 for fast food, and 13 for specialty restaurants, amounting to 10% of the full NEMS-S and 25% of the full NEMS-R. There were no significant differences in median scores for varying types of retail food outlets when compared to the full survey scores. Median in-store audit time was reduced 25%-50%. Reduced-item audit tools can reduce the burden and complexity of large-scale or repeated assessments of the retail food environment without compromising measurement quality. Copyright © 2015 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
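
    The scoring step described in this record (regression coefficients applied as item weights) can be sketched as follows. This is only an illustration: the function name, the three-item tool, the weights, and the intercept are all hypothetical and are not taken from the NEMS study.

```python
def reduced_item_score(responses, weights, intercept=0.0):
    """Weighted sum of reduced-item responses plus a regression intercept."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, responses))


# Hypothetical three-item tool; the weights stand in for fitted coefficients.
weights = [2.0, 1.5, 0.5]
responses = [1, 0, 3]
score = reduced_item_score(responses, weights, intercept=4.0)
```

    In practice the weights would come from regressing full-survey totals on the selected items, so the reduced score lands on the same scale as the full score.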

  2. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    Science.gov (United States)

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.

  3. Validation of the MOS Social Support Survey 6-item (MOS-SSS-6) measure with two large population-based samples of Australian women.

    Science.gov (United States)

    Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J

    2014-12-01

    This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.
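
    Cronbach's alpha, the internal-consistency statistic used in this record, has a standard closed form: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal sketch, assuming complete responses stored as one row per respondent; the toy data are made up and are not from the Australian cohorts.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: three items that move in lockstep, so alpha should equal 1.
toy_rows = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(toy_rows)
```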

  4. Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

    Science.gov (United States)

    Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

    2012-09-01

    The complexity of health information often exceeds patients' skills to understand and use it. To develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were computed to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: P [...] Literacy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.

  5. Development of the Quantitative Reasoning Items on the National Survey of Student Engagement

    Directory of Open Access Journals (Sweden)

    Amber D. Dumford

    2015-01-01

    As society’s needs for quantitative skills become more prevalent, college graduates require quantitative skills regardless of their career choices. Therefore, it is important that institutions assess students’ engagement in quantitative activities during college. This study chronicles the process taken by the National Survey of Student Engagement (NSSE) to develop items that measure students’ participation in quantitative reasoning (QR) activities. On the whole, findings across the quantitative and qualitative analyses suggest good overall properties for the developed QR items. The items show great promise to explore and evaluate the frequency with which college students participate in QR-related activities. Each year, hundreds of institutions across the United States and Canada participate in NSSE, and, with the addition of these new items on the core survey, every participating institution will have information on this topic. Our hope is that these items will spur conversations on campuses about students’ use of quantitative reasoning activities.

  6. Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification.

    Directory of Open Access Journals (Sweden)

    Alexander J Millner

    Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single items. We found that 11.3% of participants who endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide.

  7. Examining Multiple Sources of Differential Item Functioning on the Clinician & Group CAHPS® Survey

    Science.gov (United States)

    Rodriguez, Hector P; Crane, Paul K

    2011-01-01

    Objective: To evaluate psychometric properties of a widely used patient experience survey. Data Sources: English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. Methods: We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician–patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. Principal Findings: The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. Conclusions: The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified. PMID:22092021

  8. Differential item functioning magnitude and impact measures from item response theory models.

    Science.gov (United States)

    Kleinman, Marjorie; Teresi, Jeanne A

    2016-01-01

    Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages are described that provide magnitude and impact measures, and new software is presented that computes all of the available statistics conveniently in one program with explanations of their relationships to one another.
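
    The idea of a magnitude measure based on group differences in expected item scores can be illustrated with a binary item, where the expected item score is simply the endorsement probability. The sketch below assumes a two-parameter logistic model and takes an unweighted mean absolute difference over a grid of ability values; the parameter values, function names, and the unweighted averaging are illustrative assumptions, not the specific indices reviewed in the paper.

```python
import math


def p_2pl(theta, a, b):
    """Endorsement probability under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def dif_magnitude(ref, focal, grid):
    """Mean absolute gap in expected item score between two groups.

    ref and focal are (a, b) parameter pairs for the same item as
    calibrated in the reference and focal groups.
    """
    diffs = [abs(p_2pl(t, *ref) - p_2pl(t, *focal)) for t in grid]
    return sum(diffs) / len(diffs)


grid = [x / 10 for x in range(-30, 31)]  # theta from -3.0 to 3.0
no_dif = dif_magnitude((1.2, 0.0), (1.2, 0.0), grid)
uniform_dif = dif_magnitude((1.2, 0.0), (1.2, 0.5), grid)
```

    Summing such expected item scores over all items in a scale, and comparing the sums between groups, gives the corresponding scale-level impact quantity.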

  9. Surveillance indicators for potential reduced exposure products (PREPs): developing survey items to measure awareness

    Directory of Open Access Journals (Sweden)

    McNeill Ann

    2009-10-01

    Background: Over the past decade, tobacco companies have introduced cigarettes and smokeless tobacco products (known as Potential Reduced Exposure Products, PREPs) with purportedly lower levels of some toxins than conventional cigarettes and smokeless products. It is essential that public health agencies monitor awareness, interest, use, and perceptions of these products so that their impact on population health can be detected at the earliest stages. Methods: This paper reviews and critiques existing strategies for measuring awareness of PREPs from 16 published and unpublished studies. From these measures, we developed new surveillance items and subjected them to two rounds of cognitive testing, a common and accepted method for evaluating questionnaire wording. Results: Our review suggests that high levels of awareness of PREPs reported in some studies are likely to be inaccurate. Two likely sources of inaccuracy in awareness measures were identified: 1) the tendency of respondents to misclassify "no additive" and "natural" cigarettes as PREPs and 2) the tendency of respondents to mistakenly report awareness as a result of confusion between PREPs brands and similarly named familiar products, for example, Eclipse chewing gum and Accord automobiles. Conclusion: After evaluating new measures with cognitive interviews, we conclude that as of winter 2006, awareness of reduced exposure products among U.S. smokers was likely to be between 1% and 8%, with the higher estimates for some products occurring in test markets. Recommended measurement strategies for future surveys are presented.

  10. Surveillance indicators for potential reduced exposure products (PREPs): developing survey items to measure awareness

    Science.gov (United States)

    Bogen, Karen; Biener, Lois; Garrett, Catherine A; Allen, Jane; Cummings, K Michael; Hartman, Anne; Marcus, Stephen; McNeill, Ann; O'Connor, Richard J; Parascandola, Mark; Pederson, Linda

    2009-01-01

    Background: Over the past decade, tobacco companies have introduced cigarettes and smokeless tobacco products (known as Potential Reduced Exposure Products, PREPs) with purportedly lower levels of some toxins than conventional cigarettes and smokeless products. It is essential that public health agencies monitor awareness, interest, use, and perceptions of these products so that their impact on population health can be detected at the earliest stages. Methods: This paper reviews and critiques existing strategies for measuring awareness of PREPs from 16 published and unpublished studies. From these measures, we developed new surveillance items and subjected them to two rounds of cognitive testing, a common and accepted method for evaluating questionnaire wording. Results: Our review suggests that high levels of awareness of PREPs reported in some studies are likely to be inaccurate. Two likely sources of inaccuracy in awareness measures were identified: 1) the tendency of respondents to misclassify "no additive" and "natural" cigarettes as PREPs and 2) the tendency of respondents to mistakenly report awareness as a result of confusion between PREPs brands and similarly named familiar products, for example, Eclipse chewing gum and Accord automobiles. Conclusion: After evaluating new measures with cognitive interviews, we conclude that as of winter 2006, awareness of reduced exposure products among U.S. smokers was likely to be between 1% and 8%, with the higher estimates for some products occurring in test markets. Recommended measurement strategies for future surveys are presented. PMID:19840394

  11. Poisson and negative binomial item count techniques for surveys with sensitive question.

    Science.gov (United States)

    Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin

    2017-04-01

    Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.
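
    A small simulation can make the Poisson variant concrete. The sketch assumes a two-group design in which control respondents report only a Poisson-distributed count and treatment respondents report that count plus 1 if they have the sensitive characteristic; the sensitive proportion is then estimated by the difference in group means, the usual item-count moment estimator. The design details, rate, sample sizes, and function names are assumptions for illustration, and the paper's own closed-form estimators are not reproduced here.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_poisson_ict(true_pi, lam, n, seed=0):
    """Estimate the sensitive proportion as treatment mean minus control mean."""
    rng = random.Random(seed)
    control = [poisson_draw(lam, rng) for _ in range(n)]
    treatment = [
        poisson_draw(lam, rng) + (1 if rng.random() < true_pi else 0)
        for _ in range(n)
    ]
    return sum(treatment) / n - sum(control) / n


estimate = simulate_poisson_ict(true_pi=0.2, lam=2.0, n=20000, seed=42)
```

    Because each respondent reports only a total, an individual answer does not reveal whether the sensitive indicator contributed to it, which is the privacy property the abstract emphasises.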

  12. Refinement of the Brazilian Household Food Insecurity Measurement Scale: Recommendation for a 14-item EBIA

    Directory of Open Access Journals (Sweden)

    Ana Maria Segall-Corrêa

    2014-04-01

    OBJECTIVE: To review and refine Brazilian Household Food Insecurity Measurement Scale structure. METHODS: The study analyzed the impact of removing the item "adult lost weight" and one of two possibly redundant items on Brazilian Household Food Insecurity Measurement Scale psychometric behavior using the one-parameter logistic (Rasch) model. Brazilian Household Food Insecurity Measurement Scale psychometric behavior was analyzed with respect to acceptable adjustment values ranging from 0.7 to 1.3, and to severity scores of the items with theoretically expected gradients. The socioeconomic and food security indicators came from the 2004 National Household Sample Survey, which obtained complete answers to Brazilian Household Food Insecurity Measurement Scale items from 112,665 households. RESULTS: Removing the items "adult reduced amount..." followed by "adult ate less..." did not change the infit of the remaining items, except for "adult lost weight", whose infit increased from 1.21 to 1.56. The internal consistency and item severity scores did not change when "adult ate less" and one of the two redundant items were removed. CONCLUSION: Brazilian Household Food Insecurity Measurement Scale reanalysis reduced the number of scale items from 16 to 14 without changing its internal validity. Its use as a nationwide household food security measure is strongly recommended.
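
    The one-parameter logistic (Rasch) model used in this record gives the probability that a household affirms an item as a function of the gap between the household's latent food-insecurity level and the item's severity. A minimal sketch; the severity values below are hypothetical and are not EBIA calibrations.

```python
import math


def rasch_probability(theta, severity):
    """P(affirm item) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - severity)))


# Hypothetical severities: a mild item and a severe item.
mild_item, severe_item = -1.0, 2.0
household_level = 0.5

p_mild = rasch_probability(household_level, mild_item)
p_severe = rasch_probability(household_level, severe_item)
```

    At any given level of food insecurity, a more severe item is affirmed less often than a milder one, which is the "theoretically expected gradient" of item severities referred to above.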

  13. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    JOSEPH P. EIMICKE

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  14. Using item response theory to measure extreme response style in marketing research

    NARCIS (Netherlands)

    de Jong, Martijn G.; Steenkamp, Jan-Benedict E.M.; Fox, Gerardus J.A.; Baumgartner, Hans

    2008-01-01

    Extreme response style (ERS) is an important threat to the validity of survey-based marketing research. In this article, the authors present a new item response theory–based model for measuring ERS. This model contributes to the ERS literature in two ways. First, the method improves on existing

  15. Reliability of the Core Items in the General Social Survey: Estimates from the Three-Wave Panels, 2006–2014

    Directory of Open Access Journals (Sweden)

    Michael Hout

    2016-11-01

    We used standard and multilevel models to assess the reliability of core items in the General Social Survey panel studies spanning 2006 to 2014. Most of the 293 core items scored well on the measure of reliability: 62 items (21 percent) had reliability measures greater than 0.85; another 71 (24 percent) had reliability measures between 0.70 and 0.85. Objective items, especially facts about demography and religion, were generally more reliable than subjective items. The economic recession of 2007–2009, the slow recovery afterward, and the election of Barack Obama in 2008 altered the social context in ways that may look like unreliability of items. For example, unemployment status, hours worked, and weeks worked have lower reliability than most work-related items, reflecting the consequences of the recession on the facts of people's lives. Items regarding racial and gender discrimination and racial stereotypes scored as particularly unreliable, accounting for most of the 15 items with reliability coefficients less than 0.40. Our results allow scholars to more easily take measurement reliability into consideration in their own research, while also highlighting the limitations of these approaches.

  16. Factors affecting study efficiency and item non-response in health surveys in developing countries: the Jamaica national healthy lifestyle survey

    Directory of Open Access Journals (Sweden)

    Bennett Franklyn

    2007-02-01

    Background: Health surveys provide important information on the burden and secular trends of risk factors and disease. Several factors including survey and item non-response can affect data quality. There are few reports on efficiency, validity and the impact of item non-response, from developing countries. This report examines factors associated with item non-response and study efficiency in a national health survey in a developing Caribbean island. Methods: A national sample of participants aged 15–74 years was selected in a multi-stage sampling design accounting for 4 health regions and 14 parishes using enumeration districts as primary sampling units. Means and proportions of the variables of interest were compared between various categories. Non-response was defined as failure to provide an analyzable response. Linear and logistic regression models accounting for sample design and post-stratification weighting were used to identify independent correlates of recruitment efficiency and item non-response. Results: We recruited 2012 15–74 year-olds (66.2% females) at a response rate of 87.6% with significant variation between regions (80.9% to 97.6%; p [...] Conclusion: Informative health surveys are possible in developing countries. While survey response rates may be satisfactory, item non-response was high in respect of income and sexual practice. In contrast to developed countries, non-response to questions on income is higher and has different correlates. These findings can inform future surveys.

  17. Item-level psychometrics of the ADL instrument of the Korean National Survey on persons with physical disabilities.

    Science.gov (United States)

    Hong, Ickpyo; Lee, Mi Jung; Kim, Moon Young; Park, Hae Yean

    2017-10-01

    The aim of this study is to investigate the psychometrics of the 12 items of an instrument assessing activities of daily living (ADL) using an item response theory model. A total of 648 adults with physical disabilities and having difficulties in ADLs were retrieved from the 2014 Korean National Survey on People with Disabilities. The psychometric testing included factor analysis, internal consistency, precision, and differential item functioning (DIF) across categories including sex, older age, marital status, and physical impairment area. The sample had a mean age of 69.7 years old (SD = 13.7). The majority of the sample had lower extremity impairments (62.0%) and had at least 2.1 chronic conditions. The instrument demonstrated unidimensional construct and good internal consistency (Cronbach's alpha = 0.95). The instrument precisely estimated person measures within a wide range of theta values (-2.22 logits [...] 5.0%). Our findings indicate that the dressing item would need to be modified to improve its psychometrics. Overall, the ADL instrument demonstrates good psychometrics, and thus, it may be used as a standardized instrument for measuring disability in rehabilitation contexts. However, the findings are limited to adults with physical disabilities. Future studies should replicate psychometric testing for survey respondents with other disorders and for children.

  18. Recommended core items to assess e-cigarette use in population-based surveys.

    Science.gov (United States)

    Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

    2018-05-01

    A consistent approach using standardised items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behaviour, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid without further item development. Reliable and valid items will strengthen the emerging science and inform knowledge synthesis for policy-making. Building on informal discussions at a series of international meetings of 65 experts from 15 countries, the authors provide recommendations for assessing e-cigarette use behaviour, relative perceived harm, device type, presence of nicotine, flavours and reasons for use. We recommend items assessing eight core constructs: e-cigarette ever use, frequency of use and former daily use; relative perceived harm; device type; primary flavour preference; presence of nicotine; and primary reason for use. These items should be standardised or minimally adapted for the policy context and target population. Researchers should be prepared to update items as e-cigarette device characteristics change. A minimum set of e-cigarette items is proposed to encourage consensus around items to allow for cross-survey and cross-jurisdictional comparisons of e-cigarette use behaviour. These proposed items are a starting point. We recognise room for continued improvement, and welcome input from e-cigarette users and scientific colleagues. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  19. Improving ability measurement in surveys by following the principles of IRT: The Wordsum vocabulary test in the General Social Survey.

    Science.gov (United States)

    Cor, M Ken; Haertel, Edward; Krosnick, Jon A; Malhotra, Neil

    2012-09-01

    Survey researchers often administer batteries of questions to measure respondents' abilities, but these batteries are not always designed in keeping with the principles of optimal test construction. This paper illustrates one instance in which following these principles can improve a measurement tool used widely in the social and behavioral sciences: the GSS's vocabulary test called "Wordsum". This ten-item test is composed of very difficult items and very easy items, and item response theory (IRT) suggests that the omission of moderately difficult items is likely to have handicapped Wordsum's effectiveness. Analyses of data from national samples of thousands of American adults show that after adding four moderately difficult items to create a 14-item battery, "Wordsumplus" (1) outperformed the original battery in terms of quality indicators suggested by classical test theory; (2) reduced the standard error of IRT ability estimates in the middle of the latent ability dimension; and (3) exhibited higher concurrent validity. These findings show how to improve Wordsum and suggest that analysts should use a score based on all 14 items instead of using the summary score provided by the GSS, which is based on only the original 10 items. These results also show more generally how surveys measuring abilities (and other constructs) can benefit from careful application of insights from the contemporary educational testing literature. Copyright © 2012 Elsevier Inc. All rights reserved.
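
    The claim that adding moderately difficult items lowers the standard error in the middle of the ability range follows from how test information accumulates. Under a Rasch model, each item contributes P(1 - P) to the information at a given ability, and the standard error is the reciprocal square root of the total. The sketch below uses made-up difficulties (five easy items, five hard items, then four added at moderate difficulty); the actual Wordsum item parameters and the IRT model the authors fitted are not given in the abstract, so everything numeric here is an assumption.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct answer under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standard_error(theta, difficulties):
    """SE of the ability estimate: 1 / sqrt(sum of item informations)."""
    info = sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)
    return 1.0 / math.sqrt(info)


# Hypothetical difficulties in logits, not estimated from GSS data.
original_items = [-2.0] * 5 + [2.0] * 5
extended_items = original_items + [0.0] * 4

se_original = standard_error(0.0, original_items)
se_extended = standard_error(0.0, extended_items)
```

    An item is most informative where its difficulty matches the respondent's ability, so a battery made up only of very easy and very hard items measures mid-range respondents comparatively poorly.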

  20. Psychometric evaluation of an inpatient consumer survey measuring satisfaction with psychiatric care.

    Science.gov (United States)

    Ortiz, Glorimar; Schacht, Lucille

    2012-01-01

    Measurement of consumers' satisfaction in psychiatric settings is important because it has been correlated with improved clinical outcomes and administrative measures of high-quality care. These consumer satisfaction measurements are actively used as performance measures required by the accreditation process and for quality improvement activities. Our objectives were (i) to re-evaluate, through exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), the structure of an instrument intended to measure consumers' satisfaction with care in psychiatric settings and (ii) to examine and publish the psychometric characteristics, validity and reliability, of the Inpatient Consumer Survey (ICS). To psychometrically test the structure of the ICS, 34 878 survey results, submitted by 90 psychiatric hospitals in 2008, were extracted from the Behavioral Healthcare Performance Measurement System (BHPMS). Basic descriptive item-response and correlation analyses were performed for total surveys. Two datasets were randomly created for analysis. A random sample of 8229 survey results was used for EFA. Another random sample of 8261 consumer survey results was used for CFA. This same sample was used to perform validity and reliability analyses. The item-response analysis showed that the mean range for a disagree/agree five-point scale was 3.10-3.94. Correlation analysis showed a strong relationship between items. Six domains (dignity, rights, environment, empowerment, participation, and outcome) with internal reliabilities between good to moderate (0.87-0.73) were shown to be related to overall care satisfaction. Overall reliability for the instrument was excellent (0.94). Results from CFA provided support for the domains structure of the ICS proposed through EFA. The overall findings from this study provide evidence that the ICS is a reliable measure of consumer satisfaction in psychiatric inpatient settings. The analysis has shown the ICS to provide valid and

  1. Development of an assessment tool to measure students' perceptions of respiratory care education programs: Item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Ghazi Alotaibi

    2013-01-01

    Objectives: Students who perceive their learning environment positively are more likely to develop effective learning strategies and adopt a deep learning approach. Currently, there is no validated instrument for measuring the educational environment of educational programs on respiratory care (RC). The aim of this study was to develop an instrument to measure students' perception of the RC educational environment. Materials and Methods: Based on the literature review and an assessment of content validity by multiple focus groups of RC educationalists, potential items of the instrument relevant to the RC educational environment construct were generated by the research group. The initial 71-item questionnaire was then field-tested on all students from the 3 RC programs in Saudi Arabia and was subjected to multi-trait scaling analysis. Cronbach's alpha was used to assess internal consistency reliabilities. Results: Two hundred and twelve students (100%) completed the survey. The initial instrument of 71 items was reduced to 65 across 5 scales. Convergent and discriminant validity assessment demonstrated that the majority of items correlated more highly with their intended scale than a competing one. Cronbach's alpha exceeded the standard criterion of >0.70 in all scales except one. There was no floor or ceiling effect for scale or overall score. Conclusions: This instrument is the first assessment tool developed to measure the RC educational environment. There was evidence of its good feasibility, validity, and reliability. This first validation of the instrument supports its use by RC students to evaluate educational environment.
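
    The internal-consistency check described in this abstract can be illustrated with a minimal sketch of Cronbach's alpha, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the ratings below are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows, each holding k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point ratings: 5 respondents x 4 items (illustration only).
ratings = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}, exceeds 0.70 criterion: {alpha > 0.70}")
```

    In a multi-scale instrument such as the one described, the function would be applied once per scale, passing only the items that belong to that scale.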

  2. 5 CFR 591.212 - How does OPM select survey items?

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 1 2010-01-01 2010-01-01 false How does OPM select survey items? 591.212 Section 591.212 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS ALLOWANCES AND DIFFERENTIALS Cost-of-Living Allowance and Post Differential-Nonforeign Areas Cost-Of-Living...

  3. Test-retest reliability of selected items of Health Behaviour in School-aged Children (HBSC) survey questionnaire in Beijing, China

    Directory of Open Access Journals (Sweden)

    Liu Yang

    2010-08-01

    Background Children's health and health behaviour are essential for their development and it is important to obtain abundant and accurate information to understand young people's health and health behaviour. The Health Behaviour in School-aged Children (HBSC) study is among the first large-scale international surveys on adolescent health through self-report questionnaires. So far, more than 40 countries in Europe and North America have been involved in the HBSC study. The purpose of this study is to assess the test-retest reliability of selected items in the Chinese version of the HBSC survey questionnaire in a sample of adolescents in Beijing, China. Methods A sample of 95 male and female students aged 11 or 15 years old participated in a test and retest with a three-week interval. Student identity numbers of respondents were used to permit matching of test-retest questionnaires. 23 items concerning physical activity, sedentary behaviour, sleep and substance use were evaluated by using the percentage of response shifts and the single measure Intraclass Correlation Coefficients (ICC) with 95% confidence interval (CI) for all respondents and stratified by gender and age. Items on substance use were only evaluated for school children aged 15 years old. Results The percentage of no response shift between test and retest varied from 32% for the item on computer use at weekends to 92% for the three items on smoking. Of all the 23 items evaluated, 6 items (26%) showed a moderate reliability, 12 items (52%) displayed a substantial reliability and 4 items (17%) indicated almost perfect reliability. No gender or age group difference in test-retest reliability was found except for a few items on sedentary behaviour. Conclusions The overall findings of this study suggest that most selected indicators in the HBSC survey questionnaire have satisfactory test-retest reliability for the students in Beijing. Further test-retest studies in a large

  4. Measurement Equivalence in ADL and IADL Difficulty Across International Surveys of Aging: Findings From the HRS, SHARE, and ELSA

    Science.gov (United States)

    Kasper, Judith D.; Brandt, Jason; Pezzin, Liliana E.

    2012-01-01

    Objective. To examine the measurement equivalence of items on disability across three international surveys of aging. Method. Data for persons aged 65 and older were drawn from the Health and Retirement Survey (HRS, n = 10,905), English Longitudinal Study of Aging (ELSA, n = 5,437), and Survey of Health, Ageing and Retirement in Europe (SHARE, n = 13,408). Differential item functioning (DIF) was assessed using item response theory (IRT) methods for activities of daily living (ADL) and instrumental activities of daily living (IADL) items. Results. HRS and SHARE exhibited measurement equivalence, but 6 of 11 items in ELSA demonstrated meaningful DIF. At the scale level, this item-level DIF affected scores reflecting greater disability. IRT methods also spread out score distributions and shifted scores higher (toward greater disability). Results for mean disability differences by demographic characteristics, using original and DIF-adjusted scores, were the same overall but differed for some subgroup comparisons involving ELSA. Discussion. Testing and adjusting for DIF is one means of minimizing measurement error in cross-national survey comparisons. IRT methods were used to evaluate potential measurement bias in disability comparisons across three international surveys of aging. The analysis also suggested DIF was mitigated for scales including both ADL and IADL and that summary indexes (counts of limitations) likely underestimate mean disability in these international populations. PMID:22156662

  5. Constructing the 32-item Fitness-to-Drive Screening Measure.

    Science.gov (United States)

    Medhizadah, Shabnam; Classen, Sherrilene; Johnson, Andrew M

    2018-04-01

    The Fitness-to-Drive Screening Measure © (FTDS) enables proxies to identify at-risk older drivers via 54 driving-related items, but may be too lengthy for widespread uptake. We reduced the number of items in the FTDS and validated the shorter measure, using 200 caregiver responses. Exploratory factor analysis and classical test theory techniques were used to determine the most interpretable factor model and the minimum number of items to be used for predicting fitness to drive. The extent to which the shorter FTDS predicted the results of the 54-item FTDS was evaluated through correlational analysis. A three-factor model best represented the empirical data. Classical test theory techniques led to the development of the 32-item FTDS. The 32-item FTDS was highly correlated (r = .99, p = .05) with the FTDS. The 32-item FTDS may provide raters with a faster and more efficient way to identify at-risk older drivers.

  6. Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures

    DEFF Research Database (Denmark)

    Jensen, M P; Widerström-Noga, E; Richards, J S

    2010-01-01

    To evaluate the psychometric properties of a subset of International Spinal Cord Injury Basic Pain Data Set (ISCIBPDS) items that could be used as self-report measures in surveys, longitudinal studies and clinical trials....

  7. Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations.

    Science.gov (United States)

    Bauer, Greta R; Braimoh, Jessica; Scheim, Ayden I; Dharma, Christoffer

    2017-01-01

    Given that an estimated 0.6% of the U.S. population is transgender (trans) and that large health disparities for this population have been documented, government and research organizations are increasingly expanding measures of sex/gender to be trans inclusive. Options suggested for trans community surveys, such as expansive check-all-that-apply gender identity lists and write-in options that offer maximum flexibility, are generally not appropriate for broad population surveys. These require limited questions and a small number of categories for analysis. Limited evaluation has been undertaken of trans-inclusive population survey measures for sex/gender, including those currently in use. Using an internet survey and follow-up of 311 participants, and cognitive interviews from a maximum-diversity sub-sample (n = 79), we conducted a mixed-methods evaluation of two existing measures: a two-step question developed in the United States and a multidimensional measure developed in Canada. We found very low levels of item missingness, and no indicators of confusion on the part of cisgender (non-trans) participants for both measures. However, a majority of interview participants indicated problems with each question item set. Agreement between the two measures in assessment of gender identity was very high (K = 0.9081), but gender identity was a poor proxy for other dimensions of sex or gender among trans participants. Issues to inform measure development or adaptation that emerged from analysis included dimensions of sex/gender measured, whether non-binary identities were trans, Indigenous and cultural identities, proxy reporting, temporality concerns, and the inability of a single item to provide a valid measure of sex/gender. Based on this evaluation, we recommend that population surveys meant for multi-purpose analysis consider a new Multidimensional Sex/Gender Measure for testing that includes three simple items (one asked only of a small sub-group) to assess gender
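
    The agreement statistic reported in this abstract (K = 0.9081) is read here as Cohen's kappa; that reading is an assumption, since the abstract does not name the statistic. A minimal sketch of kappa for two categorical measures follows, with kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is agreement expected by chance. The function name and example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of cases where the two measures agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the marginal category frequencies of each measure.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of the same four respondents by two measures.
measure_1 = ["man", "man", "woman", "woman"]
measure_2 = ["man", "woman", "woman", "woman"]
print(cohens_kappa(measure_1, measure_2))
```

    Because kappa corrects for chance agreement, it can be much lower than raw percent agreement when one category dominates, which is worth keeping in mind for population surveys where most respondents are cisgender.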

  8. Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations.

    Directory of Open Access Journals (Sweden)

    Greta R Bauer

    Given that an estimated 0.6% of the U.S. population is transgender (trans) and that large health disparities for this population have been documented, government and research organizations are increasingly expanding measures of sex/gender to be trans inclusive. Options suggested for trans community surveys, such as expansive check-all-that-apply gender identity lists and write-in options that offer maximum flexibility, are generally not appropriate for broad population surveys. These require limited questions and a small number of categories for analysis. Limited evaluation has been undertaken of trans-inclusive population survey measures for sex/gender, including those currently in use. Using an internet survey and follow-up of 311 participants, and cognitive interviews from a maximum-diversity sub-sample (n = 79), we conducted a mixed-methods evaluation of two existing measures: a two-step question developed in the United States and a multidimensional measure developed in Canada. We found very low levels of item missingness, and no indicators of confusion on the part of cisgender (non-trans) participants for both measures. However, a majority of interview participants indicated problems with each question item set. Agreement between the two measures in assessment of gender identity was very high (K = 0.9081), but gender identity was a poor proxy for other dimensions of sex or gender among trans participants. Issues to inform measure development or adaptation that emerged from analysis included dimensions of sex/gender measured, whether non-binary identities were trans, Indigenous and cultural identities, proxy reporting, temporality concerns, and the inability of a single item to provide a valid measure of sex/gender. Based on this evaluation, we recommend that population surveys meant for multi-purpose analysis consider a new Multidimensional Sex/Gender Measure for testing that includes three simple items (one asked only of a small sub-group) to

  9. Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations

    Science.gov (United States)

    Bauer, Greta R.; Braimoh, Jessica; Scheim, Ayden I.; Dharma, Christoffer

    2017-01-01

    Given that an estimated 0.6% of the U.S. population is transgender (trans) and that large health disparities for this population have been documented, government and research organizations are increasingly expanding measures of sex/gender to be trans inclusive. Options suggested for trans community surveys, such as expansive check-all-that-apply gender identity lists and write-in options that offer maximum flexibility, are generally not appropriate for broad population surveys. These require limited questions and a small number of categories for analysis. Limited evaluation has been undertaken of trans-inclusive population survey measures for sex/gender, including those currently in use. Using an internet survey and follow-up of 311 participants, and cognitive interviews from a maximum-diversity sub-sample (n = 79), we conducted a mixed-methods evaluation of two existing measures: a two-step question developed in the United States and a multidimensional measure developed in Canada. We found very low levels of item missingness, and no indicators of confusion on the part of cisgender (non-trans) participants for both measures. However, a majority of interview participants indicated problems with each question item set. Agreement between the two measures in assessment of gender identity was very high (K = 0.9081), but gender identity was a poor proxy for other dimensions of sex or gender among trans participants. Issues to inform measure development or adaptation that emerged from analysis included dimensions of sex/gender measured, whether non-binary identities were trans, Indigenous and cultural identities, proxy reporting, temporality concerns, and the inability of a single item to provide a valid measure of sex/gender. Based on this evaluation, we recommend that population surveys meant for multi-purpose analysis consider a new Multidimensional Sex/Gender Measure for testing that includes three simple items (one asked only of a small sub-group) to assess gender

  10. Measurement of Ethnic Background in Cross-national School Surveys

    DEFF Research Database (Denmark)

    Jensen, Helene Nordahl; Krølner, Rikke; Páll, Gabrilla

    2011-01-01

    Indicators such as country of birth and language spoken at home have been used as proxy measures for ethnic background, but the validity of these indicators in surveys among school children remains unclear. This study aimed at comparing item response and student-parent agreement on four questions...

  11. Recommended core items to assess e-cigarette use in population-based surveys

    OpenAIRE

    Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

    2017-01-01

    Background: A consistent approach using standardized items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behavior, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid wit...

  12. An item-response theory approach to safety climate measurement: The Liberty Mutual Safety Climate Short Scales.

    Science.gov (United States)

    Huang, Yueng-Hsiang; Lee, Jin; Chen, Zhuo; Perry, MacKenna; Cheung, Janelle H; Wang, Mo

    2017-06-01

    Zohar and Luria's (2005) safety climate (SC) scale, measuring organization- and group- level SC each with 16 items, is widely used in research and practice. To improve the utility of the SC scale, we shortened the original full-length SC scales. Item response theory (IRT) analysis was conducted using a sample of 29,179 frontline workers from various industries. Based on graded response models, we shortened the original scales in two ways: (1) selecting items with above-average discriminating ability (i.e. offering more than 6.25% of the original total scale information), resulting in 8-item organization-level and 11-item group-level SC scales; and (2) selecting the most informative items that together retain at least 30% of original scale information, resulting in 4-item organization-level and 4-item group-level SC scales. All four shortened scales had acceptable reliability (≥0.89) and high correlations (≥0.95) with the original scale scores. The shortened scales will be valuable for academic research and practical survey implementation in improving occupational safety. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
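
    The two shortening rules in this abstract can be sketched in a few lines. With 16 items, an average item contributes 1/16 = 6.25% of total scale information, which is where the "more than 6.25%" cut-off comes from. The sketch below assumes each item has already been summarised by a single information value (for example, information integrated over the trait range); the function names and the numbers are hypothetical and are not taken from the study.

```python
def above_average_items(info):
    """Rule 1: keep items whose share of total information exceeds 1/n."""
    total = sum(info.values())
    threshold = 1 / len(info)  # 1/16 = 0.0625 for a 16-item scale
    return [name for name, value in info.items() if value / total > threshold]


def smallest_set_retaining(info, fraction):
    """Rule 2: take the most informative items until `fraction` of the total is kept."""
    total = sum(info.values())
    chosen, kept = [], 0.0
    ranked = sorted(info.items(), key=lambda pair: pair[1], reverse=True)
    for name, value in ranked:
        if kept >= fraction * total:
            break
        chosen.append(name)
        kept += value
    return chosen


# Hypothetical information values for a 16-item scale (illustration only).
values = [9.1, 8.4, 7.9, 7.2, 6.8, 6.1, 5.7, 5.2, 3.9, 3.5, 3.1, 2.8, 2.4, 2.0, 1.7, 1.2]
item_info = {f"item{i + 1}": v for i, v in enumerate(values)}

print(above_average_items(item_info))
print(smallest_set_retaining(item_info, 0.30))
```

    Taking items in descending order of information gives the smallest set that reaches the target fraction, but it ignores content coverage; a real shortening exercise would typically also check that the retained items still span the construct.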

  13. Improving measurement of injection drug risk behavior using item response theory.

    Science.gov (United States)

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. To explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine for potential gender-based differential item functioning. Data is used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.
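
    For readers unfamiliar with the two-parameter model mentioned in this abstract, the sketch below shows the standard two-parameter logistic (2PL) item response function, P(theta) = 1 / (1 + exp(-a(theta - b))), and the corresponding item information a^2 * P * (1 - P). The parameter values are hypothetical; the abstract does not report item estimates.

```python
import math


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item.

    theta: respondent's latent trait level (here, injection risk)
    a: item discrimination, b: item difficulty (severity)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical item: discrimination 1.5, difficulty 0.8 (illustration only).
for theta in (-1.0, 0.8, 2.0):
    p = prob_endorse(theta, 1.5, 0.8)
    info = item_information(theta, 1.5, 0.8)
    print(f"theta={theta:+.1f}  P={p:.3f}  information={info:.3f}")
```

    An item is most informative near its own difficulty, so a scale built from items with varied difficulties can measure risk more sensitively across the trait range than a simple count of behaviours. Differential item functioning arises when a or b differ between groups, such as males and females, at the same theta.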

  14. Creating a Screening Measure of Health Literacy for the Health Information National Trends Survey.

    Science.gov (United States)

    Champlin, Sara; Mackert, Michael

    2016-03-01

    Create a screening measure of health literacy for use with the Health Information National Trends Survey (HINTS). Participants completed a paper-based survey. Items from the survey were used to construct a health literacy screening measure. A population-based survey conducted in geographic areas of high and low minority frequency and in Central Appalachia. Two thousand nine hundred four English-speaking participants were included in this study: 66% white, 93% completed high school, mean age = 52.53 years (SD = 16.24). A health literacy screening measure was created using four items included in the HINTS survey. Scores could range from 0 (no questions affirmative/correct) to 4 (all questions answered affirmatively/correctly). Multiple regression analysis was used to determine whether demographic variables known to predict health literacy were indeed associated with the constructed health literacy screening measure. The weighted average health literacy score was 2.63 (SD = 1.00). Those who were nonwhite (p = .0005), were older (p literacy screening measure scores. This study highlights the need to assess health literacy in national surveys, but also serves as evidence that screening measures can be created within existing datasets to give researchers the ability to consider the impact of health literacy. © The Author(s) 2016.

  15. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    Science.gov (United States)

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  16. Development and validation of a survey instrument to measure children's advertising literacy

    NARCIS (Netherlands)

    Rozendaal, E.; Opree, S.J.; Buijzen, M.A.

    2016-01-01

    The aim of this study was to develop and validate a survey measurement instrument for children's advertising literacy. Based on the multidimensional conceptualization of advertising literacy by Rozendaal, Lapierre, Van Reijmersdal, and Buijzen (2011), 39 items were created to measure two

  17. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  18. Assessing errors related to characteristics of the items measured

    International Nuclear Information System (INIS)

    Liggett, W.

    1980-01-01

    Errors that are related to some intrinsic property of the items measured are often encountered in nuclear material accounting. An example is the error in nondestructive assay measurements caused by uncorrected matrix effects. Nuclear material accounting requires for each materials type one measurement method for which bounds on these errors can be determined. If such a method is available, a second method might be used to reduce costs or to improve precision. If the measurement error for the first method is longer-tailed than Gaussian, then precision might be improved by measuring all items by both methods. 8 refs

  19. Using Localized Survey Items to Augment Standardized Benchmarking Measures: A LibQUAL+[TM] Study

    Science.gov (United States)

    Thompson, Bruce; Cook, Colleen; Kyrillidou, Martha

    2006-01-01

    The LibQUAL+[TM] protocol solicits open-ended comments from users with regard to library service quality, gathers data on 22 core items, and, at the option of individual libraries, also garners ratings on five items drawn from a pool of more than 100 choices selected by libraries. In this article, the relationship of scores on these locally…

  20. The 1992 Pacific Northwest Residential Energy Survey : Phase 1 : Book 4 : Item-by-item Crosstabulations.

    Energy Technology Data Exchange (ETDEWEB)

    United States. Bonneville Power Administration. End-Use Research Section; Applied Management & Planning Group (Firm)

    1993-06-01

    This book constitutes a portion of the primary documentation for the 1992 Pacific Northwest Residential Energy Survey, Phase I. The complete 33-volume set of primary documentation provides information needed by energy analysts and interpreters with respect to planning, execution, data collection, and data management of the PNWRES92-I process. Thirty of these volumes are devoted to different "views" of the data themselves, with each view having a special purpose or interest as its focus. Analyses and interpretations of these data will be the subjects of forthcoming publications. Conducted during the late summer and fall months of 1992, PNWRES92-I had the over-arching goal of satisfying basic requirements for a variety of information about the stock of residential units in Bonneville's service region. Surveys with a similar goal were conducted in 1979 and 1983. This volume discerns the information by state. "Selected crosstabulations" refers to a set of nine survey items of wide interest (Dwelling Type, Ownership Type, Year-of-Construction, Dwelling Size, Primary Space-Heating Fuel, Primary Water-Heating Fuel, Household Income for 1991, Utility Type, and Space-Heating Fuels: Systems and Equipment) that were crosstabulated among themselves.

  1. Development of an item bank for computerized adaptive test (CAT) measurement of pain

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Aaronson, Neil K; Chie, Wei-Chu

    2016-01-01

    PURPOSE: Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured...... were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements by 15-25% compared to using the QLQ-C30 pain scale....... CONCLUSIONS: We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ...

  2. Assessing the validity of single-item life satisfaction measures: results from three large samples.

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E

    2014-12-01

    The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System and a representative German sample (N = 1,312) recruited by the Germany Socio-Economic Panel were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042). Single-item life satisfaction measures performed very similarly to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.
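
    The gap between the zero-order and disattenuated correlations in this abstract can be read through the standard correction for attenuation, sketched below. The abstract does not state which correction or which reliability estimates were used, so the symbols and the back-calculation are illustrative assumptions: r_xy is the observed correlation between the single item and the SWLS, and r_xx and r_yy are their respective reliabilities.

```latex
% Spearman's correction for attenuation (assumed form; not stated in the abstract)
\[
  r_{\text{disattenuated}} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
\]
% Back-calculation from the reported lower bounds, r_xy = 0.62 and r_dis = 0.78:
\[
  \sqrt{r_{xx}\, r_{yy}} = \frac{0.62}{0.78} \approx 0.79
  \quad\Longrightarrow\quad
  r_{xx}\, r_{yy} \approx 0.63
\]
```

    Under this reading, the product of the two reliabilities would be about 0.63, which is what one would expect if most of the unreliability sits in the single-item measure rather than in the multi-item scale.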

  3. A signal detection-item response theory model for evaluating neuropsychological measures.

    Science.gov (United States)

    Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

    2018-02-05

    Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory-which permits the modeling of item difficulty and examinee ability-and from signal detection theory-which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the

  4. Assessing the Validity of Single-item Life Satisfaction Measures: Results from Three Large Samples

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E.

    2014-01-01

    Purpose The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS) - a more psychometrically established measure. Methods Two large samples from Washington (N=13,064) and Oregon (N=2,277) recruited by the Behavioral Risk Factor Surveillance System (BRFSS) and a representative German sample (N=1,312) recruited by the Germany Socio-Economic Panel (GSOEP) were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Results Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62 – 0.64; disattenuated r = 0.78 – 0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001 – 0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015 – 0.042). Conclusions Single-item life satisfaction measures performed very similarly to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use. PMID:24890827

  5. Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function.

    Science.gov (United States)

    Liegl, Gregor; Gandek, Barbara; Fischer, H Felix; Bjorner, Jakob B; Ware, John E; Rose, Matthias; Fries, James F; Nolte, Sandra

    2017-03-21

    Physical function (PF) is a core patient-reported outcome domain in clinical trials in rheumatic diseases. Frequently used PF measures have ceiling effects, leading to large sample size requirements and low sensitivity to change. In most of these instruments, the response category that indicates the highest PF level is the statement that one is able to perform a given physical activity without any limitations or difficulty. This study investigates whether using an item format with an extended response scale, allowing respondents to state that the performance of an activity is easy or very easy, increases the range of precise measurement of self-reported PF. Three five-item PF short forms were constructed from the Patient-Reported Outcomes Measurement Information System (PROMIS®) wave 1 data. All forms included the same physical activities but varied in item stem and response scale: format A ("Are you able to …"; "without any difficulty"/"unable to do"); format B ("Does your health now limit you …"; "not at all"/"cannot do"); format C ("How difficult is it for you to …"; "very easy"/"impossible"). Each short-form item was answered by 2217-2835 subjects. We evaluated unidimensionality and estimated a graded response model for the 15 short-form items and remaining 119 items of the PROMIS PF bank to compare item and test information for the short forms along the PF continuum. We then used simulated data for five groups with different PF levels to illustrate differences in scoring precision between the short forms using different item formats. Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side of the PF continuum of the sample, provided more item information, and was more useful in distinguishing known groups with above-average functioning. Using an item format with an extended

  6. [Instrument to measure adherence in hypertensive patients: contribution of Item Response Theory].

    Science.gov (United States)

    Rodrigues, Malvina Thaís Pacheco; Moreira, Thereza Maria Magalhaes; Vasconcelos, Alexandre Meira de; Andrade, Dalton Francisco de; Silva, Daniele Braz da; Barbetta, Pedro Alberto

    2013-06-01

    To analyze, by means of "Item Response Theory", an instrument to measure adherence to treatment for hypertension. Analytical study with 406 hypertensive patients with associated complications seen in primary care in Fortaleza, CE, Northeastern Brazil, 2011 using "Item Response Theory". The stages were: dimensionality test, calibrating the items, processing data and creating a scale, analyzed using the graded response model. A study of the dimensionality of the instrument was conducted by analyzing the polychoric correlation matrix and full-information factor analysis. Multilog software was used to calibrate items and estimate the scores. Items relating to drug therapy are the most directly related to adherence while those relating to drug-free therapy need to be reworked because they have less psychometric information and low discrimination. The independence of items, the small number of levels in the scale and low explained variance in the adjustment of the models show the main weaknesses of the instrument analyzed. The "Item Response Theory" proved to be a relevant analysis technique because it evaluated respondents for adherence to treatment for hypertension, the level of difficulty of the items and their ability to discriminate between individuals with different levels of adherence, which generates a greater amount of information. The "Item Response Theory" analysis of the items shows that the instrument is limited in measuring adherence to hypertension treatment and needs adjustment. The proper formulation of the items is important in order to accurately measure the desired latent trait.

  7. The measurement of tritium in Canadian food items

    International Nuclear Information System (INIS)

    Brown, R.M.

    1995-03-01

    Food items locally grown near Perth, Ontario and grocery store produce and locally grown items from the Pickering-Ajax area in the vicinity of the Pickering Nuclear Generating Station (PNGS) have been analyzed for free water tritium (HTO) and organically bound tritium (OBT). The technique of measuring ³He ingrowth in samples by mass spectrometry has been used because of its sensitivity and freedom from opportunity for contamination during processing and measurement. Concentrations observed at each site were of the order expected on the basis of known levels of tritium in the local atmosphere and precipitation. There was considerable variation between different materials and limited correlation between materials of a single type. (author). 10 refs., 8 tabs., 4 figs

  8. Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

    Science.gov (United States)

    Suh, Youngsuk

    2016-01-01

    This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

  9. Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

    Science.gov (United States)

    Shou, Yiyun; Sellbom, Martin; Xu, Jing

    2018-05-01

    There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and the U.S. samples. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  10. Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

    Science.gov (United States)

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

    2018-03-12

    Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.

  11. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    Science.gov (United States)

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.

  12. Cross-National Prevalence of Traditional Bullying, Traditional Victimization, Cyberbullying and Cyber-Victimization: Comparing Single-Item and Multiple-Item Approaches of Measurement

    Science.gov (United States)

    Yanagida, Takuya; Gradinger, Petra; Strohmeier, Dagmar; Solomontos-Kountouri, Olga; Trip, Simona; Bora, Carmen

    2016-01-01

    Many large-scale cross-national studies rely on a single-item measurement when comparing prevalence rates of traditional bullying, traditional victimization, cyberbullying, and cyber-victimization between countries. However, the reliability and validity of single-item measurement approaches are highly problematic and might be biased. Data from…

  13. Development of six PROMIS pediatrics proxy-report item banks.

    Science.gov (United States)

    Irwin, Debra E; Gross, Heather E; Stucky, Brian D; Thissen, David; DeWitt, Esi Morgan; Lai, Jin Shei; Amtmann, Dagmar; Khastou, Leyla; Varni, James W; DeWalt, Darren A

    2012-02-22

    Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily asthma, which was diagnosed or treated within 6

  14. Measurement equivalence and differential item functioning in family psychology.

    Science.gov (United States)

    Bingenheimer, Jeffrey B; Raudenbush, Stephen W; Leventhal, Tama; Brooks-Gunn, Jeanne

    2005-09-01

    Several hypotheses in family psychology involve comparisons of sociocultural groups. Yet the potential for cross-cultural inequivalence in widely used psychological measurement instruments threatens the validity of inferences about group differences. Methods for dealing with these issues have been developed via the framework of item response theory. These methods deal with an important type of measurement inequivalence, called differential item functioning (DIF). The authors introduce DIF analytic methods, linking them to a well-established framework for conceptualizing cross-cultural measurement equivalence in psychology (C.H. Hui and H.C. Triandis, 1985). They illustrate the use of DIF methods using data from the Project on Human Development in Chicago Neighborhoods (PHDCN). Focusing on the Caregiver Warmth and Environmental Organization scales from the PHDCN's adaptation of the Home Observation for Measurement of the Environment Inventory, the authors obtain results that exemplify the range of outcomes that may result when these methods are applied to psychological measurement instruments. (c) 2005 APA, all rights reserved

  15. Individual Social Capital and Its Measurement in Social Surveys

    Directory of Open Access Journals (Sweden)

    Keming Yang

    2007-01-01

    With its popularity has come an unresolved issue about social capital: is it an individual or a collective property, or both? Many researchers take it for granted that social capital is collective, but most social surveys implicitly measure social capital at the individual level. After reviewing the definitions by Bourdieu, Coleman, and Putnam, I have come to agree with Portes that social capital can be an individual asset and should first be analyzed as such; if social capital is to be analyzed as a collective property, then the analysis should explicitly draw on a clear definition of individual social capital. I thus define individual social capital as the features of social groups or networks that each individual member can access and use for obtaining further benefits. Four types of features are identified (basic, specific, generalized, and structural), and example formulations of survey questions are proposed. Following this approach, I then assess some survey questions organized under five themes commonly found in social surveys for measuring social capital: participation in organizations, social networks, trust, civic participation, and perceptions of local area. I conclude that most of these themes and questions only weakly or indirectly measure individual social capital; therefore, they should be strengthened with the conceptual framework proposed in this paper and complemented with the items used in independent surveys on social networks.

  16. Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function

    DEFF Research Database (Denmark)

    Liegl, Gregor; Gandek, Barbara; Fischer, H. Felix

    2017-01-01

    precision between the short forms using different item formats. Results: Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side...

  17. The adequacy of measures of gender roles attitudes: a review of current measures in omnibus surveys.

    Science.gov (United States)

    Walter, Jessica Gabriele

    2018-01-01

    The measures of attitudes toward gender roles included in many representative international and national omnibus surveys were developed mostly in the 1970s and 1980s with a focus on the male breadwinner model. This article deals with the issue of whether the measures provided in these omnibus surveys need to be adjusted to specific social changes. A review of these measures has found that adjustments have occurred in a limited way that focused on the role of women and disregarded the role of men. Furthermore, most of these measures only examined the traditional roles of men and women. More egalitarian role models have not been considered sufficiently. In addition, most items that have been measured are phrased in a general form and, for example, do not specify parents' employment or the ages of children. A specification of these aspects of measurement would help to clarify the conceptual meaning of the results and increase the possibility of more accurately analyzing gender role attitudes over time.

  18. Developing an item bank to measure the coping strategies of people with hereditary retinal diseases.

    Science.gov (United States)

    Prem Senthil, Mallika; Khadka, Jyoti; De Roach, John; Lamey, Tina; McLaren, Terri; Campbell, Isabella; Fenwick, Eva K; Lamoureux, Ecosse L; Pesudovs, Konrad

    2018-05-05

    Our understanding of the coping strategies used by people with visual impairment to manage stress related to visual loss is limited. This study aims to develop a sophisticated coping instrument in the form of an item bank implemented via computerised adaptive testing (CAT) for hereditary retinal diseases. Items on coping were extracted from qualitative interviews with patients which were supplemented by items from a literature review. A systematic multi-stage process of item refinement was carried out followed by expert panel discussion and cognitive interviews. The final coping item bank had 30 items. Rasch analysis was used to assess the psychometric properties. A CAT simulation was carried out to estimate an average number of items required to gain precise measurement of hereditary retinal disease-related coping. One hundred eighty-nine participants answered the coping item bank (median age = 58 years). The coping scale demonstrated good precision and targeting. The standardised residual loadings for items revealed six items grouped together. Removal of the six items reduced the precision of the main coping scale and worsened the variance explained by the measure. Therefore, the six items were retained within the main scale. Our CAT simulation indicated that, on average, fewer than 10 items are required to gain a precise measurement of coping. This is the first study to develop a psychometrically robust coping instrument for hereditary retinal diseases. CAT simulation indicated that, on average, only four and nine items were required to gain measurement at moderate and high precision, respectively.

  19. Development of six PROMIS pediatrics proxy-report item banks

    Directory of Open Access Journals (Sweden)

    Irwin Debra E

    2012-02-01

    Background: Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. Methods: The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Results: Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily

  20. A more general model for testing measurement invariance and differential item functioning.

    Science.gov (United States)

    Bauer, Daniel J

    2017-09-01

    The evaluation of measurement invariance is an important step in establishing the validity and comparability of measurements across individuals. Most commonly, measurement invariance has been examined using one of two primary latent variable modeling approaches: the multiple groups model or the multiple-indicator multiple-cause (MIMIC) model. Both approaches offer opportunities to detect differential item functioning within multi-item scales, and thereby to test measurement invariance, but both approaches also have significant limitations. The multiple groups model allows one to examine the invariance of all model parameters but only across levels of a single categorical individual difference variable (e.g., ethnicity). In contrast, the MIMIC model permits both categorical and continuous individual difference variables (e.g., sex and age) but permits only a subset of the model parameters to vary as a function of these characteristics. The current article argues that moderated nonlinear factor analysis (MNLFA) constitutes an alternative, more flexible model for evaluating measurement invariance and differential item functioning. We show that the MNLFA subsumes and combines the strengths of the multiple group and MIMIC models, allowing for a full and simultaneous assessment of measurement invariance and differential item functioning across multiple categorical and/or continuous individual difference variables. The relationships between the MNLFA model and the multiple groups and MIMIC models are shown mathematically and via an empirical demonstration. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  1. The failing measurement of attitudes: How semantic determinants of individual survey responses come to replace measures of attitude strength.

    Science.gov (United States)

    Arnulf, Jan Ketil; Larsen, Kai Rune; Martinsen, Øyvind Lund; Egeland, Thore

    2018-01-12

    The traditional understanding of data from Likert scales is that the quantifications involved result from measures of attitude strength. Applying a recently proposed semantic theory of survey response, we claim that survey responses tap two different sources: a mixture of attitudes plus the semantic structure of the survey. Exploring the degree to which individual responses are influenced by semantics, we hypothesized that in many cases, information about attitude strength is actually filtered out as noise in the commonly used correlation matrix. We developed a procedure to separate the semantic influence from attitude strength in individual response patterns, and compared these results to, respectively, the observed sample correlation matrices and the semantic similarity structures arising from text analysis algorithms. This was done with four datasets, comprising a total of 7,787 subjects and 27,461,502 observed item pair responses. As we argued, attitude strength seemed to account for much information about the individual respondents. However, this information did not seem to carry over into the observed sample correlation matrices, which instead converged around the semantic structures offered by the survey items. This is potentially disturbing for the traditional understanding of what survey data represent. We argue that this approach contributes to a better understanding of the cognitive processes involved in survey responses. In turn, this could help us make better use of the data that such methods provide.

  2. Measuring participation in patients with chronic back pain-the 5-Item Pain Disability Index.

    Science.gov (United States)

    McKillop, Ashley B; Carroll, Linda J; Dick, Bruce D; Battié, Michele C

    2018-02-01

    Of the three broad outcome domains of body functions and structures, activities, and participation (eg, engaging in valued social roles) outlined in the World Health Organization's (WHO) International Classification of Functioning, Disability and Health (ICF), it has been argued that participation is the most important to individuals, particularly those with chronic health problems. Yet, participation is not commonly measured in back pain research. The aim of this study was to investigate the construct validity of a modified 5-Item Pain Disability Index (PDI) score as a measure of participation in people with chronic back pain. A validation study was conducted using cross-sectional data. Participants with chronic back pain were recruited from a multidisciplinary pain center in Alberta, Canada. The outcome measure of interest is the 5-Item PDI. Each study participant was given a questionnaire package containing measures of participation, resilience, anxiety and depression, pain intensity, and pain-related disability, in addition to the PDI. The first five items of the PDI deal with social roles involving family responsibilities, recreation, social activities with friends, work, and sexual behavior, and comprised the 5-Item PDI seeking to measure participation. The last two items of the PDI deal with self-care and life support functions and were excluded. Construct validity of the 5-Item PDI as a measure of participation was examined using Pearson correlations or point-biserial correlations to test each hypothesized association. Participants were 70 people with chronic back pain and a mean age of 48.1 years. Forty-four (62.9%) were women. As hypothesized, the 5-Item PDI was associated with all measures of participation, including the Participation Assessment with Recombined Tools-Objective (r=-0.61), Late-Life Function and Disability Instrument: Disability Component (frequency: r=-0.66; limitation: r=-0.65), Work and Social Adjustment Scale (r=0.85), a global

  3. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey.

    Science.gov (United States)

    Chien, Tsair-Wei; Shao, Yang; Kuo, Shu-Chun

    2017-01-10

    Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses the probabilistic modeling of item response theory (IRT) to produce graphical presentations for interpreting CIR results. A computer module that is programmed to deal with CIRs is required. To present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data in order to show how the CIR can be applied to healthcare settings, with an example regarding a safety attitude survey. Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model's expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by professional Rasch Winsteps software. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and show the CIR advantage over classical test theory. Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study CIR module to deal with continuous variables to benefit comparisons of data with a logistic distribution and model fit statistics.
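
    For readers unfamiliar with the quantities named in this abstract, the sketch below computes a model-expected score and the OUTFIT mean square (MNSQ) for one person. It uses the standard dichotomous one-parameter Rasch model as a simplifying assumption; the study's module handles continuous responses and is written in Excel VBA, so this is not a reproduction of it. The function names and the example data are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Expected score on a dichotomous item under the one-parameter
    Rasch model, for person measure `theta` and item `difficulty`
    (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def outfit_mnsq(responses, theta, difficulties):
    """OUTFIT mean square for one person: the unweighted mean of
    squared standardized residuals across items."""
    squared_z = []
    for x, b in zip(responses, difficulties):
        p = rasch_probability(theta, b)   # model-expected score
        variance = p * (1.0 - p)          # model variance of a 0/1 response
        squared_z.append((x - p) ** 2 / variance)
    return sum(squared_z) / len(squared_z)


if __name__ == "__main__":
    # Hypothetical person (theta = 0.5 logits) answering four items.
    item_difficulties = [-1.0, 0.0, 0.5, 1.5]
    observed = [1, 1, 0, 0]
    print(outfit_mnsq(observed, 0.5, item_difficulties))
```

    Values near 1 indicate responses about as noisy as the model expects; values well above 1 flag unexpected responses, to which OUTFIT is especially sensitive on items far from the person's measure.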

  4. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

    Science.gov (United States)

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

    2017-11-01

    Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.

  5. The Servant Leadership Survey: Development and Validation of a Multidimensional Measure.

    Science.gov (United States)

    van Dierendonck, Dirk; Nuijten, Inge

    2011-09-01

    PURPOSE: The purpose of this paper is to describe the development and validation of a multi-dimensional instrument to measure servant leadership. DESIGN/METHODOLOGY/APPROACH: Based on an extensive literature review and expert judgment, 99 items were formulated. In three steps, using eight samples totaling 1571 persons from The Netherlands and the UK with a diverse occupational background, a combined exploratory and confirmatory factor analysis approach was used. This was followed by an analysis of the criterion-related validity. FINDINGS: The final result is an eight-dimensional measure of 30 items: the eight dimensions being: standing back, forgiveness, courage, empowerment, accountability, authenticity, humility, and stewardship. The internal consistency of the subscales is good. The results show that the Servant Leadership Survey (SLS) has convergent validity with other leadership measures, and also adds unique elements to the leadership field. Evidence for criterion-related validity came from studies relating the eight dimensions to well-being and performance. IMPLICATIONS: With this survey, a valid and reliable instrument to measure the essential elements of servant leadership has been introduced. ORIGINALITY/VALUE: The SLS is the first measure where the underlying factor structure was developed and confirmed across several field studies in two countries. It can be used in future studies to test the underlying premises of servant leadership theory. The SLS provides a clear picture of the key servant leadership qualities and shows where improvements can be made on the individual and organizational level; as such, it may also offer a valuable starting point for training and leadership development.

  6. Validity of Suicidality Items from the Youth Risk Behavior Survey in a High School Sample

    Science.gov (United States)

    May, Alexis; Klonsky, E. David

    2011-01-01

    The Youth Risk Behavior Survey (YRBS) is used by the United States Centers for Disease Control to estimate rates of suicidal thoughts and behaviors in adolescents. This study investigated the validity of the YRBS suicidality items by examining their relationship to criterion variables including loneliness, anxiety, depression, substance use, and…

  7. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  8. Measuring organizational effectiveness in information and communication technology companies using item response theory.

    Science.gov (United States)

    Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

    2012-01-01

    The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the manager's point of view, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to guide construction of the questionnaire items, which were submitted to specialists for evaluation. The approach proved viable for measuring the organizational effectiveness of ICT companies from a manager's point of view using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and to place items and respondents on a single scale, which is not possible when using other similar tools.
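
    For readers unfamiliar with the two-parameter logistic model mentioned above, a minimal sketch of its item response function follows. The parameter values are hypothetical, and the plain logistic form (without the 1.7 scaling constant some texts use) is an assumption.

```python
import math

def two_pl_probability(theta, a, b):
    """Probability of endorsing an item under the 2PL model.

    theta: respondent location; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: discrimination 1.2, difficulty 0.5.
p_low = two_pl_probability(theta=-1.0, a=1.2, b=0.5)
p_high = two_pl_probability(theta=2.0, a=1.2, b=0.5)
```

    Because theta and b enter the same expression, items and respondents share one scale, which is the property the abstract highlights.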

  9. Image-Based Collection and Measurements for Construction Pay Items

    Science.gov (United States)

    2017-07-01

    Prior to each payment to contractors and suppliers, measurements are made to document the actual amount of pay items placed at the site. This manual process has substantial risk for personnel, and could be made more efficient and less prone to human ...

  10. Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments

    International Nuclear Information System (INIS)

    Fisher, W P Jr; Elbaum, B; Coulter, A

    2010-01-01

    Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.

  11. A score for measuring health risk perception in environmental surveys.

    Science.gov (United States)

    Marcon, Alessandro; Nguyen, Giang; Rava, Marta; Braggion, Marco; Grassi, Mario; Zanolin, Maria Elisabetta

    2015-09-15

    In environmental surveys, risk perception may be a source of bias when information on health outcomes is reported using questionnaires. Using the data from a survey carried out in the largest chipboard industrial district in Italy (Viadana, Mantova), we devised a score of health risk perception and described its determinants in an adult population. In 2006, 3697 parents of children were administered a questionnaire that included ratings on 7 environmental issues. Items dimensionality was studied by factor analysis. After testing equidistance across response options by homogeneity analysis, a risk perception score was devised by summing up item ratings. Factor analysis identified one latent factor, which we interpreted as health risk perception, that explained 65.4% of the variance of five items retained after scaling. The scale (range 0-10, mean ± SD 9.3 ± 1.9) had a good internal consistency (Cronbach's alpha 0.87). Most subjects (80.6%) expressed maximum risk perception (score = 10). Italian mothers showed significantly higher risk perception than foreign fathers. Risk perception was higher for parents of young children, and for older parents with a higher education, than for their counterparts. Actual distance to major roads was not associated with the score, while self-reported intense traffic and frequent air refreshing at home predicted higher risk perception. When investigating health effects of environmental hazards using questionnaires, care should be taken to reduce the possibility of awareness bias at the stage of study planning and data analysis. Including appropriate items in study questionnaires can be useful to derive a measure of health risk perception, which can help to identify confounding of association estimates by risk perception. Copyright © 2015 Elsevier B.V. All rights reserved.
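
    The scoring approach described above (a summed score with internal consistency summarized by Cronbach's alpha) can be sketched as follows. The response matrix is invented for illustration and is not the Viadana data. The 0-2 rating per item is an assumption chosen so that five items sum to the 0-10 range reported.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item ratings."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 5 items scored 0-2, so the summed score ranges 0-10.
ratings = [
    [2, 2, 2, 2, 2],
    [2, 2, 1, 2, 2],
    [1, 1, 1, 0, 1],
    [2, 1, 2, 2, 2],
    [0, 1, 0, 0, 1],
]
scores = [sum(row) for row in ratings]
alpha = cronbach_alpha(ratings)
```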

  12. Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

    Science.gov (United States)

    Wan, Lei; Henly, George A.

    2012-01-01

    Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…

  13. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey

    Directory of Open Access Journals (Sweden)

    Tsair-Wei Chien

    2017-01-01

    Background: Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses item response theory's (IRT) probabilistic modeling to provide graphical presentations for interpreting CIR results. A computer module programmed to deal with CIRs is required. This study aimed to present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data to show how CIRs can be applied in healthcare settings, using a safety attitude survey as an example. Methods: Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model's expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. Results: The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by the professional Rasch software Winsteps. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and shows the CIR advantage over classical test theory. Conclusions: Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study's CIR module to continuous variables, benefiting comparisons of data with a logistic distribution and model fit statistics.

  14. The Long-Term Conditions Questionnaire: conceptual framework and item development.

    Science.gov (United States)

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A'Court, Christine; Fitzpatrick, Ray

    2016-01-01

    To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey.

  15. Measures of gender role attitudes under revision: The example of the German General Social Survey.

    Science.gov (United States)

    Walter, Jessica Gabriele

    2018-05-01

    Using the example of the German General Social Survey, this study describes how measures of gender role attitudes can be revised. To date measures have focused on the traditional male breadwinner model. However, social developments in female labor force participation, education, and family structure suggest that a revision and adjustment of existing measures are required. First, these measures need to be supplemented with items that represent more egalitarian models of division of labor and the role of the father in the family. Second, the phrasing of existing items needs to be revised. The results of this study indicate that especially regarding the amount of working hours and the age of children, a specification is needed. This study presents a revised measure, to facilitate analyses over time. This revised measure represents two factors: one referring to traditional and one to modern gender role attitudes. Copyright © 2018 Elsevier Inc. All rights reserved.

  16. The Effect of Answering in a Preferred Versus a Non-Preferred Survey Mode on Measurement

    Directory of Open Access Journals (Sweden)

    Jolene Smyth

    2014-12-01

    Previous research has shown that offering respondents their preferred mode can increase response rates, but the effect of doing so on how respondents process and answer survey questions (i.e., measurement) is unclear. In this paper, we evaluate whether changes in question format have different effects on data quality for those responding in their preferred mode than for those responding in a non-preferred mode for three question types (multiple answer, open-ended, and grid). Respondents were asked about their preferred mode in a 2008 survey and were recontacted in 2009. In the recontact survey, respondents were randomly assigned to one of two modes such that some responded in their preferred mode and others did not. They were also randomly assigned to one of two questionnaire forms in which the format of individual questions was varied. On the multiple answer and open-ended items, those who answered in a non-preferred mode seemed to take advantage of opportunities to satisfice when the question format allowed or encouraged it (e.g., selecting fewer items in the check-all than the forced-choice format and being more likely to skip the open-ended item when it had a larger answer box), while those who answered in a preferred mode did not. There was no difference on a grid-formatted item between those who did and did not respond in their preferred mode, but results indicate that a fully labeled grid reduced item missing rates vis-à-vis a grid with only column heading labels. Results provide insight into the effect of tailoring to mode preference on commonly used questionnaire design features.

  17. Evaluating measurement invariance across assessment modes of phone interview and computer self-administered survey for the PROMIS measures in a population-based cohort of localized prostate cancer survivors.

    Science.gov (United States)

    Wang, Mian; Chen, Ronald C; Usinger, Deborah S; Reeve, Bryce B

    2017-11-01

    To evaluate measurement invariance (phone interview vs computer self-administered survey) of 15 PROMIS measures completed by a population-based cohort of localized prostate cancer survivors. Participants were part of the North Carolina Prostate Cancer Comparative Effectiveness and Survivorship Study. Of the 952 men who took the phone interview at 24 months post-treatment, 401 also completed the same survey online using a home computer. Unidimensionality of the PROMIS measures was examined using single-factor confirmatory factor analysis (CFA) models. Measurement invariance testing was conducted using longitudinal CFA via a model comparison approach. For strongly or partially strongly invariant measures, changes in the latent factors and factor autocorrelations were also estimated and tested. Six measures (sleep disturbance, sleep-related impairment, diarrhea, illness impact-negative, illness impact-positive, and global satisfaction with sex life) had locally dependent items, and therefore model modifications had to be made on these domains prior to measurement invariance testing. Overall, seven measures achieved strong invariance (all items had equal loadings and thresholds), and four measures achieved partial strong invariance (each measure had one item with unequal loadings and thresholds). Three measures (pain interference, interest in sexual activity, and global satisfaction with sex life) failed to establish configural invariance due to between-mode differences in factor patterns. This study supports the use of phone-based live interviewers in lieu of PC-based assessment (when needed) for many of the PROMIS measures.

  18. A psychometric comparison of three scales and a single-item measure to assess sexual satisfaction.

    Science.gov (United States)

    Mark, Kristen P; Herbenick, Debby; Fortenberry, J Dennis; Sanders, Stephanie; Reece, Michael

    2014-01-01

    This study was designed to systematically compare and contrast the psychometric properties of three scales developed to measure sexual satisfaction and a single-item measure of sexual satisfaction. The Index of Sexual Satisfaction (ISS), Global Measure of Sexual Satisfaction (GMSEX), and the New Sexual Satisfaction Scale-Short (NSSS-S) were compared to one another and to a single-item measure of sexual satisfaction. Conceptualization of the constructs, distribution of scores, internal consistency, convergent validity, test-retest reliability, and factor structure were compared between the measures. A total of 211 men and 214 women completed the scales and a measure of relationship satisfaction, with 33% (n = 139) of the sample reassessed two months later. All scales demonstrated appropriate distribution of scores and adequate internal consistency. The GMSEX, NSSS-S, and the single-item measure demonstrated convergent validity. Test-retest reliability was demonstrated by the ISS, GMSEX, and NSSS-S, but not the single-item measure. Taken together, the GMSEX received the strongest psychometric support in this sample for a unidimensional measure of sexual satisfaction and the NSSS-S received the strongest psychometric support in this sample for a bidimensional measure of sexual satisfaction.

  19. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for a handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere, based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.

  20. Measuring children's self-reported sport participation, risk perception and injury history: development and validation of a survey instrument.

    Science.gov (United States)

    Siesmaa, Emma J; Blitvich, Jennifer D; White, Peta E; Finch, Caroline F

    2011-01-01

    Despite the health benefits associated with children's sport participation, the occurrence of injury in this context is common. The extent to which sport injuries impact children's ongoing involvement in sport is largely unknown. Surveys have been shown to be useful for collecting children's injury and sport participation data; however, there are currently no published instruments which investigate the impact of injury on children's sport participation. This study describes the processes undertaken to assess the validity of two survey instruments for collecting self-reported information about child cricket and netball related participation, injury history and injury risk perceptions, as well as the reliability of the cricket-specific version. Face and content validity were assessed through expert feedback from primary and secondary level teachers and from representatives of peak sporting bodies for cricket and netball. Test-retest reliability was measured using a sample of 59 child cricketers who completed the survey on two occasions, 3-4 weeks apart. Based on expert feedback relating to face and content validity, modification and/or deletion of some survey items was undertaken. Survey items with low test-retest reliability (κ≤0.40) were modified or deleted, items with moderate reliability (κ=0.41-0.60) were modified slightly and items with higher reliability (κ≥0.61) were retained, with some undergoing minor modifications. This is the first survey of its kind which has been successfully administered to cricketers aged 10-16 years to collect information about injury risk perceptions and intentions for continued sport participation. Implications for its generalisation to other child sport participants are discussed. Copyright © 2010 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
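
    The test-retest rule described above can be sketched in a few lines: compute Cohen's kappa for an item answered on two occasions, then apply the abstract's cut-offs (κ≤0.40, 0.41-0.60, ≥0.61). The function names and example answers are hypothetical. Rounding kappa to two decimals before classification is an assumption made so the published bands have no gaps.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa for paired categorical answers from two occasions.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_1, counts_2 = Counter(first), Counter(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    return (observed - expected) / (1 - expected)

def item_decision(kappa):
    """Map a kappa value to the handling rule reported in the abstract."""
    k = round(kappa, 2)
    if k <= 0.40:
        return "modify or delete"
    if k <= 0.60:
        return "modify slightly"
    return "retain"

# Hypothetical answers to one yes/no item at test and retest.
test_answers = ["y", "y", "n", "n", "y", "n", "y", "n"]
retest_answers = ["y", "y", "n", "y", "y", "n", "n", "n"]
decision = item_decision(cohen_kappa(test_answers, retest_answers))
```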

  1. Calibration of context-specific survey items to assess youth physical activity behaviour.

    Science.gov (United States)

    Saint-Maurice, Pedro F; Welk, Gregory J; Bartee, R Todd; Heelan, Kate

    2017-05-01

    This study tests calibration models to re-scale context-specific physical activity (PA) items to accelerometer-derived PA. A total of 195 children in grades 4-12 wore an Actigraph monitor and completed the Physical Activity Questionnaire (PAQ) one week later. The relative time spent in moderate-to-vigorous PA (MVPA%) obtained from the Actigraph at recess, PE, lunch, after-school, evening and weekend was matched with a respective item score obtained from the PAQ. Item scores from 145 participants were calibrated against objective MVPA% using multiple linear regression with age and sex as additional predictors. Predicted minutes of MVPA for school, out-of-school and total week were tested in the remaining sample (n = 50) using equivalence testing. The results showed that PAQ β-weights ranged from 0.06 (lunch) to 4.94 (PE) MVPA% (P PAQ and accelerometer MVPA at school and out-of-school ranged from -15.6 to +3.8 min and the PAQ was within 10-15% of accelerometer measured activity. This study demonstrated that context-specific items can be calibrated to predict minutes of MVPA in groups of youth during in- and out-of-school periods.

  2. A randomised trial and economic evaluation of the effect of response mode on response rate, response bias, and item non-response in a survey of doctors

    Directory of Open Access Journals (Sweden)

    Witt Julia

    2011-09-01

    Background: Surveys of doctors are an important data collection method in health services research. Ways to improve response rates, minimise survey response bias and item non-response, within a given budget, have not previously been addressed in the same study. The aim of this paper is to compare the effects and costs of three different modes of survey administration in a national survey of doctors. Methods: A stratified random sample of 4.9% (2,702/54,160) of doctors undertaking clinical practice was drawn from a national directory of all doctors in Australia. Stratification was by four doctor types: general practitioners, specialists, specialists-in-training, and hospital non-specialists, and by six rural/remote categories. A three-arm parallel trial design with equal randomisation across arms was used. Doctors were randomly allocated to: online questionnaire (902); simultaneous mixed mode (a paper questionnaire and login details sent together) (900); or sequential mixed mode (online followed by a paper questionnaire with the reminder) (900). Analysis was by intention to treat, as within each primary mode, doctors could choose either paper or online. Primary outcome measures were response rate, survey response bias, item non-response, and cost. Results: The online mode had a response rate of 12.95%, followed by the simultaneous mixed mode with 19.7%, and the sequential mixed mode with 20.7%. After adjusting for observed differences between the groups, the online mode had a 7 percentage point lower response rate compared to the simultaneous mixed mode, and a 7.7 percentage point lower response rate compared to the sequential mixed mode. The difference in response rate between the sequential and simultaneous modes was not statistically significant. Both mixed modes showed evidence of response bias, whilst the characteristics of online respondents were similar to the population. However, the online mode had a higher rate of item non-response compared

  3. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

    Science.gov (United States)

    Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

    2014-05-01

    To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.
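
    The standardized metric described above (mean 50, SD 10 in a reference population) corresponds to the usual T-score convention. A minimal sketch follows, assuming a simple linear rescaling. The function name and the reference mean and SD parameters are illustrative and are not PROMIS calibration values.

```python
def to_t_score(value, reference_mean=0.0, reference_sd=1.0):
    """Rescale a score so the reference population has mean 50 and SD 10."""
    z = (value - reference_mean) / reference_sd
    return 50.0 + 10.0 * z

# A respondent one reference SD above the reference mean.
t_above = to_t_score(1.0)
# A respondent half a reference SD below the reference mean.
t_below = to_t_score(-0.5)
```

    On this metric a score of 60 sits one reference-population SD above the mean, which is what makes results from different instruments or CAT lengths directly comparable.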

  4. Robustness of two single-item self-esteem measures: cross-validation with a measure of stigma in a sample of psychiatric patients.

    Science.gov (United States)

    Bagley, Christopher

    2005-08-01

    Robins' Single-item Self-esteem Inventory was compared with a single item from the Coopersmith Self-esteem. Although a new scoring format was used, there was good evidence of cross-validation in 83 current and former psychiatric patients who completed Harvey's adapted measure of stigma felt and experienced by users of mental health services. Scores on the two single-item self-esteem measures correlated .76 (p self-esteem in users of mental health services.

  5. Validation of the Child HCAHPS survey to measure pediatric inpatient experience of care in Flanders.

    Science.gov (United States)

    Bruyneel, Luk; Coeckelberghs, Ellen; Buyse, Gunnar; Casteels, Kristina; Lommers, Barbara; Vandersmissen, Jo; Van Eldere, Johan; Van Geet, Chris; Vanhaecht, Kris

    2017-07-01

    The recently developed Child HCAHPS provides a standard to measure US hospitals' performance on pediatric inpatient experiences of care. We field-tested Child HCAHPS in Belgium to instigate international comparison. In the development stage, forward/backward translation was conducted and patients assessed content validity index as excellent. The draft Flemish Child HCAHPS included 63 items: 38 items for five topics hypothesized to be similar to those proposed in the US (communication with parent, communication with child, attention to safety and comfort, hospital environment, and global rating), 10 screeners, a 14-item demographic and descriptive section, and one open-ended item. A 6-week pilot test was subsequently performed in three pediatric wards (general ward, hematology and oncology ward, infant and toddler ward) at a JCI-accredited university hospital. An overall response rate of 90.99% (303/333) was achieved and was consistent across wards. Confirmatory factor analysis largely confirmed the configuration of the proposed composites. Composite and single-item measures related well to patients' global rating of the hospital. Interpretation of different patient experiences across types of wards merits further investigation. Child HCAHPS provides an opportunity for systematic and cross-national assessment of pediatric inpatient experiences. Sharing and implementing international best practices are the next logical step. What is Known: • Patient experience surveys are increasingly used to reflect on the quality, safety, and centeredness of patient care. • While adult inpatient experience surveys are routinely used across countries around the world, the measurement of pediatric inpatient experiences is a young field of research that is essential to reflect on family-centered care. What is New: • We demonstrate that the US-developed Child HCAHPS provides an opportunity for international benchmarking of pediatric inpatient experiences with care through parents

  6. Using Linear Equating to Map PROMIS(®) Global Health Items and the PROMIS-29 V2.0 Profile Measure to the Health Utilities Index Mark 3.

    Science.gov (United States)

    Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David

    2016-10-01

    Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS(®)) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS(®) global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48 % (global health items), 61 % (PROMIS-29 V2.0 scales), and 64 % (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS(®) global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS(®) health measures can be used for economic applications and as a measure of overall HR-QOL in research.
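
    The linear equating step mentioned above can be sketched as matching the mean and standard deviation of one set of scores to another. This assumes the standard mean-sigma form of linear equating; the abstract does not spell out the exact procedure, and the scores below are invented for illustration.

```python
from statistics import mean, pstdev

def linear_equate(source_scores, target_scores):
    """Return a function mapping source-scale scores onto the target scale.

    The mapping matches the mean and standard deviation of the target scores.
    """
    source_mean, source_sd = mean(source_scores), pstdev(source_scores)
    target_mean, target_sd = mean(target_scores), pstdev(target_scores)
    slope = target_sd / source_sd
    return lambda x: target_mean + slope * (x - source_mean)

# Hypothetical regression-predicted scores (compressed toward the mean)
# and hypothetical observed preference scores.
predicted = [0.55, 0.60, 0.70, 0.75, 0.80]
observed = [0.30, 0.55, 0.70, 0.90, 0.95]
equate = linear_equate(predicted, observed)
equated = [equate(x) for x in predicted]
```

    Regression predictions have less spread than the observed scores, so the rescaling restores the observed variability, which is the sense in which equating counters regression to the mean.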

  7. Phase I Marine and Terrestrial Cultural Resources Survey of 13 Project Items Located on Marsh Island, Iberia Parish, Louisiana

    National Research Council Canada - National Science Library

    Barr, William

    1999-01-01

    This report presents the results of Phase I cultural resources survey and archeological inventory of two marine and 11 terrestrial project items on and near Marsh Island in Iberia Parish, Louisiana...

  8. The Iranian version of 12-item Short Form Health Survey (SF-12): factor structure, internal consistency and construct validity.

    Science.gov (United States)

    Montazeri, Ali; Vahdaninia, Mariam; Mousavi, Sayed Javad; Omidvari, Speideh

    2009-09-16

    The 12-item Short Form Health Survey (SF-12), a shorter alternative to the SF-36, is widely used in health outcomes surveys. The aim of this study was to validate the SF-12 in Iran. A random sample of the general population aged 15 years and over living in Tehran, Iran completed the SF-12. Reliability was estimated using internal consistency and validity was assessed using known groups comparison and convergent validity. In addition, the factor structure of the questionnaire was extracted by performing both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). In all, 5587 individuals were studied (2721 male and 2866 female). The mean age and formal education of the respondents were 35.1 (SD = 15.4) and 10.2 (SD = 4.4) years respectively. The results showed satisfactory internal consistency for both summary measures, namely the Physical Component Summary (PCS) and the Mental Component Summary (MCS); Cronbach's alpha for PCS-12 and MCS-12 was 0.73 and 0.72, respectively. Known-groups comparison showed that the SF-12 discriminated well between men and women and those who differed in age and educational status (P < 0.001). In addition, correlations between the SF-12 scales and single items showed that the physical functioning, role physical, bodily pain and general health subscales correlated more strongly with the PCS-12 score, while the vitality, social functioning, role emotional and mental health subscales correlated more strongly with the MCS-12 score, lending support to its good convergent validity. Finally, the principal component analysis indicated a two-factor structure (physical and mental health) that jointly accounted for 57.8% of the variance. The confirmatory factor analysis also indicated a good fit to the data for the two-latent structure (physical and mental health). In general the findings suggest that the SF-12 is a reliable and valid measure of health related quality of life among the Iranian population. However, further studies are needed to

  9. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Gerardus J.A.; Glas, Cornelis A.W.

    2000-01-01

    This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved

  10. Analyzing force concept inventory with item response theory

    Science.gov (United States)

    Wang, Jing; Bao, Lei

    2010-10-01

    Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.
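
    The three item parameters named in the abstract (difficulty, discrimination, and probability of a correct guess) match the standard three-parameter logistic (3PL) model. The sketch below assumes that model; the abstract does not state the exact parameterization used, and the parameter values are hypothetical.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct answer under the 3PL model.

    theta: student proficiency
    a: item discrimination
    b: item difficulty
    c: probability of a correct guess (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical five-option multiple-choice item with a guessing floor of 0.2.
p_at_difficulty = p_correct_3pl(theta=0.5, a=1.2, b=0.5, c=0.2)
p_very_low = p_correct_3pl(theta=-50.0, a=1.2, b=0.5, c=0.2)
```

    When proficiency equals the item difficulty, the probability sits halfway between the guessing floor and 1; for very low proficiency it approaches the guessing floor rather than zero.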

  11. Comparison of the Availability and Cost of Foods Compatible With a Renal Diet Versus an Unrestricted Diet Using the Nutrition Environment Measures Survey.

    Science.gov (United States)

    Sullivan, Catherine M; Pencak, Julie A; Freedman, Darcy A; Huml, Anne M; León, Janeen B; Nemcek, Jeanette; Theurer, Jacqueline; Sehgal, Ashwini R

    2017-05-01

    Hemodialysis patients' ability to access food that is both compatible with a renal diet and affordable is affected by the local food environment. Comparisons of the availability and cost of food items suitable for the renal diet versus a typical unrestricted diet were completed using the standard Nutrition Environment Measures Survey and a renal diet-modified Nutrition Environment Measures Survey. Cross-sectional study. Twelve grocery stores in Northeast Ohio. Availability and cost of food items in 12 categories. The mean total number of food items available differed significantly (P ≤ .001) between the unrestricted diet (38.9 ± 4.5) and renal diet (32.2 ± 4.7). The mean total cost per serving did not differ significantly (P = 0.48) between the unrestricted diet ($5.67 ± 2.50) and renal diet ($5.76 ± 2.74). The availability of renal diet food items is significantly less than that of unrestricted diet food items, but there is no difference in the cost of items that are available in grocery stores. Further work is needed to determine how to improve the food environment for patients with chronic diseases. Copyright © 2017 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.

  12. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), with a different formula for each. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between the testee's score on a particular item and the testee's score on the overall test, which is actually the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and item discrimination index in the instrument analysis. These concepts might overlap, since both reflect the quality of the test in measuring the examinees' ability. In this paper, examples of data-processing results for item validity and item discrimination index are compared. The paper discusses whether item validity and item discrimination index can be represented by only one of them or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses where test analyses are included.
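
    To make the two quantities concrete, here is a small sketch that computes both for one dichotomous item. It assumes the common upper-lower group definition of D (proportion correct in the top-scoring group minus proportion correct in the bottom-scoring group, with 27% groups) and the item-total Pearson correlation as the item validity; the abstract does not give its formulas, and the data are hypothetical.

```python
from statistics import mean, pstdev

def discrimination_index(item, totals, fraction=0.27):
    """D = proportion correct in the upper group minus the lower group."""
    n = len(item)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: totals[i])  # ascending by total score
    lower, upper = order[:k], order[-k:]
    p_upper = mean([item[i] for i in upper])
    p_lower = mean([item[i] for i in lower])
    return p_upper - p_lower

def item_total_correlation(item, totals):
    """Pearson correlation between item scores (0/1) and total test scores."""
    mi, mt = mean(item), mean(totals)
    cov = mean([(x - mi) * (t - mt) for x, t in zip(item, totals)])
    return cov / (pstdev(item) * pstdev(totals))

# Hypothetical responses of ten examinees to one item, with their total scores.
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
totals = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

d_index = discrimination_index(item, totals)
r_item_total = item_total_correlation(item, totals)
```

    Both values are positive when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract points to.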

  13. Guideline appraisal with AGREE II: online survey of the potential influence of AGREE II items on overall assessment of guideline quality and recommendation for use.

    Science.gov (United States)

    Hoffmann-Eßer, Wiebke; Siering, Ulrich; Neugebauer, Edmund A M; Brockhaus, Anne Catharina; McGauran, Natalie; Eikermann, Michaela

    2018-02-27

    The AGREE II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within six domains. AGREE II also includes two overall assessments (overall guideline quality, recommendation for use). Our aim was to investigate how strongly the 23 AGREE II items influence the two overall assessments. An online survey of authors of publications on guideline appraisals with AGREE II and guideline users from a German scientific network was conducted between 10th February 2015 and 30th March 2015. Participants were asked to rate the influence of the AGREE II items on a Likert scale (0 = no influence to 5 = very strong influence). The frequencies of responses and their dispersion were presented descriptively. Fifty-eight of the 376 persons contacted (15.4%) participated in the survey and the data of the 51 respondents with prior knowledge of AGREE II were analysed. Items 7-12 of Domain 3 (rigour of development) and both items of Domain 6 (editorial independence) had the strongest influence on the two overall assessments. In addition, Items 15-17 (clarity of presentation) had a strong influence on the recommendation for use. Great variations were shown for the other items. The main limitation of the survey is the low response rate. In guideline appraisals using AGREE II, items representing rigour of guideline development and editorial independence seem to have the strongest influence on the two overall assessments. In order to ensure a transparent approach to reaching the overall assessments, we suggest the inclusion of a recommendation in the AGREE II user manual on how to consider item and domain scores. For instance, the manual could include an a-priori weighting of those items and domains that should have the strongest influence on the two overall assessments. The relevance of these assessments within AGREE II could thereby be further specified.

  14. Testing measurement invariance in the International Social Survey Program Health 2011 – the mental well-being scale

    NARCIS (Netherlands)

    van Deurzen, I.A.; Roosma, F.

    2014-01-01

    Purpose In the present contribution we address the measurement invariance of a new mental well-being scale of three items that was applied in the International Social Survey Program (ISSP) Health 2011 module. Our aim is to establish if and for how many countries (partial) scalar invariance is

  15. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

    Science.gov (United States)

    Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

    2014-01-01

    Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
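
    The item-selection idea behind CAT can be sketched as choosing the unadministered item with the greatest Fisher information at the current ability estimate. The example assumes a dichotomous two-parameter logistic (2PL) model as a simplification; PROMIS items have several response levels and the abstract does not describe the selection rule in this detail. The item bank and its parameters are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, bank, administered):
    """Pick the not-yet-administered item that is most informative at theta."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))

# Hypothetical bank: item name -> (discrimination a, difficulty b).
bank = {"easy": (1.0, -2.0), "medium": (1.0, 0.0), "hard": (1.0, 2.0)}

first = next_item(0.0, bank, administered=set())
second = next_item(1.5, bank, administered={"medium"})
```

    With equal discriminations, the most informative item is the one whose difficulty lies closest to the current estimate, so the selection moves toward harder items as the estimate rises.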

  16. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  17. A Study of General Education Astronomy Students' Understandings of Cosmology. Part III. Evaluating Four Conceptual Cosmology Surveys: An Item Response Theory Approach

    Science.gov (United States)

    Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K.

    2012-01-01

    This is the third of five papers detailing our national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. In this paper, we use item response theory to analyze students' responses to three out of the four conceptual cosmology surveys we developed. The specific item response theory model we use is…

  18. Maslach Burnout Inventory and a Self-Defined, Single-Item Burnout Measure Produce Different Clinician and Staff Burnout Estimates.

    Science.gov (United States)

    Knox, Margae; Willard-Grace, Rachel; Huang, Beatrice; Grumbach, Kevin

    2018-06-04

    Clinicians and healthcare staff report high levels of burnout. Two common burnout assessments are the Maslach Burnout Inventory (MBI) and a single-item, self-defined burnout measure. Relatively little is known about how the measures compare. To identify the sensitivity, specificity, and concurrent validity of the self-defined burnout measure compared to the more established MBI measure. Cross-sectional survey (November 2016-January 2017). Four hundred forty-four primary care clinicians and 606 staff from three San Francisco Area healthcare systems. The MBI measure, calculated from a high score on either the emotional exhaustion or cynicism subscale, and a single-item measure of self-defined burnout. Concurrent validity was assessed using a validated, 7-item team culture scale as reported by Willard-Grace et al. (J Am Board Fam Med 27(2):229-38, 2014) and a standard question about workplace atmosphere as reported by Rassolian et al. (JAMA Intern Med 177(7):1036-8, 2017) and Linzer et al. (Ann Intern Med 151(1):28-36, 2009). Similar to other nationally representative burnout estimates, 52% of clinicians (95% CI: 47-57%) and 46% of staff (95% CI: 42-50%) reported high MBI emotional exhaustion or high MBI cynicism. In contrast, 29% of clinicians (95% CI: 25-33%) and 31% of staff (95% CI: 28-35%) reported "definitely burning out" or more severe symptoms on the self-defined burnout measure. The self-defined measure's sensitivity to correctly identify MBI-assessed burnout was 50.4% for clinicians and 58.6% for staff; specificity was 94.7% for clinicians and 92.3% for staff. Area under the receiver operator curve was 0.82 for clinicians and 0.81 for staff. Team culture and atmosphere were significantly associated with both self-defined burnout and the MBI, confirming concurrent validity. Point estimates of burnout notably differ between the self-defined and MBI measures. Compared to the MBI, the self-defined burnout measure misses half of high-burnout clinicians and more
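
    For readers less familiar with the two accuracy measures reported above, the sketch below shows how sensitivity and specificity are computed when one measure is treated as the reference and the other as the screen. The responses are hypothetical and do not reproduce the study data.

```python
def sensitivity_specificity(reference, screen):
    """Return (sensitivity, specificity) of `screen` judged against `reference`.

    Both arguments are equal-length sequences of truthy/falsy values,
    where a truthy value means "classified as burned out".
    """
    pairs = list(zip(reference, screen))
    tp = sum(1 for r, s in pairs if r and s)          # both measures positive
    fn = sum(1 for r, s in pairs if r and not s)      # reference positive, screen missed
    tn = sum(1 for r, s in pairs if not r and not s)  # both measures negative
    fp = sum(1 for r, s in pairs if not r and s)      # screen positive only
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical classifications for ten respondents.
reference_positive = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # e.g. the longer inventory
screen_positive = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]     # e.g. the single-item measure

sens, spec = sensitivity_specificity(reference_positive, screen_positive)
```

    A sensitivity of one half combined with a high specificity is the pattern the abstract describes: the brief measure rarely flags someone the reference measure does not, but it misses a large share of those the reference measure flags.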

  19. A Comparison of Survey Measures and Biomarkers of Secondhand Tobacco Smoke Exposure among Nonsmokers.

    Science.gov (United States)

    Okoli, Chizimuzo

    2016-01-01

    Secondhand tobacco smoke (SHS) exposure causes several adverse physical health outcomes. Conceptual differences in survey measures of 'psychosocial' (SHS exposure from smokers in an individual's life) and 'physical' (environments where an individual is exposed to SHS) SHS exposure exist. Few studies have examined the association between psychosocial and physical SHS exposures measures in comparison to biomarkers of SHS exposure. A secondary analysis of cross-sectional data was examined among a convenience sample of 20 adults. Data included survey items on SHS exposure and hair nicotine and saliva cotinine levels. Spearman analysis was used to assess correlations among variables. Medium and strong correlations were found among SHS exposure measures with the exception of saliva cotinine levels. Strong correlations were found among and between psychosocial and physical SHS exposure measures. Hair nicotine levels had medium strength associations with only perceived frequency of SHS exposure. As psychosocial measures of exposure were associated with biomarkers, such measures (particularly perceived frequency of SHS exposure) should be added to surveys in addition to physical SHS exposure measures to enhance accuracy of SHS measurement. Future explorations with robust sample sizes should further examine the strength of relationship between psychosocial and physical SHS exposure measures. © 2015 Wiley Periodicals, Inc.

  20. Relationship between handling heavy items during pregnancy and spontaneous abortion: a cross-sectional survey of working women in South Korea.

    Science.gov (United States)

    Lee, Bokim; Jung, Hye-Sun

    2012-01-01

    The researchers conducted a cross-sectional survey to determine the relationship between handling heavy items during pregnancy and spontaneous abortion among working women in South Korea. One thousand working women were selected from a database of those eligible for maternity benefits under the National Employment Insurance Plan. Study results showed that handling heavy items during pregnancy was associated with an increased risk of spontaneous abortion after adjusting for general characteristics of the participants and their work environment. A collective effort is needed on the parts of employers, employees, occupational health nurses, and the government to protect working women from lifting heavy items while pregnant. Copyright 2012, SLACK Incorporated.

  1. Dutch-Flemish translation of nine pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS)®.

    Science.gov (United States)

    Haverman, Lotte; Grootenhuis, Martha A; Raat, Hein; van Rossum, Marion A J; van Dulmen-den Broeder, Eline; Hoppenbrouwers, Karel; Correia, Helena; Cella, David; Roorda, Leo D; Terwee, Caroline B

    2016-03-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS(®)) is a new, state-of-the-art assessment system for measuring patient-reported health and well-being of adults and children. It has the potential to be more valid, reliable, and responsive than existing PROMs. The item banks are designed to be self-reported and completed by children aged 8-18 years. The PROMIS items can be administered in short forms or through computerized adaptive testing. This paper describes the translation and cultural adaptation of nine PROMIS item banks (151 items) for children in Dutch-Flemish. The translation was performed by FACITtrans using standardized PROMIS methodology and approved by the PROMIS Statistical Center. The translation included four forward translations, two back-translations, three independent reviews (at least two Dutch, one Flemish), and pretesting in 24 children from the Netherlands and Flanders. For some items, it was necessary to have separate translations for Dutch and Flemish: physical function-mobility (three items), anger (one item), pain interference (two items), and asthma impact (one item). Challenges faced in the translation process included scarcity or overabundance of possible translations, unclear item descriptions, constructs broader/smaller in the target language, difficulties in rank ordering items, differences in unit of measurement, irrelevant items, or differences in performance of activities. By addressing these challenges, acceptable translations were obtained for all items. The Dutch-Flemish PROMIS items are linguistically equivalent to the original USA version. Short forms are now available for use, and entire item banks are ready for cross-cultural validation in the Netherlands and Flanders.

  2. U.S. Naval Unit Behavioral Health Needs Assessment Survey, Overview of Survey Items and Measures

    Science.gov (United States)

    2014-05-20

    all Soldiers. The BHNAS and MHAT surveys have yielded valuable information regarding the effects of combat and deployment on service members...and Barriers to Care • Amount of Sleep and Sleep Deficit • Sleep Difficulties • Military Specialty • Positive Effects of Assignment • Contribution...nonopioid prescription painkillers was added; (3) the definition of “constantly and frequent” was omitted in the question; and (4) the NUBHNAS

  3. An Item Bank to Measure Systems, Services, and Policies: Environmental Factors Affecting People With Disabilities.

    Science.gov (United States)

    Lai, Jin-Shei; Hammel, Joy; Jerousek, Sara; Goldsmith, Arielle; Miskovic, Ana; Baum, Carolyn; Wong, Alex W; Dashner, Jessica; Heinemann, Allen W

    2016-12-01

    To develop a measure of perceived systems, services, and policies facilitators (see Chapter 5 of the International Classification of Functioning, Disability and Health) for people with neurologic disabilities and to evaluate the effect of perceived systems, services, and policies facilitators on health-related quality of life. Qualitative approaches to develop and refine items. Confirmatory factor analysis including 1-factor confirmatory factor analysis and bifactor analysis to evaluate unidimensionality of items. Rasch analysis to identify misfitting items. Correlational and analysis of variance methods to evaluate construct validity. Community-dwelling individuals participated in telephone interviews or traveled to the academic medical centers where this research took place. Participants (N=571) had a diagnosis of spinal cord injury, stroke, or traumatic brain injury. They were 18 years or older and English speaking. Not applicable. An item bank to evaluate environmental access and support levels of services, systems, and policies for people with disabilities. We identified a general factor defined as "access and support levels of the services, systems, and policies at the level of community living" and 3 local factors defined as "health services," "community living," and "community resources." The systems, services, and policies measure correlated moderately with participation measures: Community Participation Indicators (CPI) - Involvement, CPI - Control over Participation, Quality of Life in Neurological Disorders - Ability to Participate, Quality of Life in Neurological Disorders - Satisfaction with Role Participation, Patient-Reported Outcomes Measurement Information System (PROMIS) Ability to Participate, PROMIS Satisfaction with Role Participation, and PROMIS Isolation. The measure of systems, services, and policies facilitators contains items pertaining to health services, community living, and community resources. Investigators and clinicians can measure

  4. Item wording and internal consistency of a measure of cohesion: the group environment questionnaire.

    Science.gov (United States)

    Eys, Mark A; Carron, Albert V; Bray, Steven R; Brawley, Lawrence R

    2007-06-01

    A common practice for counteracting response acquiescence in psychological measures has been to employ both negatively and positively worded items. However, previous research has highlighted that the reliability of measures can be affected by this practice (Spector, 1992). The purpose of the present study was to examine the effect that the presence of negatively worded items has on the internal reliability of the Group Environment Questionnaire (GEQ). Two samples (N = 276) were utilized, and participants were asked to complete the GEQ (original and revised) on separate occasions. Results demonstrated that the revised questionnaire (containing all positively worded items) had significantly higher Cronbach alpha values for three of the four dimensions of the GEQ. Implications, alternatives, and future directions are discussed.
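
    Cronbach's alpha, the internal-consistency statistic compared in this study, can be computed from item variances and the variance of the summed score. The sketch below uses the standard formula with sample variances; the response matrix is hypothetical and is not GEQ data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical 9-point ratings from five respondents on four items.
responses = [
    [7, 8, 7, 6],
    [5, 5, 6, 5],
    [9, 8, 9, 9],
    [3, 4, 3, 2],
    [6, 6, 7, 6],
]
alpha = cronbach_alpha(responses)

# Limiting case: items that are exact copies of each other give alpha = 1.
alpha_identical = cronbach_alpha([[1, 1, 1], [2, 2, 2], [4, 4, 4], [5, 5, 5]])
```

    Negatively worded items are normally reverse-scored before this calculation; if they still correlate weakly with the other items, the summed item variances grow relative to the total variance and alpha falls, which is the effect the study examines.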

  5. Evaluation of a survey tool to measure safety climate in Australian hospital pharmacy staff.

    Science.gov (United States)

    Walpola, Ramesh L; Chen, Timothy F; Fois, Romano A; Ashcroft, Darren M; Lalor, Daniel J

    Safety climate evaluation is increasingly used by hospitals as part of quality improvement initiatives. Consequently, it is necessary to have validated tools to measure changes. To evaluate the construct validity and internal consistency of a survey tool to measure Australian hospital pharmacy patient safety climate. A 42-item cross-sectional survey was used to evaluate the patient safety climate of 607 Australian hospital pharmacy staff. Survey responses were initially mapped to the factor structure previously identified in European community pharmacy. However, as the data did not adequately fit the community pharmacy model, participants were randomly split into two groups with exploratory factor analysis performed on the first group (n = 302) and confirmatory factor analyses performed on the second group (n = 305). Following exploratory factor analysis (59.3% variance explained) and confirmatory factor analysis, a 6-factor model containing 28 items was obtained with satisfactory model fit (χ²(335) = 664.61 p  0.643) and model nesting between the groups (Δχ²(22) = 30.87, p = 0.10). Three factors (blame culture, organisational learning and working conditions) were similar to those identified in European community pharmacy and labelled identically. Three additional factors (preoccupation with improvement; comfort to question authority; and safety issues being swept under the carpet) highlight hierarchical issues present in hospital settings. This study has demonstrated the validity of a survey to evaluate patient safety climate of Australian hospital pharmacy staff. Importantly, this validated factor structure may be used to evaluate changes in safety climate over time. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. An evaluation of computerized adaptive testing for general psychological distress: combining GHQ-12 and Affectometer-2 in an item bank for public mental health research.

    Science.gov (United States)

    Stochl, Jan; Böhnke, Jan R; Pickett, Kate E; Croudace, Tim J

    2016-05-20

    Recent developments in psychometric modeling and technology allow pooling well-validated items from existing instruments into larger item banks and their deployment through methods of computerized adaptive testing (CAT). Use of item response theory-based bifactor methods and integrative data analysis overcomes barriers in cross-instrument comparison. This paper presents the joint calibration of an item bank for researchers keen to investigate population variations in general psychological distress (GPD). Multidimensional item response theory was used on existing health survey data from the Scottish Health Education Population Survey (n = 766) to calibrate an item bank consisting of pooled items from the short common mental disorder screen (GHQ-12) and the Affectometer-2 (a measure of "general happiness"). Computer simulation was used to evaluate usefulness and efficacy of its adaptive administration. A bifactor model capturing variation across a continuum of population distress (while controlling for artefacts due to item wording) was supported. The numbers of items for different required reliabilities in adaptive administration demonstrated promising efficacy of the proposed item bank. Psychometric modeling of the common dimension captured by more than one instrument offers the potential of adaptive testing for GPD using individually sequenced combinations of existing survey items. The potential for linking other item sets with alternative candidate measures of positive mental health is discussed since an optimal item bank may require even more items than these.

  7. Assessing the internal validity of a household survey-based food security measure adapted for use in Iran

    Directory of Open Access Journals (Sweden)

    Sadeghizadeh Atefeh

    2009-06-01

    Background: The prevalence of food insecurity is an indicator of material well-being in an area of basic need. The U.S. Food Security Module has been adapted for use in a wide variety of cultural and linguistic settings around the world. We assessed the internal validity of the adapted U.S. Household Food Security Survey Module to measure adult and child food insecurity in Isfahan, Iran, using statistical methods based on the Rasch measurement model. Methods: The U.S. Household Food Security Survey Module was translated into Farsi and, after adaptation, administered to a representative sample. Data were provided by 2,004 randomly selected households from all sectors of the population of Isfahan, Iran, during 2005. Results: 53.1 percent reported that their food had run out at some time during the previous 12 months and they did not have money to buy more, while 26.7 percent reported that an adult had cut the size of a meal or skipped a meal because there was not enough money for food, and 7.2 percent reported that an adult did not eat for a whole day because there was not enough money for food. The severity of the items in the adult scale, estimated under Rasch-model assumptions, covered a range of 6.65 logistic units, and those in the child scale 11.68 logistic units. Most item-infit statistics were near unity, and none exceeded 1.20. Conclusion: The range of severity of items provides measurement coverage across a wide range of severity of food insecurity for both adults and children. Both scales demonstrated acceptable levels of internal validity, although several items should be improved. The similarity of the response patterns in Isfahan and the U.S. suggests that food insecurity is experienced, managed, and described similarly in the two countries.

  8. The Iranian version of 12-item Short Form Health Survey (SF-12): factor structure, internal consistency and construct validity

    Directory of Open Access Journals (Sweden)

    Mousavi Sayed

    2009-09-01

    Background: The 12-item Short Form Health Survey (SF-12), a shorter alternative to the SF-36, is widely used in health outcomes surveys. The aim of this study was to validate the SF-12 in Iran. Methods: A random sample of the general population aged 15 years and over living in Tehran, Iran completed the SF-12. Reliability was estimated using internal consistency and validity was assessed using known groups comparison and convergent validity. In addition, the factor structure of the questionnaire was extracted by performing both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Results: In all, 5587 individuals were studied (2721 male and 2866 female). The mean age and formal education of the respondents were 35.1 (SD = 15.4) and 10.2 (SD = 4.4) years respectively. The results showed satisfactory internal consistency for both summary measures, namely the Physical Component Summary (PCS) and the Mental Component Summary (MCS); Cronbach's α for PCS-12 and MCS-12 was 0.73 and 0.72, respectively. Known-groups comparison showed that the SF-12 discriminated well between men and women and those who differed in age and educational status (P < 0.001). Conclusion: In general the findings suggest that the SF-12 is a reliable and valid measure of health related quality of life among the Iranian population. However, further studies are needed to establish stronger psychometric properties for this alternative form of the SF-36 Health Survey in Iran.

  9. The PROMIS fatigue item bank has good measurement properties in patients with fibromyalgia and severe fatigue.

    Science.gov (United States)

    Yost, Kathleen J; Waller, Niels G; Lee, Minji K; Vincent, Ann

    2017-06-01

    Efficient management of fibromyalgia (FM) requires precise measurement of FM-specific symptoms. Our objective was to assess the measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) fatigue item bank (FIB) in people with FM. We applied classical psychometric and item response theory methods to cross-sectional PROMIS-FIB data from two samples. Data on the clinical FM sample were obtained at a tertiary medical center. Data for the U.S. general population sample were obtained from the PROMIS network. The full 95-item bank was administered to both samples. We investigated dimensionality of the item bank in both samples by separately fitting a bifactor model with two group factors: experience and impact. We assessed measurement invariance between samples, and we explored an alternate factor structure with the normative sample and subsequently confirmed that structure in the clinical sample. Finally, we assessed whether reporting FM subdomain scores added value over reporting a single total score. The item bank was dominated by a general fatigue factor. The fit of the initial bifactor model and evidence of measurement invariance indicated that the same constructs were measured across the samples. An alternative bifactor model with three group factors demonstrated slightly improved fit. Subdomain scores add value over a total score. We demonstrated that the PROMIS-FIB is appropriate for measuring fatigue in clinical samples of FM patients. The construct can be represented by a single score; however, subdomain scores for the three group factors identified in the alternative model may also be reported.

  10. Measuring physical and mental health using the SF-12: implications for community surveys of mental health.

    Science.gov (United States)

    Windsor, Timothy D; Rodgers, Bryan; Butterworth, Peter; Anstey, Kaarin J; Jorm, Anthony F

    2006-09-01

    The effects of using different approaches to scoring the SF-12 summary scales of physical and mental health were examined with a view to informing the design and interpretation of community-based survey research. Data from a population-based study of 7485 participants in three cohorts aged 20-24, 40-44 and 60-64 years were used to examine relationships among measures of physical and mental health calculated from the same items using the SF-12 and RAND-12 approaches to scoring, and other measures of chronic physical conditions and psychological distress. A measure of physical health constructed using the RAND-12 scoring showed a monotonic negative association with psychological distress as measured by the Goldberg depression and anxiety scales. However, a non-monotonic association was evident in the relationship between SF-12 physical health scores and distress, with very high SF-12 physical health scores corresponding with high levels of distress. These relationships highlight difficulties in interpretation that can arise when using the SF-12 summary scales in some analytical contexts. It is recommended that community surveys that measure physical and mental functioning using the SF-12 items generate summary scores using the RAND-12 protocol in addition to the SF-12 approach. In general, researchers should be wary of using factor scores based on orthogonal rotation, which assumes that measures are uncorrelated, to represent constructs that have an actual association.

  11. What’s hampering measurement invariance: Detecting non-invariant items using clusterwise simultaneous component analysis

    Directory of Open Access Journals (Sweden)

    Kim De Roover

    2014-06-01

    The issue of measurement invariance is ubiquitous in the behavioral sciences nowadays as more and more studies yield multivariate multigroup data. When measurement invariance cannot be established across groups, this is often due to different loadings on only a few items. Within the multigroup CFA framework, methods have been proposed to trace such non-invariant items, but these methods have some disadvantages in that they require researchers to run a multitude of analyses and in that they imply assumptions that are often questionable. In this paper, we propose an alternative strategy which builds on clusterwise simultaneous component analysis (SCA). Clusterwise SCA, being an exploratory technique, assigns the groups under study to a few clusters based on differences and similarities in the covariance matrices, and thus based on the component structure of the items. Non-invariant items can then be traced by comparing the cluster-specific component loadings via congruence coefficients, which is far more parsimonious than comparing the component structure of all separate groups. In this paper we present a heuristic for this procedure. Afterwards, one can return to the multigroup CFA framework and check whether removing the non-invariant items or removing some of the equality restrictions for these items yields satisfactory invariance test results. An empirical application concerning cross-cultural emotion data is used to demonstrate that this novel approach is useful and can co-exist with the traditional CFA approaches.
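
    The comparison of cluster-specific loadings via congruence coefficients can be illustrated with Tucker's congruence coefficient, assuming that this common definition is the one intended; the record itself does not spell out the formula. The function name and loading values below are hypothetical.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    phi = sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2))
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator


# Hypothetical loadings of three items on one component in two clusters.
cluster_a = [0.8, 0.7, 0.6]
cluster_b = [0.4, 0.35, 0.3]   # proportional to cluster_a
phi_same = tucker_congruence(cluster_a, cluster_b)
phi_unrelated = tucker_congruence([1.0, 0.0], [0.0, 1.0])
```

    The coefficient is insensitive to a common scaling of the loadings, so proportional loading patterns yield a value of 1, while unrelated patterns yield a value near 0.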

  12. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency

    DEFF Research Database (Denmark)

    Rose, Matthias; Bjørner, Jakob; Gandek, Barbara

    2014-01-01

    OBJECTIVE: To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. STUDY DESIGN AND SETTING: The items were evaluated using qualitative and quantitative methods. A total...... response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. RESULTS: The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living...... to identify differences between age and disease groups. CONCLUSION: The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range....
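
    The standardized metric mentioned in this record, a mean of 50 and an SD of 10 in the reference population, corresponds to a linear rescaling of a latent trait estimate. A minimal sketch, assuming the estimate is already standardized (mean 0, SD 1) in the reference sample; the function and parameter names are illustrative.

```python
def to_t_score(theta: float, ref_mean: float = 0.0, ref_sd: float = 1.0) -> float:
    """Rescale a latent trait estimate to a metric with mean 50 and SD 10.

    ref_mean and ref_sd describe the reference population on the theta scale
    (assumed here to be 0 and 1 unless stated otherwise).
    """
    return 50.0 + 10.0 * (theta - ref_mean) / ref_sd


# A respondent 1.5 SD below the reference mean scores 35 on this metric.
t_low = to_t_score(-1.5)
t_mean = to_t_score(0.0)
```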

  13. A photographic method to measure food item intake. Validation in geriatric institutions.

    Science.gov (United States)

    Pouyet, Virginie; Cuvelier, Gérard; Benattar, Linda; Giboreau, Agnès

    2015-01-01

    From both a clinical and research perspective, measuring food intake is an important issue in geriatric institutions. However, weighing food in this context can be complex, particularly when the items remaining on a plate (side dish, meat or fish and sauce) need to be weighed separately following consumption. A method based on photography that involves taking photographs after a meal to determine food intake consequently seems to be a good alternative. This method enables the storage of raw data so that unhurried analyses can be performed to distinguish the food items present in the images. Therefore, the aim of this paper was to validate a photographic method to measure food intake in terms of differentiating food item intake in the context of a geriatric institution. Sixty-six elderly residents took part in this study, which was performed in four French nursing homes. Four dishes of standardized portions were offered to the residents during 16 different lunchtimes. Three non-trained assessors then independently estimated both the total and specific food item intakes of the participants using images of their plates taken after the meal (photographic method) and a reference image of one plate taken before the meal. Total food intakes were also recorded by weighing the food. To test the reliability of the photographic method, agreements between different assessors and agreements among various estimates made by the same assessor were evaluated. To test the accuracy and specificity of this method, food intake estimates for the four dishes were compared with the food intakes determined using the weighed food method. To illustrate the added value of the photographic method, food consumption differences between the dishes were explained by investigating the intakes of specific food items. 
    Although the assessors were not specifically trained for this purpose, the results demonstrated that their estimates agreed between assessors and among various estimates made by the same…

  14. The Aphasia Communication Outcome Measure (ACOM): Dimensionality, Item Bank Calibration, and Initial Validation

    Science.gov (United States)

    Hula, William D.; Doyle, Patrick J.; Stone, Clement A.; Hula, Shannon N. Austermann; Kellough, Stacey; Wambaugh, Julie L.; Ross, Katherine B.; Schumacher, James G.; St. Jacque, Ann

    2015-01-01

    Purpose: The purpose of this study is to investigate the structure and measurement properties of the Aphasia Communication Outcome Measure (ACOM), a patient-reported outcome measure of communicative functioning for persons with aphasia. Method: Three hundred twenty-nine participants with aphasia responded to 177 items asking about communicative…

  15. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples, we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (... German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
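
    The third stage described in this record, generating items from an item model by systematically combining content, can be sketched as template filling over all combinations of the variable elements. This is only an illustration of the general idea, not the authors' software; the stem, the element names, and the values are hypothetical.

```python
from itertools import product


def generate_items(stem_template, elements):
    """Generate item stems by filling a template with every combination
    of the variable elements.

    stem_template: string with named placeholders, e.g. "{age}".
    elements:      dict mapping each placeholder name to its allowed values.
    """
    names = list(elements)
    items = []
    for combo in product(*(elements[name] for name in names)):
        items.append(stem_template.format(**dict(zip(names, combo))))
    return items


# Hypothetical item model: one stem with two variable elements.
item_model = (
    "A {age}-year-old patient presents with {symptom}. "
    "What is the most appropriate next step?"
)
elements = {
    "age": [25, 60],
    "symptom": ["abdominal pain", "a groin mass", "fever"],
}
items = generate_items(item_model, elements)
```

    The number of generated stems is the product of the number of values per element (here 2 × 3 = 6), which is why a single item model with many elements can yield a large number of items. In practice, constraints defined by content specialists would remove implausible combinations.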

  17. What's hampering measurement invariance : Detecting non-invariant items using clusterwise simultaneous component analysis

    NARCIS (Netherlands)

    De Roover, K.; Timmerman, Marieke; De Leersnyder, J.; Mesquita, B.; Ceulemans, Eva

    2014-01-01

    The issue of measurement invariance is ubiquitous in the behavioral sciences nowadays as more and more studies yield multivariate multigroup data. When measurement invariance cannot be established across groups, this is often due to different loadings on only a few items. Within the multigroup CFA

  18. Measuring Diversity and Inclusion in Academic Medicine: The Diversity Engagement Survey (DES)

    Science.gov (United States)

    Person, Sharina D.; Jordan, C. Greer; Allison, Jeroan J.; Fink Ogawa, Lisa M.; Castillo-Page, Laura; Conrad, Sarah; Nivet, Marc A.; Plummer, Deborah L.

    2018-01-01

    Purpose: To produce a physician and scientific workforce capable of delivering high quality, culturally competent health care and research, academic medical centers must assess their capacity for diversity and inclusion and respond to identified opportunities. Thus, the Diversity Engagement Survey (DES) is presented as a diagnostic and benchmarking tool. Method: The 22-item DES connects workforce engagement theory with inclusion and diversity constructs. Face and content validity were established based on decades of previous work to promote institutional diversity. The survey was pilot tested at a single academic medical center and subsequently administered at 13 additional academic medical centers. Cronbach alphas assessed internal consistency and Confirmatory Factor Analysis (CFA) established construct validity. Criterion validity was assessed by observed separation in scores for groups traditionally recognized to have less workforce engagement. Results: The sample consisted of 13,694 individuals at 14 medical schools from across the U.S. who responded to the survey administered between 2011 and 2012. The Cronbach alphas for inclusion and engagement factors (range: 0.68 to 0.85), CFA fit indices, and item correlations with latent constructs indicated an acceptable model fit and that questions measured the intended concepts. DES scores clearly distinguished higher and lower performing institutions. The DES detected important disparities for black, women, and those who did not have heterosexual orientation. Conclusions: This study demonstrated that the DES is a reliable and valid instrument for internal assessment and evaluation or external benchmarking of institutional progress in building inclusion and engagement. PMID:26466376

  19. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  20. Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

    Science.gov (United States)

    Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

    2016-03-12

    Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.

  1. AAN Epilepsy Quality Measures in clinical practice: a survey of neurologists.

    Science.gov (United States)

    Wasade, Vibhangini S; Spanaki, Marianna; Iyengar, Revathi; Barkley, Gregory L; Schultz, Lonni

    2012-08-01

    Epilepsy Quality Measures (EQM) were developed by the American Academy of Neurology (AAN) to convey standardization and eliminate gaps and variations in the delivery of epilepsy care (Fountain et al., 2011 [1]). The aim of this study was to identify adherence to these measures and other emerging practice standards in epilepsy care. A 15-item survey was mailed to neurologists in Michigan, USA, inquiring about their practice patterns in relation to EQM. One hundred thirteen of the 792 surveyed Michigan Neurologists responded (14%). The majority (83% to 94%) addressed seizure type and frequency, reviewed EEG and MRI, and provided pregnancy counseling to women of childbearing potential. Our survey identified gaps in practice patterns such as counseling about antiepileptic drug (AED) side effects and knowledge about referral for surgical therapy of intractable epilepsy. Statistical significance in the responses on the AAN EQM was noted in relation to number of years in practice, number of epilepsy patients seen, and additional fellowship training in epilepsy. Practice patterns assessment in relation to other comorbidities revealed that although bone health and sudden unexplained death in epilepsy are addressed mainly in patients at risk, depression is infrequently discussed. The findings in this study indicate that additional educational efforts are needed to increase awareness and to improve quality of epilepsy care at various points of health care delivery. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Better assessment of physical function: item improvement is neglected but essential.

    Science.gov (United States)

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models

  3. The physical examination content of the Japanese National Health and Nutrition Survey: temporal changes.

    Science.gov (United States)

    Tanaka, Hisako; Imai, Shino; Nakade, Makiko; Imai, Eri; Takimoto, Hidemi

    2016-12-01

    Survey items of the Japan National Nutrition Survey (J-NNS) have changed over time. Several papers on dietary surveys have been published; however, to date, there are no in-depth papers regarding physical examinations. Therefore, we investigated changes in the survey items in the physical examinations performed in the J-NNS and the National Health and Nutrition Survey (NHNS), with the aim of incorporating useful data for future policy decisions. We summarized the description of physical examinations and marshalled the changes of survey items from the J-NNS and NHNS from 1946 to 2012. The physical examination is roughly classified into the following six components: some are relevant to anthropometric measurements, clinical measurements, physical symptoms, blood tests, lifestyle and medication by interview, and others. Items related to nutritional deficiency, such as anaemia and tendon reflex disappearance, and body weight measurements were collected during the early period, according to the instructions of the General Headquarters. From 1989, blood tests and measurement of physical activity were added, and serum total protein, total cholesterol, triglycerides, HDL-cholesterol, blood glucose, red blood corpuscles and haemoglobin measurements have been performed continuously for more than 20 years. This is the first report on the items of physical examination in the J-NNS and NHNS. Our research results provide basic information for the utilization of the J-NNS and NHNS, to researchers, clinicians or policy makers. Monitoring the current state correctly is essential for national health promotion, and also for improvement of the investigation methods to apply country-by-country comparisons.

  4. Thorndike, Thurstone and Rasch: A Comparison of Their Approaches to Item-Invariant Measurement.

    Science.gov (United States)

    Englehard, George, Jr.

    The methods used by E. L. Thorndike, L. L. Thurstone, and G. Rasch to address issues related to item-invariant measurement and the scoring of individual performance are compared. The analyses highlight the close connection among the three methods, and suggest that progress in measurement theory reflects the movement from essentially ad hoc methods…

  5. A Factor Analysis of Need-Fulfillment Items Designed to Measure Maslow Need Categories

    Science.gov (United States)

    Waters, L. K.; Roach, Darrell

    1973-01-01

    The purpose of the present study was to factor analyze a set of items frequently used to measure Maslow need categories to obtain further information on their structure in relation to the Maslow system. (Author)

  6. Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

    Science.gov (United States)

    Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

    2015-07-01

    The Geriatric Anxiety Scale (GAS; Segal et al. (Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.

  7. A single-item global job satisfaction measure is associated with quantitative blood immune indices in white-collar employees.

    Science.gov (United States)

    Nakata, Akinori; Irie, Masahiro; Takahashi, Masaya

    2013-01-01

    Although a single-item job satisfaction measure has been shown to be as reliable and inclusive as multiple-item scales in relation to health, studies including immunological data are few. The purpose of this study was to evaluate the validity of single-item job and family life satisfaction measures based on their association with immune indices. A total of 189 white-collar employees (70% men) underwent a blood draw for the measurement of natural killer (NK), total T, and B cell counts as well as plasma immunoglobulin (Ig) G concentrations and completed single-item job and family life satisfaction measures. The response options for the satisfaction measures ranged from 'dissatisfied' (coded 1) to 'satisfied' (coded 4). Spearman's partial correlations controlling for cofactors revealed that increased job satisfaction was positively associated with NK cells (rsp=0.201, p=0.007) and IgG (rsp=0.178, p=0.018), while family life satisfaction was unrelated to immune indices. Those who reported a combination of low job/low family life satisfaction had significantly lower NK and higher B cell counts than those with high job/high family life satisfaction. Our study suggests that the single-item summary measure of job satisfaction, but not family life satisfaction, may be a valid tool to evaluate immune status in healthy white-collar employees.

  8. Items to be reflected to the nuclear power safety measures in Japan (concerning the examination, design and operation management) (excluding the items to be reflected to the standards)

    Energy Technology Data Exchange (ETDEWEB)

    1980-10-01

    In connection with the Three Mile Island nuclear power accident in March, 1979, in the United States, in order to introduce the lessons from it in the nuclear power safety regulations in Japan, 52 items to be reflected to the nuclear power safety measures were chosen by the Nuclear Safety Commission. Of these, 16 items were examined by the Committee on Examination of Reactor Safety. It was decided that these results would be introduced in the nuclear safety regulations, by the Nuclear Safety Commission. The following 16 items are described. For the examination, four items concerning the automatic operation of safety systems and others; for the design, five items concerning a small rupture accident, the monitoring of the state of primary coolant, control room layout and others; for the operation management, seven items concerning the inspection at the time of repair, the prevention of faulty handlings by operators and others.

  9. Assessing Impact, DIF, and DFF in Accommodated Item Scores: A Comparison of Multilevel Measurement Model Parameterizations

    Science.gov (United States)

    Beretvas, S. Natasha; Cawthon, Stephanie W.; Lockhart, L. Leland; Kaye, Alyssa D.

    2012-01-01

    This pedagogical article is intended to explain the similarities and differences between the parameterizations of two multilevel measurement model (MMM) frameworks. The conventional two-level MMM that includes item indicators and models item scores (Level 1) clustered within examinees (Level 2) and the two-level cross-classified MMM (in which item…

  10. Psychometric evaluation of Persian Nomophobia Questionnaire: Differential item functioning and measurement invariance across gender.

    Science.gov (United States)

    Lin, Chung-Ying; Griffiths, Mark D; Pakpour, Amir H

    2018-03-01

    Background and aims: Research examining problematic mobile phone use has increased markedly over the past 5 years and has been related to "no mobile phone phobia" (so-called nomophobia). The 20-item Nomophobia Questionnaire (NMP-Q) is the only instrument that assesses nomophobia with an underlying theoretical structure and robust psychometric testing. This study aimed to confirm the construct validity of the Persian NMP-Q using Rasch and confirmatory factor analysis (CFA) models. Methods: After ensuring the linguistic validity, Rasch models were used to examine the unidimensionality of each Persian NMP-Q factor among 3,216 Iranian adolescents and CFAs were used to confirm its four-factor structure. Differential item functioning (DIF) and multigroup CFA were used to examine whether males and females interpreted the NMP-Q similarly, including item content and NMP-Q structure. Results: Each factor was unidimensional according to the Rasch findings, and the four-factor structure was supported by CFA. Two items did not quite fit the Rasch models (Item 14: "I would be nervous because I could not know if someone had tried to get a hold of me;" Item 9: "If I could not check my smartphone for a while, I would feel a desire to check it"). No DIF items were found across gender and measurement invariance was supported in multigroup CFA across gender. Conclusions: Due to the satisfactory psychometric properties, it is concluded that the Persian NMP-Q can be used to assess nomophobia among adolescents. Moreover, NMP-Q users may compare its scores between genders in the knowledge that there are no score differences contributed by different understandings of NMP-Q items.

  11. Measuring Collective Efficacy: A Multilevel Measurement Model for Nested Data

    Science.gov (United States)

    Matsueda, Ross L.; Drakulich, Kevin M.

    2016-01-01

    This article specifies a multilevel measurement model for survey response when data are nested. The model includes a test-retest model of reliability, a confirmatory factor model of inter-item reliability with item-specific bias effects, an individual-level model of the biasing effects due to respondent characteristics, and a neighborhood-level…

  12. Assessing cross-cultural item bias in questionnaires: Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

    OpenAIRE

    Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

    2001-01-01

    A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one, that measure the same theoretical construct. The standardized difference between the score means of the item versions, called the ?e score, gives an indication of the cultural bias of the item. As...

  13. Individuals with knee impairments identify items in need of clarification in the Patient Reported Outcomes Measurement Information System (PROMIS®) pain interference and physical function item banks - a qualitative study.

    Science.gov (United States)

    Lynch, Andrew D; Dodds, Nathan E; Yu, Lan; Pilkonis, Paul A; Irrgang, James J

    2016-05-11

    The content and wording of the Patient Reported Outcome Measurement Information System (PROMIS) Physical Function and Pain Interference item banks have not been qualitatively assessed by individuals with knee joint impairments. The purpose of this investigation was to identify items in the PROMIS Physical Function and Pain Interference Item Banks that are irrelevant, unclear, or otherwise difficult to respond to for individuals with impairment of the knee and to suggest modifications based on cognitive interviews. Twenty-nine individuals with knee joint impairments qualitatively assessed items in the Pain Interference and Physical Function Item Banks in a mixed-methods cognitive interview. Field notes were analyzed to identify themes and frequency counts were calculated to identify items not relevant to individuals with knee joint impairments. Issues with clarity were identified in 23 items in the Physical Function Item Bank, resulting in the creation of 43 new or modified items, typically changing words within the item to be clearer. Interpretation issues included whether or not the knee joint played a significant role in overall health and age/gender differences in items. One quarter of the original items (31 of 124) in the Physical Function Item Bank were identified as irrelevant to the knee joint. All 41 items in the Pain Interference Item Bank were identified as clear, although individuals without significant pain substituted other symptoms which interfered with their life. The Physical Function Item Bank would benefit from additional items that are relevant to individuals with knee joint impairments and, by extension, to other lower extremity impairments. Several issues in clarity were identified that are likely to be present in other patient cohorts as well.

  14. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  15. Items to be reflected to the nuclear power safety measures in Japan (concerning the examination, design and operation management) (excluding the items to be reflected to the standards)

    International Nuclear Information System (INIS)

    1980-01-01

    In connection with the Three Mile Island nuclear power accident in March, 1979, in the United States, in order to introduce the lessons from it in the nuclear power safety regulations in Japan, 52 items to be reflected to the nuclear power safety measures were chosen by the Nuclear Safety Commission. Of these, 16 items were examined by the Committee on Examination of Reactor Safety. It was decided that these results would be introduced in the nuclear safety regulations, by the Nuclear Safety Commission. The following 16 items are described. For the examination, four items concerning the automatic operation of safety systems and others; for the design, five items concerning a small rupture accident, the monitoring of the state of primary coolant, control room layout and others; for the operation management, seven items concerning the inspection at the time of repair, the prevention of faulty handlings by operators and others. (J.P.N.)

  16. Normative data for the 12 item WHO Disability Assessment Schedule 2.0.

    Directory of Open Access Journals (Sweden)

    Gavin Andrews

    Full Text Available BACKGROUND: The World Health Organization Disability Assessment Schedule (WHODAS 2.0) measures disability due to health conditions including diseases, illnesses, injuries, mental or emotional problems, and problems with alcohol or drugs. METHOD: The 12 Item WHODAS 2.0 was used in the second Australian Survey of Mental Health and Well-being. We report the overall factor structure and the distribution of scores and normative data (means and SDs) for people with any physical disorder, any mental disorder and for people with neither. FINDINGS: A single second order factor justifies the use of the scale as a measure of global disability. People with mental disorders had high scores (mean 6.3, SD 7.1), and people with physical disorders had lower scores (mean 4.3, SD 6.1). People with no disorder covered by the survey had low scores (mean 1.4, SD 3.6). INTERPRETATION: The provision of normative data from a population sample of adults will facilitate use of the WHODAS 2.0 12 item scale in clinical and epidemiological research.

  17. An Item Bank for Abuse of Prescription Pain Medication from the Patient-Reported Outcomes Measurement Information System (PROMIS®).

    Science.gov (United States)

    Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Hilton, Thomas F; Daley, Dennis C; Patkar, Ashwin A; McCarty, Dennis

    2017-08-01

    There is a need to monitor patients receiving prescription opioids to detect possible signs of abuse. To address this need, we developed and calibrated an item bank for severity of abuse of prescription pain medication as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 5,310 items relevant to substance use and abuse, including abuse of prescription pain medication, from over 80 unique instruments. After qualitative item analysis (i.e., focus groups, cognitive interviewing, expert review, and item revision), 25 items for abuse of prescribed pain medication were included in field testing. Items were written in a first-person, past-tense format, with a three-month time frame and five response options reflecting frequency or severity. The calibration sample included 448 respondents, 367 from the general population (ascertained through an internet panel) and 81 from community treatment programs participating in the National Drug Abuse Treatment Clinical Trials Network. A final bank of 22 items was calibrated using the two-parameter graded response model from item response theory. A seven-item static short form was also developed. The test information curve showed that the PROMIS® item bank for abuse of prescription pain medication provided substantial information in a broad range of severity. The initial psychometric characteristics of the item bank support its use as a computerized adaptive test or short form, with either version providing a brief, precise, and efficient measure relevant to both clinical and community samples. © 2016 American Academy of Pain Medicine. All rights reserved.
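
    As an illustration of the two-parameter graded response model named in this abstract, the sketch below computes category probabilities for one item with five response options. The slope and threshold values are made up for the example and are not taken from the study; the function name is likewise hypothetical.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level of the respondent
    a          -- item discrimination (slope)
    thresholds -- ordered thresholds b_1 < ... < b_(K-1) for K categories
    """
    # Cumulative probability of responding in category k or higher;
    # padded with 1.0 (lowest category or higher) and 0.0 (above highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical item: five response options, hence four thresholds.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

    The probabilities always sum to one, and a larger slope concentrates them more sharply around the respondent's trait level.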

  18. The effect of sociodemographic (mis)match between interviewers and respondents on unit and item nonresponse in Belgium.

    Science.gov (United States)

    Vercruyssen, Anina; Wuyts, Celine; Loosveldt, Geert

    2017-09-01

    Interviewer characteristics affect nonresponse and measurement errors in face-to-face surveys. Some studies have shown that mismatched sociodemographic characteristics - for example gender - affect people's behavior when interacting with an interviewer at the door and during the survey interview, resulting in more nonresponse. We investigate the effect of sociodemographic (mis)matching on nonresponse in two successive rounds of the European Social Survey in Belgium. As such, we replicate the analyses of the effect of (mis)matching gender and age on unit nonresponse on the one hand, and of gender, age and education level (mis)matching on item nonresponse on the other hand. Recurring effects of sociodemographic (mis)match are found for both unit and item nonresponse. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  20. Grouping of Items in Mobile Web Questionnaires

    Science.gov (United States)

    Mavletova, Aigul; Couper, Mick P.

    2016-01-01

    There is some evidence that a scrolling design may reduce breakoffs in mobile web surveys compared to a paging design, but there is little empirical evidence to guide the choice of the optimal number of items per page. We investigate the effect of the number of items presented on a page on data quality in two types of questionnaires: with or…

  1. Intake of natural radioactivity through dietary items: a prelude to preoperational environmental survey at Kudankulam

    International Nuclear Information System (INIS)

    Varughese, K.G.; Kumar, M.; George, Thomas; Sunder Rajan, P.; Vijay Kumar, B.; Rajan, M.P.

    2008-01-01

    High background radiation is found in nature in some parts of Australia, Brazil, China, Iran, India etc. Kanyakumari district in southern peninsular India is one such NHBRA (natural high background radiation area), with monazite placers along the coast. Although general radiation levels in this area have been investigated by many researchers in the past, information on the impact of this high background radioactivity on the flora and fauna is scarce. In the present investigation, a radiation survey was carried out in high background areas, with special attention to vegetables and crops grown there. The studies are centered on the 2x1000 MWe Kudankulam Nuclear Power Project (KKNPP) site, which is about 25 km from Kanyakumari. Samples of soil, sand, vegetation and other food items were collected from the 30 km radial zone of the KKNPP site and analysed for naturally occurring radionuclides such as 238U, 232Th and 40K. The intake of natural radioactivity through food items produced in this area is found to be very small, and the internal dose to the general population living in this high natural background area is insignificant. (author)

  2. Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

    2015-05-01

    To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. Spinal Cord Injury--Quality of Life (SCI-QOL) Depression Item Bank. Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.

  3. Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

    Science.gov (United States)

    Gordon, Rachel A.

    2015-01-01

    This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its…

  4. Macrostructural Treatment of Multi-word Lexical Items

    Directory of Open Access Journals (Sweden)

    Alenka Vrbinc

    2011-05-01

    Full Text Available The paper discusses the macrostructural treatment of multi-word lexical items in mono- and bilingual dictionaries. First, the classification of multi-word lexical items is presented, and special attention is paid to the discussion of compounds – a specific group of multi-word lexical items that is most commonly afforded headword status but whose inclusion in the headword list may also depend on spelling. Then the inclusion of multi-word lexical items in monolingual dictionaries is dealt with in greater detail, while the results of a short survey on the inclusion of five randomly chosen multi-word lexical items in seven English monolingual dictionaries are presented. The proposals as to how to treat these five multi-word lexical items in bilingual dictionaries are presented in the section about the inclusion of multi-word lexical items in bilingual dictionaries. The conclusion is that it is most important to take the users’ needs into consideration and to make any dictionary as user friendly as possible.

  5. The Spanish version of the Self-Determination Inventory Student Report: application of item response theory to self-determination measurement.

    Science.gov (United States)

    Mumbardó-Adam, C; Guàrdia-Olmos, J; Giné, C; Raley, S K; Shogren, K A

    2018-04-01

    A new measure of self-determination, the Self-Determination Inventory: Student Report (Spanish version), has recently been adapted and empirically validated in the Spanish language. As it is the first instrument intended to measure self-determination in youth with and without disabilities, there is a need to further explore and strengthen its psychometric analysis based on item response patterns. Through an item response theory approach, this study examined item observed distributions across the essential characteristics of self-determination. The results demonstrated satisfactory to excellent item functioning patterns across characteristics, particularly within agentic action domains. Increased variability across items was also found within action-control beliefs dimensions, specifically within the self-realisation subdomain. These findings further support the instrument's psychometric properties and outline future research directions. © 2017 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.

  6. Surveying for "artifacts": the susceptibility of the OCB-performance evaluation relationship to common rater, item, and measurement context effects.

    Science.gov (United States)

    Podsakoff, Nathan P; Whiting, Steven W; Welsh, David T; Mai, Ke Michael

    2013-09-01

    Despite the increased attention paid to biases attributable to common method variance (CMV) over the past 50 years, researchers have only recently begun to systematically examine the effect of specific sources of CMV in previously published empirical studies. Our study contributes to this research by examining the extent to which common rater, item, and measurement context characteristics bias the relationships between organizational citizenship behaviors and performance evaluations using a mixed-effects analytic technique. Results from 173 correlations reported in 81 empirical studies (N = 31,146) indicate that even after controlling for study-level factors, common rater and anchor point number similarity substantially biased the focal correlations. Indeed, these sources of CMV (a) led to estimates that were between 60% and 96% larger when comparing measures obtained from a common rater, versus different raters; (b) led to 39% larger estimates when a common source rated the scales using the same number, versus a different number, of anchor points; and (c) when taken together with other study-level predictors, accounted for over half of the between-study variance in the focal correlations. We discuss the implications for researchers and practitioners and provide recommendations for future research. PsycINFO Database Record (c) 2013 APA, all rights reserved

  7. Development and Validation of the Poverty Attributions Survey

    Science.gov (United States)

    Bennett, Robert M.; Raiz, Lisa; Davis, Tamara S.

    2016-01-01

    This article describes the process of developing and testing the Poverty Attribution Survey (PAS), a measure of poverty attributions. The PAS is theory based and includes original items as well as items from previously tested poverty attribution instruments. The PAS was electronically administered to a sample of state-licensed professional social…

  8. Improving Inpatient Surveys: Web-Based Computer Adaptive Testing Accessed via Mobile Phone QR Codes.

    Science.gov (United States)

    Chien, Tsair-Wei; Lin, Weir-Sen

    2016-03-02

    The National Health Service (NHS) 70-item inpatient questionnaire surveys inpatients on their perceptions of their hospitalization experience. However, it imposes more burden on the patient than other similar surveys. The literature shows that computerized adaptive testing (CAT) based on item response theory can help shorten the item length of a questionnaire without compromising its precision. Our aim was to investigate whether CAT can be (1) efficient with item reduction and (2) used with quick response (QR) codes scanned by mobile phones. After downloading the 2008 inpatient survey data from the Picker Institute Europe website and analyzing the difficulties of this 70-item questionnaire, we used an author-made Excel program using the Rasch partial credit model to simulate 1000 patients' true scores following a standard normal distribution. The CAT was compared to two other scenarios of answering all items (AAI) and the randomized selection method (RSM), as we investigated item length (efficiency) and measurement accuracy. The author-made Web-based CAT program for gathering patient feedback was effectively accessed from mobile phones by scanning the QR code. We found that the CAT can be more efficient for patients answering questions (ie, fewer items to respond to) than either AAI or RSM without compromising its measurement accuracy. A Web-based CAT inpatient survey accessed by scanning a QR code on a mobile phone was viable for gathering inpatient satisfaction responses. With advances in technology, patients can now be offered alternatives for providing feedback about hospitalization satisfaction. This Web-based CAT is a possible option in health care settings for reducing the number of survey items, as well as offering an innovative QR code access.
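
    To make the item-reduction idea concrete, the sketch below shows one common CAT selection rule: administer next the unused item with the highest information at the current trait estimate. It is a simplification under stated assumptions. It uses the dichotomous Rasch model rather than the partial credit model the authors describe, the item difficulties are invented, and the abstract does not say which selection rule the authors' program applied.

```python
import math

def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a dichotomous Rasch item: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def select_next_item(theta, difficulties, administered):
    """Index of the not-yet-administered item with maximum information."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))

# Hypothetical five-item bank; item 2 has already been answered.
difficulties = [-2.0, -0.5, 0.3, 1.2, 2.5]
next_item = select_next_item(theta=0.0, difficulties=difficulties, administered={2})
print(next_item)
```

    Because Rasch information peaks where item difficulty equals the trait estimate, the rule picks the remaining item whose difficulty is closest to the current estimate, which is why a CAT can reach a given precision with fewer items than a fixed or random sequence.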

  9. Assessing nicotine dependence in adolescent E-cigarette users: The 4-item Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for electronic cigarettes.

    Science.gov (United States)

    Morean, Meghan E; Krishnan-Sarin, Suchitra; S O'Malley, Stephanie

    2018-04-26

    Adolescent e-cigarette use (i.e., "vaping") likely confers risk for developing nicotine dependence. However, there have been no studies assessing e-cigarette nicotine dependence in youth. We evaluated the psychometric properties of the 4-item Patient-Reported Outcomes Measurement Information System Nicotine Dependence Item Bank for E-cigarettes (PROMIS-E) for assessing youth e-cigarette nicotine dependence and examined risk factors for experiencing stronger dependence symptoms. In 2017, 520 adolescent past-month e-cigarette users completed the PROMIS-E during a school-based survey (50.5% female, 84.8% White, 16.22[1.19] years old). Adolescents also reported on sex, grade, race, age at e-cigarette use onset, vaping frequency, nicotine e-liquid use, and past-month cigarette smoking. Analyses included conducting confirmatory factor analysis and examining the internal consistency of the PROMIS-E. Bivariate correlations and independent-samples t-tests were used to examine unadjusted relationships between e-cigarette nicotine dependence and the proposed risk factors. Regression models were run in which all potential risk factors were entered as simultaneous predictors of PROMIS-E scores. The single-factor structure of the PROMIS-E was confirmed and evidenced good internal consistency. Across models, larger PROMIS-E scores were associated with being in a higher grade, initiating e-cigarette use at an earlier age, vaping more frequently, using nicotine e-liquid (and higher nicotine concentrations), and smoking cigarettes. Adolescent e-cigarette users reported experiencing nicotine dependence, which was assessed using the psychometrically sound PROMIS-E. Experiencing stronger nicotine dependence symptoms was associated with characteristics that previously have been shown to confer risk for frequent vaping and tobacco cigarette dependence. Copyright © 2018 Elsevier B.V. All rights reserved.

  10. Using item response theory to address vulnerabilities in FFQ.

    Science.gov (United States)

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  11. Teoria da Resposta ao Item Teoria de la respuesta al item Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    Full Text Available The concern with measuring psychological traits is an old one, and many studies and proposed methods have been developed toward this goal. Prominent among these is Item Response Theory (IRT), which initially arose to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it takes each item into account individually, without focusing on total scores; therefore, the findings do not depend only on the test or questionnaire, but on each item that composes it. This article aims to present this theory, which revolutionized measurement theory.

  12. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect

    DEFF Research Database (Denmark)

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-01-01

    AIMS: To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). METHODS: We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a ...

  13. Measurement properties of a novel survey to assess stages of organizational readiness for evidence-based interventions in community chronic disease prevention settings

    Directory of Open Access Journals (Sweden)

    Stamatakis Katherine A

    2012-07-01

    Full Text Available Abstract Background There is a great deal of variation in the existing capacity of primary prevention programs and policies addressing chronic disease to deliver evidence-based interventions (EBIs). In order to develop and evaluate implementation strategies that are tailored to the appropriate level of capacity, there is a need for an easy-to-administer tool to stage organizational readiness for EBIs. Methods Based on theoretical frameworks, including Rogers’ Diffusion of Innovations, we developed a survey instrument to measure four domains representing stages of readiness for EBI: awareness, adoption, implementation, and maintenance. A separate scale representing organizational climate as a potential mediator of readiness for EBIs was also included in the survey. Twenty-three questions comprised the four domains, with four to nine items each, using a seven-point response scale. Representatives from obesity, asthma, diabetes, and tobacco prevention programs serving diverse populations in the United States were surveyed (N = 243); test-retest reliability was assessed with 92 respondents. Results Confirmatory factor analysis (CFA) was used to test and refine readiness scales. Test-retest reliability of the readiness scales, as measured by intraclass correlation, ranged from 0.47–0.71. CFA found good fit for the five-item adoption and implementation scales and resulted in revisions of the awareness and maintenance scales. The awareness scale was split into two two-item scales, representing community and agency awareness. The maintenance scale was split into five- and four-item scales, representing infrastructural maintenance and evaluation maintenance, respectively. Internal reliability of scales (Cronbach’s α) ranged from 0.66–0.78. The model for the final revised scales approached good fit, with most factor loadings >0.6 and all >0.4. Conclusions The lack of adequate measurement tools hinders progress in dissemination and implementation
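
    The internal reliability statistic reported in this abstract, Cronbach's α, can be computed from a respondents-by-items score matrix. The sketch below uses the standard formula with sample variances; the function name and the response data (on a seven-point scale) are invented for illustration and are not from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])  # number of items in the scale
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses: six respondents, five items, seven-point scale.
responses = [
    [5, 6, 5, 7, 6],
    [3, 3, 4, 2, 3],
    [6, 7, 6, 6, 7],
    [2, 3, 2, 3, 2],
    [4, 4, 5, 4, 5],
    [7, 6, 7, 7, 6],
]
print(round(cronbach_alpha(responses), 3))
```

    Values closer to 1 indicate that the items vary together; the formula needs at least two items and at least two respondents.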

  14. Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

    Science.gov (United States)

    Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

    2018-02-01

    Objectives (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design IRT evaluation of prospective cohort data. Setting Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items and were removed in stages, creating an 8- and 3-item Inner EAR scale for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.

  15. Using Item Response Theory to Develop Measures of Acquisitive and Protective Self-Monitoring From the Original Self-Monitoring Scale.

    Science.gov (United States)

    Wilmot, Michael P; Kostal, Jack W; Stillwell, David; Kosinski, Michal

    2017-07-01

    For the past 40 years, the conventional univariate model of self-monitoring has reigned as the dominant interpretative paradigm in the literature. However, recent findings associated with an alternative bivariate model challenge the conventional paradigm. In this study, item response theory is used to develop measures of the bivariate model of acquisitive and protective self-monitoring using original Self-Monitoring Scale (SMS) items, and data from two large, nonstudent samples (Ns = 13,563 and 709). Results indicate that the new acquisitive (six-item) and protective (seven-item) self-monitoring scales are reliable, unbiased in terms of gender and age, and demonstrate theoretically consistent relations to measures of personality traits and cognitive ability. Additionally, by virtue of using original SMS items, previously collected responses can be reanalyzed in accordance with the alternative bivariate model. Recommendations for the reanalysis of archival SMS data, as well as directions for future research, are provided.

  16. Cognitive interviewing methodology in the development of a pediatric item bank: a patient reported outcomes measurement information system (PROMIS) study

    Directory of Open Access Journals (Sweden)

    DeWalt Darren A

    2009-01-01

    Abstract Background The evaluation of patient-reported outcomes (PROs) in health care has seen greater use in recent years, and methods to improve the reliability and validity of PRO instruments are advancing. This paper discusses the cognitive interviewing procedures employed by the Patient Reported Outcomes Measurement Information System (PROMIS) pediatrics group for the purpose of developing a dynamic, electronic item bank for field testing with children and adolescents using novel computer technology. The primary objective of this study was to conduct cognitive interviews with children and adolescents to gain feedback on items measuring physical functioning, emotional health, social health, fatigue, pain, and asthma-specific symptoms. Methods A total of 88 cognitive interviews were conducted with 77 children and adolescents across two sites on 318 items. From this initial item bank, 25 items were deleted and 35 were revised and underwent a second round of cognitive interviews. A total of 293 items were retained for field testing. Results Children as young as 8 years of age were able to comprehend the majority of items, response options, directions, recall period, and identify problems with language that was difficult for them to understand. Cognitive interviews indicated issues with item comprehension on several items which led to alternative wording for these items. Conclusion Children ages 8–17 years were able to comprehend most item stems and response options in the present study. Field testing with the resulting items and response options is presently being conducted as part of the PROMIS Pediatric Item Bank development process.

  17. Clusters of cultures: diversity in meaning of family value and gender role items across Europe.

    Science.gov (United States)

    van Vlimmeren, Eva; Moors, Guy B D; Gelissen, John P T M

    2017-01-01

    Survey data are often used to map cultural diversity by aggregating scores of attitude and value items across countries. However, this procedure only makes sense if the same concept is measured in all countries. In this study we argue that when (co)variances among sets of items are similar across countries, these countries share a common way of assigning meaning to the items. Clusters of cultures can then be observed by doing a cluster analysis on the (co)variance matrices of sets of related items. This study focuses on family values and gender role attitudes. We find four clusters of cultures that assign a distinct meaning to these items, especially in the case of gender roles. Some of these differences reflect response style behavior in the form of acquiescence. Adjusting for this style effect impacts on country comparisons hence demonstrating the usefulness of investigating the patterns of meaning given to sets of items prior to aggregating scores into cultural characteristics.

  18. Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form.

    Science.gov (United States)

    Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes, we found this to be negligible, with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  19. Trends in Sexual Orientation Missing Data Over a Decade of the California Health Interview Survey

    Science.gov (United States)

    Viana, Joseph; Grant, David; Cochran, Susan D.; Lee, Annie C.; Ponce, Ninez A.

    2015-01-01

    Objectives. We explored changes in sexual orientation question item completion in a large statewide health survey. Methods. We used 2003 to 2011 California Health Interview Survey data to investigate sexual orientation item nonresponse and sexual minority self-identification trends in a cross-sectional sample representing the noninstitutionalized California household population aged 18 to 70 years (n = 182 812 adults). Results. Asians, Hispanics, limited-English-proficient respondents, and those interviewed in non-English languages showed the greatest declines in sexual orientation item nonresponse. Asian women, regardless of English-proficiency status, had the highest odds of item nonresponse. Spanish interviews produced more nonresponse than English interviews and Asian-language interviews produced less nonresponse when we controlled for demographic factors and survey cycle. Sexual minority self-identification increased in concert with the item nonresponse decline. Conclusions. Sexual orientation nonresponse declines and the increase in sexual minority identification suggest greater acceptability of sexual orientation assessment in surveys. Item nonresponse rate convergence among races/ethnicities, language proficiency groups, and interview languages shows that sexual orientation can be measured in surveys of diverse populations. PMID:25790399

  20. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...... that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  1. Does item overlap render measured relationships between pain and challenging behaviour trivial? Results from a multicentre cross-sectional study in 13 German nursing homes.

    Science.gov (United States)

    Kutschar, Patrick; Bauer, Zsuzsa; Gnass, Irmela; Osterbrink, Jürgen

    2017-07-01

    Several studies suggest that pain is a trigger for challenging behaviour in older adults with cognitive impairment. However, such measured relationships might be confounded due to item overlap as instruments share similar or identical items. The purpose of this study was to examine whether the frequently observed association between pain and challenging behaviour might be traced back to item overlap. This multicentre cross-sectional study was conducted in 13 nursing homes and examined pain (measure: Pain Assessment in Advanced Dementia Scale) and challenging behaviour (measure: Cohen-Mansfield Agitation Inventory) in 150 residents with severe cognitive impairment. The extent of item overlap was determined by juxtaposition of both measures' original items. As expected, comparison between these instruments revealed an extensive item overlap. The statistical relationship between the two phenomena can be traced back mainly to the contribution of the overlapping items, which renders the frequently stated relationship between pain and challenging behaviour trivial. The status quo of measuring such associations must be contested: constructs' discrimination and instruments' discrimination have to be discussed critically as item overlap may lead to biased conclusions and assumptions in research as well as to inadequate care measures in nursing practice. © 2017 John Wiley & Sons Ltd.

  2. An Introduction to Item Response Theory for Patient-Reported Outcome Measurement

    Science.gov (United States)

    Nguyen, Tam H.; Han, Hae-Ra; Kim, Miyong T.

    2015-01-01

    The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient-reported outcome (PRO) measures. Traditionally, the development and validation of these measures has been guided by classical test theory. However, item response theory (IRT), an alternate measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through classical methods. This paper introduces foundational concepts in IRT, as well as commonly used models and their assumptions. Existing data on a combined sample (n = 636) of Korean American and Vietnamese American adults who responded to the High Blood Pressure Health Literacy Scale and the Patient Health Questionnaire-9 are used to exemplify typical applications of IRT. These examples illustrate how IRT can be used to improve the development, refinement, and evaluation of PRO measures. Greater use of methods based on this framework can increase the accuracy and efficiency with which PROs are measured. PMID:24403095

  3. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    Four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implied a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. Graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30% of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  4. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool

    Directory of Open Access Journals (Sweden)

    Kelly L

    2015-05-01

    Laura Kelly, Crispin Jenkinson, Sarah Dummett, Jill Dawson, Ray Fitzpatrick, David Morley Health Services Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK Purpose: The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Methods: Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson's disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. Results: ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Conclusion: Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and

  5. ABORTION ATTITUDES, 1984-1987-1988 - EFFECTS OF ITEM ORDER AND DIMENSIONALITY

    NARCIS (Netherlands)

    TENVERGERT, E; GILLESPIE, MW; KINGMA, J; KLASEN, H

    The comparability of surveys is often hampered by differences in the item order of presentation. The major focus of the present study was to investigate whether a general item or a specific item at the beginning of the questionnaire would affect the endorsement as well as the scalability of a set of

  6. Measuring the quality of life in hypertension according to Item Response Theory.

    Science.gov (United States)

    Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; Andrade, Dalton Francisco de; Barbetta, Pedro Alberto; Souza, Ana Célia Caetano de; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia

    2017-05-04

    To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL - Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used Samejima's Graded Response Model. The analyses were conducted using the free software R with the aid of the psych and mirt packages. The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed with less psychometric data in the MINICHAL. We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new sides of this instrument that have not yet been addressed in previous studies.

  7. Factorial Structure and Age-Related Psychometrics of the MIDUS Personality Adjective Items across the Lifespan

    Science.gov (United States)

    Zimprich, Daniel; Allemand, Mathias; Lachman, Margie E.

    2014-01-01

    The present study addresses issues of measurement invariance and comparability of factor parameters of Big Five personality adjective items across age. Data from the Midlife in the United States (MIDUS) survey were used to investigate age-related developmental psychometrics of the MIDUS personality adjective items in two large cross-sectional samples (exploratory sample: N = 862; analysis sample: N = 3,000). After having established and replicated a comprehensive five-factor structure of the measure, increasing levels of measurement invariance were tested across ten age groups. Results indicate that the measure demonstrates strict measurement invariance in terms of number of factors and factor loadings. Also, we found that factor variances and covariances were equal across age groups. By contrast, a number of age-related factor mean differences emerged. The practical implications of these results are discussed and future research is suggested. PMID:21910548

  8. Measuring values for cross-cultural research

    NARCIS (Netherlands)

    Maseland, R.K.J.; Hoorn, A.A.J. van

    2009-01-01

    This paper investigates the empirical relevance of the recent critique that values surveys, as they are, suffer from the problem of measuring marginal preferences rather than values. By surveying items from cross-cultural surveys by Hofstede, Inglehart and GLOBE, we show that the marginal

  9. Survey research.

    Science.gov (United States)

    Alderman, Amy K; Salem, Barbara

    2010-10-01

    Survey research is a unique methodology that can provide insight into individuals' perspectives and experiences and can be collected on a large population-based sample. Specifically, in plastic surgery, survey research can provide patients and providers with accurate and reproducible information to assist with medical decision-making. When using survey methods in research, researchers should develop a conceptual model that explains the relationships of the independent and dependent variables. The items of the survey are of primary importance. Collected data are only useful if they accurately measure the concepts of interest. In addition, administration of the survey must follow basic principles to ensure an adequate response rate and representation of the intended target sample. In this article, the authors review some general concepts important for successful survey research and discuss the many advantages this methodology has for obtaining limitless amounts of valuable information.

  10. Item analysis of single-peaked response data : the psychometric evaluation of bipolar measurement scales

    NARCIS (Netherlands)

    Polak, Maaike Geertruida

    2011-01-01

    The thesis explains the fundamental difference between unipolar and bipolar measurement scales for psychological characteristics. We explore the use of correspondence analysis (CA), a technique that is similar to principal component analysis and is available in SAS and SPSS, to select items that

  11. Development of Rasch-based item banks for the assessment of work performance in patients with musculoskeletal diseases.

    Science.gov (United States)

    Mueller, Evelyn A; Bengel, Juergen; Wirtz, Markus A

    2013-12-01

    This study aimed to develop a self-description assessment instrument to measure work performance in patients with musculoskeletal diseases. In terms of the International Classification of Functioning, Disability and Health (ICF), work performance is defined as the degree of meeting the work demands (activities) at the actual workplace (environment). To account for the fact that work performance depends on the work demands of the job, we strived to develop item banks that allow a flexible use of item subgroups depending on the specific work demands of the patients' jobs. Item development included the collection of work tasks from literature and content validation through expert surveys and patient interviews. The resulting 122 items were answered by 621 patients with musculoskeletal diseases. Exploratory factor analysis to ascertain dimensionality and Rasch analysis (partial credit model) for each of the resulting dimensions were performed. Exploratory factor analysis resulted in four dimensions, and subsequent Rasch analysis led to the following item banks: 'impaired productivity' (15 items), 'impaired cognitive performance' (18), 'impaired coping with stress' (13) and 'impaired physical performance' (low physical workload 20 items, high physical workload 10 items). The item banks exhibited person separation indices (reliability) between 0.89 and 0.96. The assessment of work performance adds the activities component to the more commonly employed participation component of the ICF-model. The four item banks can be adapted to specific jobs where necessary without losing comparability of person measures, as the item banks are based on Rasch analysis.
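
    As a rough illustration of the partial credit model named in this abstract, the sketch below computes category probabilities for one polytomous item. The threshold values are hypothetical and are not taken from the study's item banks.

```python
import math


def pcm_probs(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person measure in logits.
    thresholds: step difficulties delta_1..delta_m in logits.
    Returns probabilities for categories 0..m.
    """
    # Numerator exponent for category x is sum_{k<=x} (theta - delta_k); 0 for x = 0.
    exponents = [0.0]
    for delta in thresholds:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item (two step difficulties).
probs = pcm_probs(theta=0.5, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])
```

    With a single threshold the function reduces to the dichotomous Rasch model, which is one way to sanity-check an implementation.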

  12. IDENTIFICATION OF MEASUREMENT ITEMS OF DESIGN REQUIREMENTS FOR LEAN AND AGILE SUPPLY CHAIN-CONFIRMATORY FACTOR ANALYSIS

    Directory of Open Access Journals (Sweden)

    D.Venkata Ramana

    2013-06-01

    This study examines the consistency approaches by confirmatory factor analysis that determines the construct validity, convergent validity, construct reliability and internal consistency of the items of strategic design requirements. The design requirements include use of information technology, sourcing procedures, new product development, flexible manufacturing functions and demand management supply chain network design, management, commitment and inventory management policies among manufacturers of volatile and unforeseeable products in Andhra Pradesh, India. This study suggested that the seven-factor model with 20 items of the leagile supply chain design requirements had a good fit. Further, the study showed a valid and reliable measurement to identify critical items among the design requirements of leagile supply chains.

  13. The importance of rating scale design in the measurement of patient-reported outcomes using questionnaires or item banks.

    Science.gov (United States)

    Khadka, Jyoti; McAlinden, Colm; Gothwal, Vijaya K; Lamoureux, Ecosse L; Pesudovs, Konrad

    2012-06-26

    To investigate the effect of rating scale designs (question formats and response categories) on item difficulty calibrations and assess the impact that rating scale differences have on overall vision-related activity limitation (VRAL) scores. Sixteen existing patient-reported outcome instruments (PROs) suitable for cataract assessment, with different rating scales, were self-administered by patients on a cataract surgery waiting list. A total of 226 VRAL items from these PROs in their native rating scales were included in an item bank and calibrated using Rasch analysis. Fifteen item/content areas (e.g., reading newspapers) appearing in at least three different PROs were identified. Within each content area, item calibrations were compared and their range calculated. Similarly, five PROs having at least three items in common with the Visual Function (VF-14) were compared in terms of average item measures. A total of 614 patients (mean age ± SD, 74.1 ± 9.4 years) participated. Items with the same content varied in their calibration by as much as two logits; "reading the small print" had the largest range (1.99 logits) followed by "watching TV" (1.60). Compared with the VF-14 (0.00 logits), the rating scale of the Visual Disability Assessment (1.13 logits) produced the most difficult items and the Cataract Symptom Scale (0.24 logits) produced the least difficult items. The VRAL item bank was suboptimally targeted to the ability level of the participants (2.00 logits). Rating scale designs have a significant effect on item calibrations. Therefore, constructing item banks from existing items in their native formats carries risks to face validity and transmission of problems inherent in existing instruments, such as poor targeting.

  14. Validation Study for the Brief Measure of Quality of Life and Quality of Care: A Questionnaire for the National Random Sampling Hospital Survey.

    Science.gov (United States)

    Shimizu, Megumi; Fujisawa, Daisuke; Kurihara, Miho; Sato, Kazuki; Morita, Tatsuya; Kato, Masashi; Miyashita, Mitsunori

    2017-08-01

    To monitor quality of life (QOL) for patients with cancer in a large population-based survey, we developed a short QOL and quality-of-care (QOC) questionnaire. To determine the validity and reliability of this new questionnaire for evaluating QOL in patients with cancer. Outpatients and inpatients at National Cancer Center Hospital East were administered a questionnaire, including the following items: the short QOL and QOC questionnaire (physical distress, pain, emotional distress, walk burden, and need for help with self-care; perceived general health status; and satisfaction with medical care and treatment by doctor, communication with doctor, support by health-care staff other than doctor, care for physical symptoms such as pain, and psychological care), the Functional Assessment of Cancer Therapy-General (FACT-G), the Cancer Care Evaluation Scale (CCES) for patients, and demographic and medical data. We then readministered the short QOL and QOC questionnaire. In total, 329 outpatients and 239 inpatients completed the survey (response rates: 80% and 90%, respectively). Total Cronbach α for the short QOL and QOC questionnaire was 0.83 for outpatients and 0.82 for inpatients. Items of the questionnaire correlated with cancer-specific measurements, FACT-G, and CCES. Intraclass correlation coefficients for all items of the questionnaire were 0.79 and 0.89 in each setting. Items of QOL and QOC did not correlate with each other. The validity and reliability of the short QOL and QOC questionnaire appear sufficient. This questionnaire enables continuous monitoring of patient QOL in large population-based surveys.

  15. Measuring determinants of career satisfaction of anesthesiologists: validation of a survey instrument.

    Science.gov (United States)

    Afonso, Anoushka M; Diaz, James H; Scher, Corey S; Beyl, Robbie A; Nair, Singh R; Kaye, Alan David

    2013-06-01

    To measure the parameter of job satisfaction among anesthesiologists. Survey instrument. Academic anesthesiology departments in the United States. 320 anesthesiologists who attended the annual meeting of the ASA in 2009 (95% response rate). The anonymous 50-item survey collected information on 26 independent demographic variables and 24 dependent ranked variables of career satisfaction among practicing anesthesiologists. Mean survey scores were calculated for each demographic variable and tested for statistically significant differences by analysis of variance. Questions within each domain that were internally consistent with each other within domains were identified by Cronbach's alpha ≥ 0.7. P-values ≤ 0.05 were considered statistically significant. Cronbach's alpha analysis showed strong internal consistency for 10 dependent outcome questions in the practice factor-related domain (α = 0.72), 6 dependent outcome questions in the peer factor-related domain (α = 0.71), and 8 dependent outcome questions in the personal factor-related domain (α = 0.81). Although age was not a variable, full-time status, early satisfaction within the first 5 years of practice, working with respected peers, and personal choice factors were all significantly associated with anesthesiologist job satisfaction. Improvements in factors related to job satisfaction among anesthesiologists may lead to higher early and current career satisfaction. Copyright © 2013 Elsevier Inc. All rights reserved.
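
    The internal-consistency criterion used here (Cronbach's alpha ≥ 0.7) can be sketched as follows. This is a minimal illustration using the standard formula; the response data are hypothetical and do not come from the survey.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows, each with k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    # Variance of the summed score across respondents.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents answering 3 items on a 1-5 scale.
data = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(data)
print(round(alpha, 2), alpha >= 0.7)
```

    Population variances are used for both numerator and denominator; using sample variances throughout would give the same ratio and therefore the same alpha.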

  16. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    The Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, the Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
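
    The logistic models with one, two and three parameters mentioned above can be written as one function, sketched here in its common textbook form. The parameter names (a for discrimination, b for difficulty, c for the lower asymptote) follow the usual convention and are not taken from the paper itself.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the three-parameter logistic model.

    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))

    Setting c = 0 gives the two-parameter model; additionally fixing a to a
    common value (here 1.0) gives the one-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# One-, two- and three-parameter curves evaluated at the same ability level.
theta = 0.0
p_1pl = irf_3pl(theta, b=0.0)
p_2pl = irf_3pl(theta, a=2.0, b=0.0)
p_3pl = irf_3pl(theta, a=2.0, b=0.0, c=0.2)
print(p_1pl, p_2pl, p_3pl)
```

    When ability equals item difficulty, the one- and two-parameter models both give a probability of 0.5, while a nonzero lower asymptote raises it to c + (1 - c) / 2.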

  17. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  18. [Wing 1 radiation survey and contamination report]

    International Nuclear Information System (INIS)

    Olsen, K.

    1991-01-01

    We have completed the 5480.11 survey for Wing 1. All area(s)/item(s) requested by the 5480.11 committee have been thoroughly surveyed and documented. Decontamination/disposal of contaminated items has been accomplished. The wing 1 survey was started on 8/13/90 and completed 9/18/90. However, the follow-up surveys were not completed until 2/18/91. We received the final set of smear samples for wing 1 on 1/13/91. A total of 5,495 smears were taken from wing 1 and a total of 465 smears were taken during the follow-up surveys. There were a total of 122 items found to have fixed contamination and 4 items with smearable contamination in excess of the limits specified in DOE ORDER 5480.11 (AR 3-7). The following area(s)/item(s) were not included in the 5480.11 survey: Hallways, Access panels, Men's and women's change rooms, Janitor closets, Wall lockers and item(s) stored in wing 1 hallways and room 1116. If our contract is renewed, we will include those areas in our survey according to your request of April 15, 1991.

  19. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

    Science.gov (United States)

    Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

    2016-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity

  20. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

    2017-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods: DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results: The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and

  1. Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

    Science.gov (United States)

    Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

    2012-01-01

    Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

  2. Psychometric evaluation of the pediatric and parent-proxy Patient-Reported Outcomes Measurement Information System and the Neurology and Traumatic Brain Injury Quality of Life measurement item banks in pediatric traumatic brain injury.

    Science.gov (United States)

    Bertisch, Hilary; Rivara, Frederick P; Kisala, Pamela A; Wang, Jin; Yeates, Keith Owen; Durbin, Dennis; Zonfrillo, Mark R; Bell, Michael J; Temkin, Nancy; Tulsky, David S

    2017-07-01

    The primary objective is to provide evidence of convergent and discriminant validity for the pediatric and parent-proxy versions of the Patient-Reported Outcomes Measurement Information System (PROMIS) Anxiety, Depression, Anger, Peer Relations, Mobility, Pain Interference, and Fatigue item banks, the Neurology Quality of Life measurement system (Neuro-QOL) Cognition-General Concerns and Stigma item banks, and the Traumatic Brain Injury Quality of Life (TBI-QOL) Executive Function and Headache item banks in a pediatric traumatic brain injury (TBI) sample. Participants were 134 parent-child (ages 8-18 years) dyads. Children all sustained TBI and the dyads completed outcome ratings 6 months after injury at one of six medical centers across the United States. Ratings included PROMIS, Neuro-QOL, and TBI-QOL item banks, as well as the Pediatric Quality of Life inventory (PedsQL), the Health Behavior Inventory (HBI), and the Strengths and Difficulties Questionnaire (SDQ) as legacy criterion measures against which these item banks were validated. The PROMIS, Neuro-QOL, and TBI-QOL item banks demonstrated good convergent validity, as evidenced by moderate to strong correlations with comparable scales on the legacy measures. PROMIS, Neuro-QOL, and TBI-QOL item banks showed weaker correlations with ratings of unrelated constructs on legacy measures, providing evidence of discriminant validity. Our results indicate that the constructs measured by the PROMIS, Neuro-QOL, and TBI-QOL item banks are valid in our pediatric TBI sample and that it is appropriate to use these standardized scores for our primary study analyses.

  3. Validation of a 15-item care-related regret coping scale for health-care professionals (RCS-HCP).

    Science.gov (United States)

    Courvoisier, Delphine Sophie; Cullati, Stephane; Ouchi, Rieko; Schmidt, Ralph Eric; Haller, Guy; Chopard, Pierre; Agoritsas, Thomas; Perneger, Thomas V

    2014-01-01

    Coping with difficult care-related situations is a common challenge for health-care professionals. How these professionals deal with the regrets they may experience following one of the many decisions and interventions they must make every day can have an impact on their own health and quality of life, and also on their patient care practices. To identify professionals most in need of extra support, development and validation of a tool measuring coping style are needed. We performed a survey of physicians and nurses of a French-speaking university hospital; 469 health-care professionals responded to the survey, and 175 responded to the same survey one month later. Regret was assessed with the regret coping scale developed for this study, self-report questions on the frequency of regretted situations and the intensity of regret. Construct validity was assessed using measures of health-care professionals' quality of life (including job and life satisfaction, and self-reported health) as well as sleep problems and depression. Based on factor analysis and item response analysis, the initial 31-item scale was shortened to 15 items, which measured three types of strategies: problem-focused strategies (i.e., trying to find solutions, talking to colleagues) and two types of emotion-focused strategies, A (i.e., self-blame, rumination) and B (e.g., acceptance, emotional distance). All subscales showed high internal consistency (α > 0.85). Overall, as expected, problem-focused and emotion-focused B strategies correlated with higher quality of life, fewer sleep problems and less depression, and emotion-focused A strategies showed the opposite pattern. The regret coping scale (RCS-HCP) is a valid and reliable measure of coping abilities of hospital-based health-care professionals.

  4. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  5. Measuring genetic knowledge: a brief survey instrument for adolescents and adults.

    Science.gov (United States)

    Fitzgerald-Butt, S M; Bodine, A; Fry, K M; Ash, J; Zaidi, A N; Garg, V; Gerhardt, C A; McBride, K L

    2016-02-01

    Basic knowledge of genetics is essential for understanding genetic testing and counseling. The lack of a written, English language, validated, published measure has limited our ability to evaluate genetic knowledge of patients and families. Here, we begin the psychometric analysis of a true/false genetic knowledge measure. The 18-item measure was completed by parents of children with congenital heart defects (CHD) (n = 465) and adolescents and young adults with CHD (age: 15-25, n = 196) with a mean total correct score of 12.6 [standard deviation (SD) = 3.5, range: 0-18]. Utilizing exploratory factor analysis, we determined that one to three correlated factors, or abilities, were captured by our measure. Through confirmatory factor analysis, we determined that the two factor model was the best fit. Although it was necessary to remove two items, the remaining items exhibited adequate psychometric properties in a multidimensional item response theory analysis. Scores for each factor were computed, and a sum-score conversion table was derived. We conclude that this genetic knowledge measure discriminates best at low knowledge levels and is therefore well suited to determine a minimum adequate amount of genetic knowledge. However, further reliability testing and validation in diverse research and clinical settings is needed. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  6. Quality Control in Survey Design: Evaluating a Survey of Educators’ Attitudes Concerning Differentiated Compensation

    OpenAIRE

    Kelly D. Bradley; Michael Peabody; Shannon O. Sampson

    2015-01-01

    This study utilized the Rasch model to assess the quality of a survey instrument designed to measure attitudes of administrators and teachers concerning a differentiated teacher compensation program piloted in Kentucky. Researchers addressing potentially contentious issues should ensure their methods stand up to rigorous criticism. The results indicate that the rating scale does not function as expected, with items being too easy to endorse. Future iterations of this survey should be revis...

  7. Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

    Science.gov (United States)

    Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank. A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.
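
    The slopes and thresholds mentioned in this abstract are the item parameters of the graded response model. A minimal Python sketch of how they map to response-category probabilities follows; the function name and the parameter values are illustrative assumptions, not estimates from the SCI-QOL Stigma item bank.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one polytomous item under the graded response model.

    thresholds must be strictly increasing; an item with K categories has K - 1 of them.
    """
    if any(b2 <= b1 for b1, b2 in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative curves P(X >= k | theta), padded with 1 (lowest) and 0 (beyond highest).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Illustrative four-category item (three thresholds); values are made up.
probs = grm_category_probs(theta=0.0, slope=1.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

    A steeper slope concentrates the probability mass in fewer categories at a given trait level, which is what lets a computer adaptive test select the most informative items.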

  8. Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

    Science.gov (United States)

    Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

    2016-09-01

    The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
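
    The eigenvalue figures reported above can be unpacked with a short worked calculation in Python. The helper names are hypothetical, and the variance share assumes the eigenvalues come from a full correlation matrix of the 35 items, so that they sum to 35; the abstract does not state this.

```python
def second_eigenvalue(first, ratio):
    """Second eigenvalue implied by the first eigenvalue and the first-to-second ratio."""
    return first / ratio


def first_factor_share(first, n_items):
    """Share of total variance on the first factor, assuming eigenvalues sum to n_items."""
    return first / n_items


# Figures taken from the abstract: first eigenvalue, ratio, and item count.
first, ratio, n_items = 24.34, 12.4, 35
print(f"implied second eigenvalue: {second_eigenvalue(first, ratio):.2f}")
print(f"first-factor variance share: {first_factor_share(first, n_items):.1%}")
```

    Under that assumption the first factor carries roughly 70% of the variance, which is consistent with the abstract's description of a dominant first factor.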

  9. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

    NARCIS (Netherlands)

    Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

    2017-01-01

    Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study

  10. Structural Validation of a French Food Frequency Questionnaire of 94 Items.

    Science.gov (United States)

    Gazan, Rozenn; Vieux, Florent; Darmon, Nicole; Maillot, Matthieu

    2017-01-01

    Food frequency questionnaires (FFQs) are used to estimate the usual food and nutrient intakes over a period of time. Such estimates can suffer from measurement errors, either due to bias induced by respondent's answers or to errors induced by the structure of the questionnaire (e.g., using a limited number of food items and an aggregated food database with average portion sizes). The "structural validation" presented in this study aims to isolate and quantify the impact of the inherent structure of a FFQ on the estimation of food and nutrient intakes, independently of respondent's perception of the questionnaire. A semi-quantitative FFQ (n = 94 items, including 50 items with questions on portion sizes) and an associated aggregated food composition database (named the item-composition database) were developed, based on the self-reported weekly dietary records of 1918 adults (18-79 years-old) in the French Individual and National Dietary Survey 2 (INCA2), and the French CIQUAL 2013 food-composition database of all the foods (n = 1342 foods) declared as consumed in the population. Reference intakes of foods ("REF_FOOD") and nutrients ("REF_NUT") were calculated for each adult using the food-composition database and the amounts of foods self-reported in his/her dietary record. Then, answers to the FFQ were simulated for each adult based on his/her self-reported dietary record. "FFQ_FOOD" and "FFQ_NUT" intakes were estimated using the simulated answers and the item-composition database. Measurement errors (in %), Spearman correlations and cross-classification were used to compare "REF_FOOD" with "FFQ_FOOD" and "REF_NUT" with "FFQ_NUT". Compared to "REF_NUT," "FFQ_NUT" total quantity and total energy intake were underestimated on average by 198 g/day and 666 kJ/day, respectively. "FFQ_FOOD" intakes were well estimated for starches, underestimated for most of the subgroups, and overestimated for some subgroups, in particular vegetables. Underestimation were

  11. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    Science.gov (United States)

    Lynch, Mervin D.; Chaves, John

    Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs, idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  12. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples...... are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules...... additive in costs....

  13. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...

  14. Measuring Integration of Information and Communication Technology in Education: An Item Response Modeling Approach

    Science.gov (United States)

    Peeraer, Jef; Van Petegem, Peter

    2012-01-01

    This research describes the development and validation of an instrument to measure integration of Information and Communication Technology (ICT) in education. After literature research on definitions of integration of ICT in education, a comparison is made between the classical test theory and the item response modeling approach for the…

  15. Measuring sport experiences in children and youth to better understand the impact of sport on health and positive youth development: designing a brief measure for population health surveys.

    Science.gov (United States)

    Cairney, John; Clark, Heather J; Kwan, Matthew Y W; Bruner, Mark; Tamminen, Katherine

    2018-04-03

    Despite the proliferation of studies examining youth sport participation, there are significant gaps in knowledge regarding the impact of youth sport participation on health and development. These gaps are not new, but have persisted due to limitations with how sport participation is measured. Much of the research to date has measured sport participation as binary (yes/no) or count measures. This has been especially true in survey-based research. Yet, at the same time, research has investigated youths' experiences in sport such as the influence of coaches, teammates, and parents. The ability to measure these experiences is constrained by the need to use a number of measures along with gaps in the content covered in existing measures. We propose to develop and test the Sport Experiences Measure: Children and Youth (SEM:CY) as a population survey-based measure that captures the salient aspects of youths' experience in sport. The SEM:CY will be developed and tested across three phases. Phase I includes qualitative research with members of the sport community and engagement with an expert group to generate and obtain feedback on the initial item pool. In Phase II, we will recruit two consecutive samples of students from schools to complete the draft measure. Analysis will focus on assessing the items and factor structure of the measure. Factor structure will be assessed first with exploratory factor analysis and then confirmatory factor analysis. In Phase III, we will test the association between the SEM:CY with a measure of perceived competence, sport anxiety, and positive youth development to assess construct validity. We will also examine whether the structure of the measure varies by age or gender. The SEM:CY measure will provide a meaningful contribution to the measurement and understanding of youth sport participation. The SEM:CY can be used as a stand-alone measure to understand youth experiences in sport programs, or in combination with other health and development

  16. Measuring Instrument Constructs of Return Factors for Green Office Building Investments Variables Using Rasch Measurement Model

    Directory of Open Access Journals (Sweden)

    Isa Mona

    2016-01-01

    This paper is a preliminary study on rationalising green office building investments in Malaysia. The paper aims to introduce the application of Rasch measurement model analysis to determine the validity and reliability of each construct in the questionnaire. To achieve this objective, a questionnaire survey consisting of 6 sections was developed, and a total of 106 responses were received from various investors who own and lease office buildings in Kuala Lumpur. The Rasch Measurement analysis is used to measure the quality control of item constructs in the instrument by measuring the specific objectivity within the same dimension, to reduce ambiguous measures, and a realistic estimation of precision and implicit quality. The Rasch analysis consists of the summary statistics, item unidimensionality and item measures. The results show that the item and respondent (person) reliability are 0.91 and 0.95, respectively.

  17. Measuring anxiety after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Anxiety item bank and linkage with GAD-7.

    Science.gov (United States)

    Kisala, Pamela A; Tulsky, David S; Kalpakjian, Claire Z; Heinemann, Allen W; Pohlig, Ryan T; Carle, Adam; Choi, Seung W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test to assess anxiety symptoms in individuals with spinal cord injury (SCI), transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a statistical linkage with the Generalized Anxiety Disorder (GAD)-7, a widely used anxiety measure. Grounded-theory based qualitative item development methods; large-scale item calibration field testing; confirmatory factor analysis; graded response model item response theory analyses; statistical linking techniques to transform scores to a PROMIS metric; and linkage with the GAD-7. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Spinal Cord Injury-Quality of Life (SCI-QOL) Anxiety Item Bank. Seven hundred sixteen individuals with traumatic SCI completed 38 items assessing anxiety, 17 of which were PROMIS items. After 13 items (including 2 PROMIS items) were removed, factor analyses confirmed unidimensionality. Item response theory analyses were used to estimate slopes and thresholds for the final 25 items (15 from PROMIS). The observed Pearson correlation between the SCI-QOL Anxiety and GAD-7 scores was 0.67. The SCI-QOL Anxiety item bank demonstrates excellent psychometric properties and is available as a computer adaptive test or short form for research and clinical applications. SCI-QOL Anxiety scores have been transformed to the PROMIS metric and we provide a method to link SCI-QOL Anxiety scores with those of the GAD-7.

  18. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.
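
    The idea of an item whose difficulty depends on covariates can be sketched in Python with a single split on one covariate. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function names, the covariate, the split point, and the size of the difficulty shift are all hypothetical, and the permutation-test step that selects splits is not shown.

```python
import math


def rasch_prob(theta, difficulty):
    """Rasch model: probability of solving an item given ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def dif_item_prob(theta, base_difficulty, shift, covariate, split_point):
    """Rasch probability for an item whose difficulty changes above a covariate split.

    Persons with covariate <= split_point face base_difficulty; all others face
    base_difficulty + shift. This mimics one node of an item-specific tree.
    """
    difficulty = base_difficulty + (shift if covariate > split_point else 0.0)
    return rasch_prob(theta, difficulty)


# Two hypothetical persons with equal ability but different ages.
for age in (30, 50):
    p = dif_item_prob(theta=0.0, base_difficulty=0.0, shift=0.5,
                      covariate=age, split_point=40)
    print(f"age {age}: P(solve) = {p:.3f}")
```

    An item without DIF corresponds to a tree with no split, where every person of the same ability has the same probability of solving it.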

  19. The Laboratory Course Assessment Survey: A Tool to Measure Three Dimensions of Research-Course Design

    Science.gov (United States)

    Corwin, Lisa A.; Runyon, Christopher; Robinson, Aspen; Dolan, Erin L.

    2015-01-01

    Course-based undergraduate research experiences (CUREs) are increasingly being offered as scalable ways to involve undergraduates in research. Yet few if any design features that make CUREs effective have been identified. We developed a 17-item survey instrument, the Laboratory Course Assessment Survey (LCAS), that measures students’ perceptions of three design features of biology lab courses: 1) collaboration, 2) discovery and relevance, and 3) iteration. We assessed the psychometric properties of the LCAS using established methods for instrument design and validation. We also assessed the ability of the LCAS to differentiate between CUREs and traditional laboratory courses, and found that the discovery and relevance and iteration scales differentiated between these groups. Our results indicate that the LCAS is suited for characterizing and comparing undergraduate biology lab courses and should be useful for determining the relative importance of the three design features for achieving student outcomes. PMID:26466990

  20. Safety climate in Swiss hospital units: Swiss version of the Safety Climate Survey

    Science.gov (United States)

    Gehring, Katrin; Mascherek, Anna C.; Bezzola, Paula

    2015-01-01

    Rationale, aims and objectives: Safety climate measurements are a broadly used element of improvement initiatives. In order to provide a sound and easy‐to‐administer instrument for use in Swiss hospitals, we translated the Safety Climate Survey into German and French. Methods: After translating the Safety Climate Survey into French and German, a cross‐sectional survey study was conducted with health care professionals (HCPs) in operating room (OR) teams and on OR‐related wards in 10 Swiss hospitals. Validity of the instrument was examined by means of Cronbach's alpha and missing rates of the single items. Item‐descriptive statistics, group differences and percentage of ‘problematic responses’ (PPR) were calculated. Results: 3153 HCPs completed the survey (response rate: 63.4%). 1308 individuals were excluded from the analyses because of a profession other than doctor or nurse or invalid answers (analysed sample: n = 1845; nurses = 1321, doctors = 523). Internal consistency of the translated Safety Climate Survey was good (Cronbach's alpha German = 0.86; Cronbach's alpha French = 0.84). Missing rates at item level were rather low (0.23–4.3%). We found significant group differences in safety climate values regarding profession, managerial function, work area and time spent in direct patient care. At item level, 14 out of 21 items showed a PPR higher than 10%. Conclusions: Results indicate that the French and German translations of the Safety Climate Survey might be a useful measurement instrument for safety climate in Swiss hospital units. Analyses at item level allow for differentiating facets of safety climate into more positive and critical safety climate aspects. PMID:25656302
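    Cronbach's alpha, used above as the internal-consistency measure, can be sketched in a few lines. This is a minimal illustration, assuming complete responses with no missing values; the function name and the toy data are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    items = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: 5 respondents answering 3 items on a 1-5 scale.
toy = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy)
```

    Alpha rises when items vary together, because the variance of the total score then grows faster than the sum of the individual item variances.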

  1. A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

    Science.gov (United States)

    Khawaja, Nigar G.; Yu, Lai Ngo Heidi

    2010-01-01

    The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…

  2. Readability and Comprehension of the Geriatric Depression Scale and PROMIS® Physical Function Items in Older African Americans and Latinos.

    Science.gov (United States)

    Paz, Sylvia H; Jones, Loretta; Calderón, José L; Hays, Ron D

    2017-02-01

    Depression and physical function are particularly important health domains for the elderly. The Geriatric Depression Scale (GDS) and the Patient-Reported Outcomes Measurement Information System (PROMIS®) physical function item bank are two surveys commonly used to measure these domains. It is unclear if these two instruments adequately measure these aspects of health in minority elderly. The aim of this study was to estimate the readability of the GDS and PROMIS® physical function items and to assess their comprehensibility using a sample of African American and Latino elderly. Readability was estimated using the Flesch-Kincaid and Flesch Reading Ease (FRE) formulae for English versions, and a Spanish adaptation of the FRE formula for the Spanish versions. Comprehension of the GDS and PROMIS® items by minority elderly was evaluated with 30 cognitive interviews. Readability estimates of a number of items in English and Spanish of the GDS and PROMIS® physical functioning items exceeded the U.S. recommended 5th-grade threshold for vulnerable populations, or were rated as 'fairly difficult', 'difficult', or 'very difficult' to read. Cognitive interviews revealed that many participants felt that more than the two (yes/no) GDS response options were needed to answer the questions. Wording of several PROMIS® items was considered confusing, and interpreting responses was problematic because they were based on using physical aids. Problems with item wording and response options of the GDS and PROMIS® physical function items may reduce reliability and validity of measurement when used with minority elderly.
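    The two English readability formulae named above are simple functions of word, sentence and syllable counts. The sketch below uses the standard published coefficients; the function names are hypothetical, the counts are assumed to be supplied by the caller (syllable counting is the hard part and is not attempted here), and the Spanish adaptation is not shown.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a single short questionnaire item:
# 8 words, 1 sentence, 9 syllables.
fre = flesch_reading_ease(words=8, sentences=1, syllables=9)
grade = flesch_kincaid_grade(words=8, sentences=1, syllables=9)
```

    Both formulae penalise long sentences and polysyllabic words, so a short item can still score as hard to read if it relies on long clinical terms.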

  3. Measurement invariance across educational levels and gender in 12-item Zarit Burden Interview (ZBI) on caregivers of people with dementia.

    Science.gov (United States)

    Lin, Chung-Ying; Ku, Li-Jung Elizabeth; Pakpour, Amir H

    2017-11-01

    The Zarit Burden Interview (ZBI) is a commonly used self-report to assess caregiver burden. A 12-item short form of the ZBI has been developed; however, its measurement invariance has not been examined across some different demographics. It is unclear whether different genders and educational levels of a population interpret the ZBI items similarly. Therefore, this study aimed to examine the measurement invariance of the 12-item ZBI across gender and educational levels in a Taiwanese sample. Caregivers who had a family member with dementia (n = 270) completed the ZBI through telephone interviews. Three confirmatory factor analysis (CFA) models were conducted: Model 1 was the configural model, Model 2 constrained all factor loadings, Model 3 constrained all factor loadings and item intercepts. Multiple group CFAs and the differential item functioning (DIF) contrast under Rasch analyses were used to detect measurement invariance across males (n = 100) and females (n = 170) and across educational levels of junior high schools and below (n = 86) and senior high schools and above (n = 183). The fit index differences between models supported the measurement invariance across gender and across educational levels (∆ comparative fit index (CFI) = -0.010 and 0.003; ∆ root mean square error of approximation (RMSEA) = -0.006 to 0.004). No substantial DIF contrast was found across gender and educational levels (value = -0.36 to 0.29). The ZBI is appropriate for combined use and for comparisons in caregivers across gender and different educational levels in Taiwan.

  4. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    Science.gov (United States)

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  5. P values in display items are ubiquitous and almost invariably significant: A survey of top science journals.

    Science.gov (United States)

    Cristea, Ioana Alina; Ioannidis, John P A

    2018-01-01

    P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often containing the main results, are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and, respectively, in 1997. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles relying on P values in 2017 (Nature 68%, Science 48%, PNAS 38%). In their absence, almost all reported P values were statistically significant (98%, 95% CI 96% to 99%). Conversely, when any multiplicity corrections were described, 88% (95% CI 82% to 93%) of reported P values were statistically significant. Use of Bayesian methods was scant (2.5%) and rarely (0.7%) articles relied exclusively on Bayesian statistics. Overall, wider appreciation of the need for multiplicity corrections is a welcome evolution, but the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome.
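    The abstract does not say which multiplicity corrections the surveyed articles used. As an illustration only, the sketch below shows two common ones, Bonferroni and Holm, applied to a hypothetical set of P values; the function names and the numbers are assumptions, not data from the survey.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each p-value at or below alpha / m (m = number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: test sorted p-values against alpha / (m - rank)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical P values from one display item with four comparisons.
p_values = [0.001, 0.013, 0.04, 0.20]
uncorrected = [p <= 0.05 for p in p_values]
bonf = bonferroni(p_values)
holm_result = holm(p_values)
```

    In this toy case three of the four comparisons pass the uncorrected 0.05 threshold, two survive Holm and one survives Bonferroni, which is the direction of effect the abstract reports when corrections are applied.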

  6. Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: a comparison of worked examples.

    Science.gov (United States)

    Petrillo, Jennifer; Cano, Stefan J; McLeod, Lori D; Coon, Cheryl D

    2015-01-01

    To provide comparisons and a worked example of item- and scale-level evaluations based on three psychometric methods used in patient-reported outcome development, namely classical test theory (CTT), item response theory (IRT), and Rasch measurement theory (RMT), in an analysis of the National Eye Institute Visual Functioning Questionnaire (VFQ-25). Baseline VFQ-25 data from 240 participants with diabetic macular edema from a randomized, double-masked, multicenter clinical trial were used to evaluate the VFQ at the total score level. CTT, RMT, and IRT evaluations were conducted, and results were assessed in a head-to-head comparison. Results were similar across the three methods, with IRT and RMT providing more detailed diagnostic information on how to improve the scale. CTT led to the identification of two problematic items that threaten the validity of the overall scale score, sets of redundant items, and skewed response categories. IRT and RMT additionally identified poor fit for one item, many locally dependent items, poor targeting, and disordering of over half the response categories. Selection of a psychometric approach depends on many factors. Researchers should justify their evaluation method and consider the intended audience. If the instrument is being developed for descriptive purposes and on a restricted budget, a cursory examination of the CTT-based psychometric properties may be all that is possible. In a high-stakes situation, such as the development of a patient-reported outcome instrument for consideration in pharmaceutical labeling, however, a thorough psychometric evaluation including IRT or RMT should be considered, with final item-level decisions made on the basis of both quantitative and qualitative results. Copyright © 2015. Published by Elsevier Inc.

  7. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning

    DEFF Research Database (Denmark)

    Watt, Torquil; Grønvold, Mogens; Hegedüs, Laszlo

    2014-01-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis.

  8. What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling

    Science.gov (United States)

    Koller, Ingrid; Levenson, Michael R.; Glück, Judith

    2017-01-01

    The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis. PMID:28270777

  9. Measurement of smoking: surveys and some recommendations.

    Science.gov (United States)

    Shipley, R H; Rosen, T J; Williams, C

    1982-01-01

    A survey of smoking cessation researchers found considerable disagreement in the measurement procedures used to determine treatment outcome. The survey investigated (1) the duration of the measurement interval used to determine abstinence and smoking rate; (2) procedures for classifying people who smoke after treatment but are abstinent at follow-up; and (3) procedures for classifying people who use marijuana or tobacco products other than cigarettes. The marked disagreement among researchers' survey responses was compounded by the failure of their published articles to explain how smoking had been measured and scored. The Discussion identifies long-term abstinence as the most critical problem; its measurement was least consistent procedurally across studies yet most important for comparing them. Recommendations are made for establishing measurement and reporting conventions.

  10. Capturing and missing the patient's story through outcome measures: A thematic comparison of patient-generated items in PSYCHLOPS with CORE-OM and PHQ-9.

    Science.gov (United States)

    Sales, Célia Md; Neves, Inês Td; Alves, Paula G; Ashworth, Mark

    2017-11-22

    There is increasing interest in individualized patient-reported outcome measures (I-PROMS), where patients themselves indicate the specific problems they want to address in therapy and these problems are used as items within the outcome measurement tool. This paper examined the extent to which 279 items reported in an I-PROM (PSYCHLOPS) added qualitative information which was not captured by two well-established outcome measures (CORE-OM and PHQ-9). Comparison of items was only conducted for patients scoring above the "caseness" threshold on the standardized measures. 107 patients were participating in therapy within addiction and general psychiatric clinical settings. Almost every patient (95%) reported at least one item whose content was not covered by PHQ-9, and 71% reported at least one item not covered by CORE-OM. Results demonstrate the relevance of individualized outcome assessment for capturing data describing the issues of greatest concern to patients, as nomothetic measures do not always seem to capture the whole story. © 2017 The Authors Health Expectations Published by John Wiley & Sons Ltd.

  11. Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient and the use in practice is considered feasible with little administration time, offering standardized and routine patient monitoring. Before an item bank can be used as CAT, the psychometric properties of the item bank have to be examined. Therefore, the objective was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy. Cross-sectional study. 805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items). Unidimensionality was examined by Confirmatory Factor Analysis and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed. The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, first factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameters range: -4.28 to 2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI. The psychometric properties of the DF-PROMIS-PF item bank are sufficient. The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of

  12. A hierarchy of distress and invariant item ordering in the General Health Questionnaire-12.

    Science.gov (United States)

    Doyle, F; Watson, R; Morgan, K; McBride, O

    2012-06-01

    Invariant item ordering (IIO) is defined as the extent to which items have the same ordering (in terms of item difficulty/severity - i.e. demonstrating whether items are difficult [rare] or less difficult [common]) for each respondent who completes a scale. IIO is therefore crucial for establishing a scale hierarchy that is replicable across samples, but no research has demonstrated IIO in scales of psychological distress. We aimed to determine if a hierarchy of distress with IIO exists in a large general population sample who completed a scale measuring distress. Data from 4107 participants who completed the 12-item General Health Questionnaire (GHQ-12) from the Northern Ireland Health and Social Wellbeing Survey 2005-6 were analysed. Mokken scaling was used to determine the dimensionality and hierarchy of the GHQ-12, and items were investigated for IIO. All items of the GHQ-12 formed a single, strong unidimensional scale (H=0.58). IIO was found for six of the 12 items (H-trans=0.55), and these symptoms reflected the following hierarchy: anhedonia, concentration, participation, coping, decision-making and worthlessness. The cross-sectional analysis needs replication. The GHQ-12 showed a hierarchy of distress, but IIO is only demonstrated for six of the items, and the scale could therefore be shortened. Adopting brief, hierarchical scales with IIO may be beneficial in both clinical and research contexts. Copyright © 2011 Elsevier B.V. All rights reserved.

  13. Citizens' perceptions of political processes. A critical evaluation of preference consistency and survey items

    Directory of Open Access Journals (Sweden)

    Bengtsson, Åsa

    2012-12-01

    The current state of research does not tell us much about citizens’ expectations of political decision making. Most surveys allow respondents to evaluate how the current system is working, but do not inquire about alternative political decision-making procedures. The lack of established survey items can be explained by the fact that radical changes in decision-making procedures have been hard to envisage, but also by a general scepticism regarding people’s ability to form opinions on these matters. Political processes are, without doubt, complex matters that do not lend themselves very well to simplistic survey questions. Moreover, previous research has convincingly shown that most people in general have difficulties forming single, coherent and stable attitudes even towards far more straightforward political issues. In order to determine if trying to grasp attitudes towards political decision-making in future empirical studies can be considered a fruitful endeavour, this study sets out to critically assess the extent to which people express coherent preferences on these matters, and if preferences are in line with expectations in previous, rather scattered research. The study is based on the Finnish National Election Study 2011; a study which, contrary to most other election studies, includes a rich variety of survey items on the topic, and utilises a combination of strategies in order to explore patterns in the opinions held by citizens.


  14. Measuring health-related quality of life: psychometric evaluation of the Tunisian version of the SF-12 health survey.

    Science.gov (United States)

    Younsi, Moheddine; Chakroun, Mohamed

    2014-09-01

    The 12-item short-form health survey (SF-12) was developed as a shorter alternative to the SF-36 for use in large-scale studies as an applicable instrument for measuring health-related quality of life. The main purpose of this study was to evaluate the psychometric properties of the Tunisian version of the SF-12. A stratified representative sample (N = 3,582) of the general Tunisian population aged 18 years and over was interviewed. SF-12 summary scores were derived using the standard US algorithm. Factor analysis was used to confirm the hypothesized component structure of the SF-12 items. Reliability was estimated using internal consistency, and construct validity was investigated with "known groups" validity testing and via convergent and divergent validity. SF-12 summary scores distinguished well, and in the expected manner, between groups of respondents on the basis of gender, age, education and socioeconomic status, thus providing evidence of construct validity. Mean scores in the total sample were 50.11 (SD 8.53) for the physical component summary (PCS) score and 47.96 (SD 9.82) for the mental component summary (MCS) score. The results showed satisfactory internal consistency and acceptable convergent validity for both summary scores. Cronbach's α coefficient for PCS-12 and MCS-12 was 0.73 and 0.72, respectively. Known groups comparison showed that the SF-12 discriminated well between groups of respondents on the basis of gender, age, education and socioeconomic status. In addition, no floor or ceiling effects at baseline were observed. The PCA confirmed the two-factor structure of the SF-12 items. Items belonging to the physical component correlated more strongly with the PCS-12 than those with the MCS-12. Similarly, items belonging to the mental component correlated more strongly with the MCS-12 than those with the PCS-12. The findings suggest that the SF-12 appears to be a valid and reliable measure that can be used for measuring of population health

  15. A Multilevel Multidimensional Item Response Theory Model to Address the Role of Response Style on Measurement of Attitudes in PISA 2006

    Science.gov (United States)

    Lu, Yi

    2012-01-01

    Cross-national comparisons of responses to survey items are often affected by response style, particularly extreme response style (ERS). ERS varies across cultures, and has the potential to bias inferences in cross-national comparisons. For example, in both PISA and TIMSS assessments, it has been documented that when examined within countries,…

  16. Development and validation of Neonatal Satisfaction Survey--NSS-13.

    Science.gov (United States)

    Hagen, Inger H; Vadset, Tove B; Barstad, Johan; Svindseth, Marit F

    2015-06-01

    The purpose of this study was to develop and validate a survey to investigate parents' satisfaction with neonatal wards in a population of parents of children with a gestation age of ≥24 weeks to 3 months after full-term birth. We explored the literature and conducted three focus groups: two with expert health personnel and one with parents. We tested the survey in a parent population (N = 105) and report the different stages in the validation process along with the full survey, the Neonatal Satisfaction Survey - 13 categories (NSS-13). We found 13 subcategories in the Neonatal Satisfaction Survey. The subcategories measure parents' satisfaction with neonatal units based on staff, admission, nurses, anxiety, siblings (parents' perceptions of caring for the siblings of the newborn), information, timeout, doctors, facilities, nutrition, preparation for discharge, trust and visitors. Each subcategory showed acceptable internal consistency. The full version of the Neonatal Satisfaction Survey presents 69 items, and each subcategory contains two to eleven items. The Neonatal Satisfaction Survey seems suitable to measure parents' satisfaction with neonatal units and can be used in full, but it can also measure subcategories. Parents' satisfaction with neonatal units can be used to improve the quality in such wards. We consider this study as the first in a series to validate the NSS-13. The full survey with subcategories is presented in this paper. © 2014 Nordic College of Caring Science.

  17. The validity of the Satisfaction with Life Scale in adolescents and a comparison with single-item life satisfaction measures: a preliminary study.

    Science.gov (United States)

    Jovanović, Veljko

    2016-12-01

    The validity of the life satisfaction measures commonly used among adults has been rarely examined in adolescent samples. The present research had two main goals: (1) to evaluate the structural validity of the Satisfaction with Life Scale (SWLS) among adolescents and to test measurement invariance across gender; (2) to compare the criterion and convergent validity of the SWLS and single-item life satisfaction measures among adolescents. Three samples of Serbian adolescents were recruited for the present research. Study 1 (N = 481, mean age = 17.01 years) examined the structure of the SWLS via confirmatory factor analysis (CFA) and evaluated measurement invariance of the SWLS across gender by a multi-group CFA. Study 2 (N = 283, mean age = 17.34 years) and Study 3 (N = 220, mean age = 16.73 years) compared the convergent validity of the SWLS and single-item life satisfaction measures. The results of Study 1 supported the original one-factor model of the SWLS among adolescents and provided evidence for strong measurement invariance of the SWLS across gender. The findings of Study 2 and Study 3 showed that the SWLS and single-item measures were equally valid and strongly associated (r = .734 in Study 2 and r = .668 in Study 3). No substantial differences in correlations with school success and well-being indicators were found between the SWLS and single-item measures. Our findings support the use of the SWLS among adolescents and indicate that single-item life satisfaction measures perform as well as the SWLS in adolescent samples.

  18. Measuring job satisfaction among healthcare staff in the United States: a confirmatory factor analysis of the Satisfaction of Employees in Health Care (SEHC) survey.

    Science.gov (United States)

    Chang, Eva; Cohen, Julia; Koethe, Benjamin; Smith, Kevin; Bir, Anupa

    2017-04-01

    To validate the Satisfaction of Employees in Health Care (SEHC) survey with multidisciplinary, healthcare staff in the United States (U.S.). A cross-sectional psychometric study using confirmatory factor analysis. The original three-factor model was tested and modified using half-samples. Models were assessed using goodness-of-fit measures. Scale reliability and validity were tested with Cronbach's α coefficient and correlation of total SEHC score with two global satisfaction items, respectively. We administered a web-based survey from January to May 2015 to healthcare staff participating in initiatives aimed at delivering better care and reducing costs. The overall response rate was 38% (N = 1089), and respondents were from 86 healthcare projects. A total of 928 respondents completed the SEHC survey in full and were used in this study. Model fit of 18 SEHC items and total SEHC score. The mean SEHC score was 77.6 (SD: 19.0). A one-factor model of job satisfaction had high loadings on all items, and demonstrated adequate model fit (second half-sample RMSEA: 0.069). The scale demonstrated high reliability (Cronbach's alpha = 0.942) and validity (r = 0.77 and 0.76, both P job satisfaction construct. The scale has adequate reliability and validity to recommend its use to assess satisfaction among multidisciplinary, U.S. healthcare staff. Our findings suggest that this survey is a good candidate for reduction to a short-form, and future research should validate this survey in other healthcare populations. © The Author 2017. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

  19. Further Investigating Method Effects Associated with Negatively Worded Items on Self-Report Surveys

    Science.gov (United States)

    DiStefano, Christine; Motl, Robert W.

    2006-01-01

    This article used multitrait-multimethod methodology and covariance modeling for an investigation of the presence and correlates of method effects associated with negatively worded items on the Rosenberg Self-Esteem (RSE) scale (Rosenberg, 1989) using a sample of 757 adults. Results showed that method effects associated with negative item phrasing…

  20. Single-item measures for depression and anxiety: Validation of the Screening Tool for Psychological Distress in an inpatient cardiology setting.

    Science.gov (United States)

    Young, Quincy-Robyn; Nguyen, Michelle; Roth, Susan; Broadberry, Ann; Mackay, Martha H

    2015-12-01

    Depression and anxiety are common among patients with cardiovascular disease (CVD) and confer significant cardiac risk, contributing to CVD morbidity and mortality. Unfortunately, due to the lack of screening tools that address the specific needs of hospitalized patients, few cardiac inpatient programs offer routine screening for these forms of psychological distress, despite recommendations to do so. The purpose of this study was to validate single-item measures for depression and anxiety among cardiac inpatients. Consecutive inpatients were recruited from the cardiology and cardiac surgery step-down units at a university-affiliated, quaternary-care hospital. Subjects completed a questionnaire that included: (a) demographics, (b) single-item-measures for depression and anxiety (from the Screening Tool for Psychological Distress (STOP-D)), and (c) Hospital Anxiety and Depression Scale (HADS). One hundred and five participants were recruited with a wide variety of cardiac diagnoses, having a mean age of 66 years, and 28% were women. Both STOP-D items were highly correlated with their corresponding validated measures and demonstrated robust receiver-operator characteristic curves. Severity scores on both items correlated well with established severity cut-off scores on the corresponding subscales of the HADS. The STOP-D is a self-administered, self-report measure using two independent items that provide severity scores for depression and anxiety. The tool performs very well compared with other previously validated measures. Requiring no additional scoring and being free, STOP-D offers a simple and valid method for identifying hospitalized cardiac patients who are experiencing psychological distress. This crucial first step triggers initiation of appropriate monitoring and intervention, thus reducing the likelihood of the adverse cardiac outcomes associated with psychological distress. © The European Society of Cardiology 2014.

  1. Item response theory analysis of the mechanics baseline test

    Science.gov (United States)

    Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

    2012-02-01

    Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.

  2. Development and Validation of the 34-Item Disability Screening Questionnaire (DSQ-34) for Use in Low and Middle Income Countries Epidemiological and Development Surveys.

    Directory of Open Access Journals (Sweden)

    Jean-François Trani

    Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach's Alpha and within each domain using a standardized Cronbach's Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach's Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling adequacy, which was well above the accepted cutoff of 0.40 for India (0.82) and Nepal (0

  3. Development and Validation of the 34-Item Disability Screening Questionnaire (DSQ-34) for Use in Low and Middle Income Countries Epidemiological and Development Surveys.

    Science.gov (United States)

    Trani, Jean-François; Babulal, Ganesh Muneshwar; Bakhshi, Parul

    2015-01-01

    Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach's Alpha and within each domain using a standardized Cronbach's Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach's Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling adequacy, which was well above the accepted cutoff of 0.40 for India (0.82) and Nepal (0.82). The
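    The reliability statistics named in this abstract can be sketched as follows. The abstract does not give the formulas the authors used, so this is only an illustration under common textbook definitions: raw Cronbach's alpha from item and total-score variances, SEM as SD × sqrt(1 − reliability), and the 95% MDC as 1.96 × sqrt(2) × SEM. Function names and any input data are hypothetical.

```python
import math
from statistics import variance

def cronbach_alpha(scores):
    """Raw Cronbach's alpha.

    scores: list of respondents, each a list of k item scores (k >= 2).
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

def standard_error_of_measurement(sd, reliability):
    """SEM from the score SD and a reliability coefficient (e.g. an ICC)."""
    return sd * math.sqrt(1.0 - reliability)

def minimum_detectable_change_95(sem_value):
    """MDC at the 95% confidence level, assuming the usual 1.96 * sqrt(2) factor."""
    return 1.96 * math.sqrt(2.0) * sem_value
```

    The standardized alpha reported alongside the raw value would instead be computed from the mean inter-item correlation; it is omitted here to keep the sketch short.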

  4. Characterization of Disability in Canadians with Mental Disorders Using an Abbreviated Version of a DSM-5 Emerging Measure: The 12-Item WHO Disability Assessment Schedule (WHODAS) 2.0.

    Science.gov (United States)

    Sjonnesen, Kirsten; Bulloch, Andrew G M; Williams, Jeanne; Lavorato, Dina; B Patten, Scott

    2016-04-01

    The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) is a disability scale included in Section 3 of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) as a possible replacement for the Global Assessment of Functioning Scale (GAF). To assist Canadian psychiatrists with interpretation of the scale, we have conducted a descriptive analysis using data from the 2012 Canadian Community Health Survey-Mental Health component (CCHS-MH). The 2012 CCHS-MH was a cross-sectional survey of the Canadian community (n = 23,757). The survey included an abbreviated 12-item version of the WHODAS 2.0. Mental disorder diagnoses were assessed for schizophrenia, other psychosis, major depressive episode (MDE), generalized anxiety disorder (GAD), bipolar I disorder, substance abuse/dependence, and alcohol abuse/dependence. Mean scores ranged from 14.2 (95% CI, 14.1 to 14.3) for the overall community population to 23.1 (95% CI, 19.5 to 26.7) for those with schizophrenia, with higher scores indicating greater disability. Furthermore, the difference in scores between those with lifetime and past-month episodes suggests that the scale is sensitive to changes occurring during the course of these disorders; for example, scores varied from 23.6 (95% CI, 22.2 to 25.1) for past-month MDE to 14.4 (95% CI, 14.2 to 14.7) in the lifetime MDE group without a past-year episode. This analysis suggests that the WHODAS 2.0 may be a suitable replacement for the GAF. As a disability measure, even though it is not a mental health-specific instrument, the 12-item WHODAS 2.0 appears to be sensitive to the impact of mental disorders and to changes over the time course of a mental disorder. However, the clinical utility of this measure requires additional assessment. © The Author(s) 2016.

  5. Shortening a Patient Experiences Survey for Medical Homes

    Directory of Open Access Journals (Sweden)

    Judy H. Ng

    2015-12-01

    The Consumer Assessment of Healthcare Providers and Systems Patient-Centered Medical Home (CAHPS PCMH) Survey assesses patient experiences reflecting domains of care related to general patient experience (access to care, communication with providers, office staff interaction, provider rating) and PCMH-specific aspects of patient care (comprehensiveness of care, self-management support, shared decision making). The current work compares psychometric properties of the current survey and a proposed shortened version of the survey (from 52 to 26 adult survey items, from 66 to 31 child survey items). The revisions were based on initial psychometric analysis and stakeholder input regarding survey length concerns. A total of 268 practices voluntarily submitted adult surveys and 58 submitted child survey data to the National Committee for Quality Assurance in 2013. Mean unadjusted scores, practice-level item and composite reliability, and item-to-scale correlations were calculated. Results show that the shorter adult survey has lower reliability but still meets general definitions of a sound survey, and that shortening resulted in few changes to mean scores. The impact was more problematic for the pediatric version. Further testing is needed to investigate approaches to improving survey response and the relevance of survey items in informing quality improvement.

  6. Contamination of clothing and other items by sweat during exercise 201Tl myocardial perfusion scintigraphy

    International Nuclear Information System (INIS)

    Yokoo, Shigeki; Niio, Yasuo; Yamamoto, Tomoaki; Miyashita, Makoto

    1999-01-01

    We measured the radioactivity on patients' upper and lower garments, towels, broad sashes for the bust, and electrodes contaminated by sweat during exercise 201Tl myocardial perfusion scintigraphy. Activity was measured with a scintillation survey meter adjusted to the energy of 201Tl. For clothing, more than 4 Bq/cm² was considered to be a significant level of contamination. We detected contamination in 30% of upper garments and towels, 19% of broad sashes, 8% of lower garments and 4% of electrodes. Among these materials, several items of clothing and other items showed contamination exceeding 40 Bq/cm². Towels were remarkably contaminated, with one towel showing a maximum contamination level of 420 Bq/cm². Examinations done by exercise 201Tl myocardial perfusion scintigraphy often result in the contamination of clothing and other items through sweating. This contamination is especially common in summer, particularly in upper garments and towels. The contamination ratio for towels was over 50%. The contamination ratio increased as the level of exercise became more difficult. When the exercise load was more than 100 W, the contamination ratio was 50%. In cases of extreme contamination, images of contaminated upper garments could be obtained by the scintigraphy camera. The areas of high activity on the images seemed to correspond to areas of the body where sweating was profuse. Based on these results, we should pay close attention to the handling of clothing and other items used in exercise testing by 201Tl myocardial perfusion scintigraphy and the points used in measuring contaminated clothing and other items after testing. (author)

  7. Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form.

    Science.gov (United States)

    Kisala, Pamela A; Victorson, David; Pace, Natalie; Heinemann, Allen W; Choi, Seung W; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. A total of 716 individuals with SCI completed the trauma items. The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available.

  8. Structural Validation of a French Food Frequency Questionnaire of 94 Items

    Directory of Open Access Journals (Sweden)

    Rozenn Gazan

    2017-12-01

    Background: Food frequency questionnaires (FFQs) are used to estimate the usual food and nutrient intakes over a period of time. Such estimates can suffer from measurement errors, either due to bias induced by respondents’ answers or to errors induced by the structure of the questionnaire (e.g., using a limited number of food items and an aggregated food database with average portion sizes). The “structural validation” presented in this study aims to isolate and quantify the impact of the inherent structure of a FFQ on the estimation of food and nutrient intakes, independently of respondents’ perception of the questionnaire. Methods: A semi-quantitative FFQ (n = 94 items, including 50 items with questions on portion sizes) and an associated aggregated food composition database (named the item-composition database) were developed, based on the self-reported weekly dietary records of 1918 adults (18–79 years old) in the French Individual and National Dietary Survey 2 (INCA2), and the French CIQUAL 2013 food-composition database of all the foods (n = 1342 foods) declared as consumed in the population. Reference intakes of foods (“REF_FOOD”) and nutrients (“REF_NUT”) were calculated for each adult using the food-composition database and the amounts of foods self-reported in his/her dietary record. Then, answers to the FFQ were simulated for each adult based on his/her self-reported dietary record. “FFQ_FOOD” and “FFQ_NUT” intakes were estimated using the simulated answers and the item-composition database. Measurement errors (in %), Spearman correlations and cross-classification were used to compare “REF_FOOD” with “FFQ_FOOD” and “REF_NUT” with “FFQ_NUT”. Results: Compared to “REF_NUT,” “FFQ_NUT” total quantity and total energy intake were underestimated on average by 198 g/day and 666 kJ/day, respectively. “FFQ_FOOD” intakes were well estimated for starches, underestimated for most of the subgroups, and

  9. Self-Report Measures of the Home Learning Environment in Large Scale Research: Measurement Properties and Associations with Key Developmental Outcomes

    Science.gov (United States)

    Niklas, Frank; Nguyen, Cuc; Cloney, Daniel S.; Tayler, Collette; Adams, Raymond

    2016-01-01

    Favourable home learning environments (HLEs) support children's literacy, numeracy and social development. In large-scale research, HLE is typically measured by self-report survey, but there is little consistency between studies and many different items and latent constructs are observed. Little is known about the stability of these items and…

  10. The patient safety climate in healthcare organizations (PSCHO) survey: Short-form development.

    Science.gov (United States)

    Benzer, Justin K; Meterko, Mark; Singer, Sara J

    2017-08-01

    Measures of safety climate are increasingly used to guide safety improvement initiatives. However, cost and respondent burden may limit the use of safety climate surveys. The purpose of this study was to develop a 15- to 20-item safety climate survey based on the Patient Safety Climate in Healthcare Organizations survey, a well-validated 38-item measure of safety climate. The Patient Safety Climate in Healthcare Organizations was administered to all senior managers, all physicians, and a 10% random sample of all other hospital personnel in 69 private sector hospitals and 30 Veterans Health Administration hospitals. Both samples were randomly divided into a derivation sample to identify a short-form subset and a confirmation sample to assess the psychometric properties of the proposed short form. The short form consists of 15 items representing 3 overarching domains in the long-form scale: organization, work unit, and interpersonal. The proposed short form efficiently captures 3 important sources of variance in safety climate: organizational, work-unit, and interpersonal. The short-form development process was a practical method that can be applied to other safety climate surveys. This safety climate short form may increase response rates in studies that involve busy clinicians or repeated measures. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  11. A survey of temperature measurement

    International Nuclear Information System (INIS)

    Saltvold, J.R.

    1976-03-01

    Many different techniques for measuring temperature have been surveyed and are discussed. The concept of temperature and the physical phenomena used in temperature measurement are also discussed. Extensive tables are presented in which the range and accuracy of the various techniques and other related data are included. (author)

  12. Patients With Thumb Carpometacarpal Arthritis Have Quantifiable Characteristic Expectations That Can Be Measured With a Survey.

    Science.gov (United States)

    Kang, Lana; Hashmi, Sohaib Z; Nguyen, Joseph; Lee, Steve K; Weiland, Andrew J; Mancuso, Carol A

    2016-01-01

    Although patient expectations associated with major orthopaedic conditions have shown clinically relevant and variable effects on outcomes, expectations associated with thumb carpometacarpal (CMC) arthritis have not been identified, described, or analyzed before, to our knowledge. We asked: (1) Do patients with thumb CMC arthritis express characteristic expectations that are quantifiable and have measurable frequency? (2) Can a survey on expectations developed from patient-derived data quantitate expectations in patients with thumb CMC arthritis? The study was a prospective cohort study. The first phase was a 12-month-period involving interviews of 42 patients with thumb CMC arthritis to define their expectations of treatment. The interview process used techniques and principles of qualitative methodology including open-ended interview questions, unrestricted time, and study size determined by data saturation. Verbatim responses provided content for the draft survey. The second phase was a 12-month period assessing the survey for test-retest reliability with the recruitment of 36 participants who completed the survey twice. The survey was finalized from clinically relevant content, frequency of endorsement, weighted kappa values for concordance of responses, and intraclass coefficient and Cronbach's alpha for interrater reliability and internal consistency. Thirty-two patients volunteered 256 characteristic expectations, which consisted of 21 discrete categories. Expectations with similar concepts were combined by eliminating redundancy while maintaining original terminology. These were reduced to 19 items that comprised a one-page survey. This survey showed high concordance, interrater reliability, and internal consistency, with weighted kappa values between 0.58 and 0.78 (95% CI, 0.39-0.78; p Patients with thumb CMC arthritis volunteer a characteristic and quantifiable set of expectations. Using responses recorded verbatim from patient interviews, a clinically

  13. Students' approaches to learning in a clinical practicum: A psychometric evaluation based on item response theory.

    Science.gov (United States)

    Zhao, Yue; Kuan, Hoi Kei; Chung, Joyce O K; Chan, Cecilia K Y; Li, William H C

    2018-07-01

    The investigation of learning approaches in the clinical workplace context has remained an under-researched area. Despite the validation of learning approach instruments and their applications in various clinical contexts, little is known about the extent to which an individual item, that reflects a specific learning strategy and motive, effectively contributes to characterizing students' learning approaches. This study aimed to measure nursing students' approaches to learning in a clinical practicum using the Approaches to Learning at Work Questionnaire (ALWQ). Survey research design was used in the study. A sample of year 3 nursing students (n = 208) who undertook a 6-week clinical practicum course participated in the study. Factor analyses were conducted, followed by an item response theory analysis, including model assumption evaluation (unidimensionality and local independence), item calibration and goodness-of-fit assessment. Two subscales, deep and surface, were derived. Findings suggested that: (a) items measuring the deep motive from intrinsic interest and deep strategies of relating new ideas to similar situations, and that of concept mapping served as the strongest discriminating indicators; (b) the surface strategy of memorizing facts and details without an overall picture exhibited the highest discriminating power among all surface items; and, (c) both subscales appeared to be informative in assessing a broad range of the corresponding latent trait. The 21-item ALWQ derived from this study presented an efficient, internally consistent and precise measure. Findings provided a useful psychometric evaluation of the ALWQ in the clinical practicum context, added evidence to the utility of the ALWQ for nursing education practice and research, and echoed the discussions from previous studies on the role of the contextual factors in influencing student choices of different learning strategies. They provided insights for clinical educators to measure

  14. The Deaf Acculturation Scale (DAS): Development and Validation of a 58-Item Measure

    Science.gov (United States)

    Maxwell-McCaw, Deborah; Zea, Maria Cecilia

    2011-01-01

    This study involved the development and validation of the Deaf Acculturation Scale (DAS), a new measure of cultural identity for Deaf and hard-of-hearing (hh) populations. Data for this study were collected online and involved a nation-wide sample of 3,070 deaf/hh individuals. Results indicated strong internal reliabilities for all the subscales, and construct validity was established by demonstrating that the DAS could discriminate groups based on parental hearing status, school background, and use of self-labels. Construct validity was further demonstrated through factorial analyses, and findings resulted in a final 58-item measure. Directions for future research are discussed. PMID:21263041

  15. Using Item Response Theory to Develop a 60-Item Representation of the NEO PI-R Using the International Personality Item Pool: Development of the IPIP-NEO-60.

    Science.gov (United States)

    Maples-Keller, Jessica L; Williamson, Rachel L; Sleep, Chelsea E; Carter, Nathan T; Campbell, W Keith; Miller, Joshua D

    2017-10-31

    Given advantages of freely available and modifiable measures, an increase in the use of measures developed from the International Personality Item Pool (IPIP), including the 300-item representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a) has occurred. The focus of this study was to use item response theory to develop a 60-item, IPIP-based measure of the Five-Factor Model (FFM) that provides equal representation of the FFM facets and to test the reliability and convergent and criterion validity of this measure compared to the NEO Five Factor Inventory (NEO-FFI). In an undergraduate sample (n = 359), scores from the NEO-FFI and IPIP-NEO-60 demonstrated good reliability and convergent validity with the NEO PI-R and IPIP-NEO-300. Additionally, across criterion variables in the undergraduate sample as well as a community-based sample (n = 757), the NEO-FFI and IPIP-NEO-60 demonstrated similar nomological networks across a wide range of external variables (r ICC = .96). Finally, as expected, in an MTurk sample the IPIP-NEO-60 demonstrated advantages over the Big Five Inventory-2 (Soto & John, 2017; n = 342) with regard to the Agreeableness domain content. The results suggest strong reliability and validity of the IPIP-NEO-60 scores.

  16. Projective Item Response Model for Test-Independent Measurement

    Science.gov (United States)

    Ip, Edward Hak-Sing; Chen, Shyh-Huei

    2012-01-01

    The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…

  17. Development and validation of a preference weight multiattribute health outcome measure for rheumatoid arthritis.

    Science.gov (United States)

    Chiou, Chiun-Fang; Suarez-Almazor, Maria E; Sherbourne, Cathy D; Chang, Chih-Hung; Reyes, Carolina; Dylan, Michelle; Ofman, Joshua; Wallace, Daniel J; Mizutani, Wesley; Weisman, Michael

    2006-12-01

    To develop and validate multiattribute measures for patients with rheumatoid arthritis (RA) to report health states and estimate preference weights. Survey materials were mailed to 748 patients. Factor analysis, an item response theory-based model, and an internal consistency test were used to identify attributes and evaluate items. Two multiattribute preference weight functions (MAPWF) were constructed. Construct validity of the new measures was then tested. Four hundred eighty-seven patients returned the survey; 24 items on 6 health attributes were selected to form the new outcomes measure. Two MAPWF were derived with preference weights measured with time tradeoff and visual analog scales as dependent variables. All validity test results were statistically significant. Our results reveal that the new measures are reliable and valid in assessing health states and associated preference weights of patients with RA.

  18. Measuring Corporate Social Responsibility in Gambling Industry: Multi-Items Stakeholder Based Scales

    Directory of Open Access Journals (Sweden)

    Jian Ming Luo

    2017-11-01

    Macau gambling companies included Corporate Social Responsibility (CSR) information in their annual reports and websites as a marketing tool. Responsible Gambling (RG) had been a recurring issue in Macau’s chief executive report since 2007 and in many of the major gambling operators’ annual reports. The purpose of this study was to develop a measurement scale on CSR activities in Macau. Items on the measurement scale were based on qualitative research with data collected from employees in Macau’s gambling industry and academic literature. First and Second Order confirmatory factor analysis (CFA) were used to verify the reliability and validity of the measurement scale. The results of this study were satisfactory and were supported by empirical evidence. This study provided recommendations to gambling stakeholders, including practitioners, government officers, customers and shareholders, and implications to promote CSR practice in Macau gambling industry.

  19. On Studying Common Factor Dominance and Approximate Unidimensionality in Multicomponent Measuring Instruments with Discrete Items

    Science.gov (United States)

    Raykov, Tenko; Marcoulides, George A.

    2018-01-01

    This article outlines a procedure for examining the degree to which a common factor may be dominating additional factors in a multicomponent measuring instrument consisting of binary items. The procedure rests on an application of the latent variable modeling methodology and accounts for the discrete nature of the manifest indicators. The method…

  20. Item Information in the Rasch Model

    NARCIS (Netherlands)

    Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

    1988-01-01

    Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling

  1. Measuring and exposures from National Media Surveys

    DEFF Research Database (Denmark)

    Mortensen, Peter Stendahl

    2000-01-01

    National media surveys inform about the number and kind of people being exposed to the media in question. This paper discusses to what extent these numbers may be used as measures for the exposure to ads in the media in question. In this context attention is also focussed on elements in the media surveys themselves that might invalidate or give unreliable measures, both when measuring a single exposure and accumulated exposures. Four media types will be discussed: TV, radio, print and the internet.

  2. Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory.

    Science.gov (United States)

    Fajrianthi; Zein, Rizqy Amelia

    2017-01-01

    This study aimed to develop an emotional intelligence (EI) test that is suitable to the Indonesian workplace context. Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: 1) emotional appraisal, 2) emotional recognition, and 3) emotional regulation. TKEA consisted of 120 items with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 for subset 2 (ability level = -2), and 2.398 for subset 3 (level of ability = -2). It is concluded that TKEA performs very well to measure individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA's item analysis and dimensionality test of each TKEA subset.

  3. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    Science.gov (United States)

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  4. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  5. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  6. Using the LOINC Semantic Structure to Integrate Community-based Survey Items into a Concept-based Enterprise Data Dictionary to Support Comparative Effectiveness Research.

    Science.gov (United States)

    Co, Manuel C; Boden-Albala, Bernadette; Quarles, Leigh; Wilcox, Adam; Bakken, Suzanne

    2012-01-01

    In designing informatics infrastructure to support comparative effectiveness research (CER), it is necessary to implement approaches for integrating heterogeneous data sources such as clinical data typically stored in clinical data warehouses and those that are normally stored in separate research databases. One strategy to support this integration is the use of a concept-oriented data dictionary with a set of semantic terminology models. The aim of this paper is to illustrate the use of the semantic structure of Clinical LOINC (Logical Observation Identifiers, Names, and Codes) in integrating community-based survey items into the Medical Entities Dictionary (MED) to support the integration of survey data with clinical data for CER studies.

  7. Work ability as prognostic risk marker of disability pension: Single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; Rhenen, van W.; Groothoff, J.W.; Klink, van der J.J.L.; Twisk, W.R.; Heymans, M.W.

    2014-01-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP.

  8. Measuring social science concepts in pharmacy education research: From definition to item analysis of self-report instruments.

    Science.gov (United States)

    Cor, M Ken

    Interpreting results from quantitative research can be difficult when measures of concepts are constructed poorly, something that can limit measurement validity. Social science steps for defining concepts, guidelines for limiting construct-irrelevant variance when writing self-report questions, and techniques for conducting basic item analysis are reviewed to inform the design of instruments to measure social science concepts in pharmacy education research. Based on a review of the literature, four main recommendations emerge: (1) employ a systematic process of conceptualization to derive nominal definitions; (2) write exact and detailed operational definitions for each concept; (3) when creating self-report questionnaires, write statements and select scales to avoid introducing construct-irrelevant variance (CIV); and (4) use basic item analysis results to inform instrument revision. Employing the recommendations that emerge from this review will strengthen arguments to support measurement validity, which in turn will support the defensibility of study finding interpretations. An example from pharmacy education research is used to contextualize the concepts introduced. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Comparison of Self-Reported Telephone Interviewing and Web-Based Survey Responses: Findings From the Second Australian Young and Well National Survey.

    Science.gov (United States)

    Milton, Alyssa C; Ellis, Louise A; Davenport, Tracey A; Burns, Jane M; Hickie, Ian B

    2017-09-26

    Web-based self-report surveying has increased in popularity, as it can rapidly yield large samples at a low cost. Despite this increase in popularity, in the area of youth mental health, there is a distinct lack of research comparing the results of Web-based self-report surveys with the more traditional and widely accepted computer-assisted telephone interviewing (CATI). The Second Australian Young and Well National Survey 2014 sought to compare differences in respondent response patterns using matched items on CATI versus a Web-based self-report survey. The aim of this study was to examine whether responses varied as a result of item sensitivity, that is, the item's susceptibility to exaggeration or underreporting, and to assess whether certain subgroups demonstrated this effect to a greater extent. A subsample of young people aged 16 to 25 years (N=101), recruited through the Second Australian Young and Well National Survey 2014, completed the identical items on two occasions: via CATI and via Web-based self-report survey. Respondents also rated perceived item sensitivity. When comparing CATI with the Web-based self-report survey, a Wilcoxon signed-rank analysis showed that respondents answered 14 of the 42 matched items in a significantly different way. Significant variation in responses (CATI vs Web-based) was more frequent if the item was also rated by the respondents as highly sensitive in nature. Specifically, 63% (5/8) of the high sensitivity items, 43% (3/7) of the neutral sensitivity items, and 0% (0/4) of the low sensitivity items were answered in a significantly different manner by respondents when comparing their matched CATI and Web-based question responses. The items that were perceived as highly sensitive by respondents and demonstrated response variability included the following: sexting activities, body image concerns, experience of diagnosis, and suicidal ideation. For high sensitivity items, a regression analysis showed respondents who were male

  10. P2-19: The Effect of Item Repetition on Item-Context Association Depends on the Prior Exposure of Items

    Directory of Open Access Journals (Sweden)

    Hongmi Lee

    2012-10-01

    Previous studies have reported conflicting findings on whether item repetition has beneficial or detrimental effects on source memory. To reconcile such contradictions, we investigated whether the degree of pre-exposure of items can be a potential modulating factor. The experimental procedures spanned two consecutive days. On Day 1, participants were exposed to a set of unfamiliar faces. On Day 2, the same faces presented on the previous day were used again for half of the participants, whereas novel faces were used for the other half. Day 2 procedures consisted of three successive phases: item repetition, source association, and source memory test. In the item repetition phase, half of the face stimuli were repeatedly presented while participants were making male/female judgments. During the source association phase, both the repeated and the unrepeated faces appeared in one of the four locations on the screen. Finally, participants were tested on the location in which a given face was presented during the previous phase and reported the confidence of their memory. Source memory accuracy was measured as the percentage of correct non-guess trials. We found a significant interaction between prior exposure and repetition. Repetition impaired source memory when the items had been pre-exposed on Day 1, while it led to greater accuracy for novel ones. These results show that pre-experimental exposure can modulate the effects of repetition on associative binding between an item and its contextual information, suggesting that pre-existing representation and novelty signal interact to form new episodic memory.

  11. Measuring Experiential Avoidance: Reliability and Validity of the Dutch 9-item Acceptance and Action Questionnaire (AAQ)

    NARCIS (Netherlands)

    Boelen, P.A.; Reijntjes, A.H.A.

    2008-01-01

    Three studies evaluated psychometric properties of the Dutch version of the 9-item Acceptance and Action Questionnaire (AAQ), a self-report measure designed to assess experiential avoidance as conceptualized in Acceptance and Commitment Therapy (ACT). Study 1, among bereaved adults, showed that a

  12. The effect of the spatial positioning of items on the reliability of questionnaires measuring affect

    Directory of Open Access Journals (Sweden)

    Leigh Leo

    2016-08-01

    Orientation: Extant research has shown that the relationship between spatial location and affect may have pervasive effects on evaluation. In particular, experimental findings on embodied cognition indicate that a person is spatially orientated to position what is positive at the top and what is negative at the bottom (vertical spatial orientation), and to a lesser extent, to position what is positive on the left and what is negative on the right (horizontal spatial orientation). It is therefore hypothesised that when there is congruence between a respondent’s spatial orientation (related to affect) and the spatial positioning (layout) of a questionnaire, the reliability will be higher than in the case of incongruence. Research purpose: The principal objective of the two studies reported here was to ascertain the extent to which congruence between a respondent’s spatial orientation (related to affect) and the layout of the questionnaire (spatial positioning of questionnaire items) may impact on the reliability of a questionnaire measuring affect. Motivation for the study: The spatial position of items on a questionnaire measuring affect may indirectly impact on the reliability of the questionnaire. Research approach, design and method: In both studies, a controlled experimental research design was conducted using a sample of university students (n = 1825). Major findings: In both experiments, evidence was found to support the hypothesis that greater congruence between a respondent’s spatial orientation (related to affect) and the spatial positioning (layout) of a questionnaire leads to higher reliability on a questionnaire measuring affect. Practical implications: These findings may serve to create awareness of the influence of the spatial positioning of items as a confounding variable in questionnaire design. Contribution/value-add: Overall, this research complements previous studies by confirming the metaphorical representation of affect and

  13. Psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for use with electronic cigarettes.

    Science.gov (United States)

    Morean, Meghan; Krishnan-Sarin, Suchitra; Sussman, Steve; Foulds, Jonathan; Fishbein, Howard; Grana, Rachel; O'Malley, Stephanie S

    2018-01-02

    Psychometrically sound measures of e-cigarette dependence are lacking. We modified the PROMIS Nicotine Dependence Item Banks for use with e-cigarettes and evaluated the psychometrics of the 22-, 8- and 4-item adapted versions. A total of 1009 adults who reported using e-cigarettes at least weekly completed an anonymous survey in Summer 2016 (50.2% male, 77.1% White, mean age 35.81 [10.71], 66.4% daily e-cigarette users, 72.6% current cigarette smokers). Psychometric analyses included confirmatory factor analysis, internal consistency, measurement invariance, examination of mean-level differences, convergent validity, and test-criterion relationships with e-cigarette use outcomes. All PROMIS-E versions had confirmable, internally consistent latent structures that were scalar invariant by sex, race, e-cigarette use (non-daily/daily), e-liquid nicotine content (no/yes), and current cigarette smoking status (no/yes). Daily e-cigarette users, nicotine e-liquid users, and cigarette smokers reported being more dependent on e-cigarettes than their counterparts. All PROMIS-E versions correlated strongly with one another, evidenced convergent validity with the Penn State E-cigarette Dependence Index and time to first e-cigarette use in the morning, and evidenced test-criterion relationships with vaping frequency, e-liquid nicotine concentration, and e-cigarette quit attempts. Similar results were observed when analyses were conducted within subsamples of exclusive e-cigarette users and dual users of cigarettes and e-cigarettes. Each PROMIS-E version evidenced strong psychometric properties for assessing e-cigarette dependence in adults who either use e-cigarettes exclusively or are dual users of cigarettes and e-cigarettes. However, results indicated little benefit of the longer versions over the 4-item PROMIS-E, which provides an efficient assessment of e-cigarette dependence. The availability of the novel, psychometrically sound PROMIS-E can further research on a wide range of

  14. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; van Rhenen, W.; Groothoff, J.W.; van der Klink, J.J.L.; Twisk, J.W.R.; Heymans, M.W.

    2014-01-01

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  15. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, Corne A. M.; van Rhenen, Willem; Groothoff, Johan W.; van der Klink, Jac J. L.; Twisk, Jos W. R.; Heymans, Martijn W.

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  16. Psychometric Properties of the Heart Disease Knowledge Scale: Evidence from Item and Confirmatory Factor Analyses.

    Science.gov (United States)

    Lim, Bee Chiu; Kueh, Yee Cheng; Arifin, Wan Nor; Ng, Kok Huan

    2016-07-01

    Heart disease knowledge is an important concept for health education, yet there is a lack of evidence on properly validated instruments used to measure levels of heart disease knowledge in the Malaysian context. A cross-sectional survey design was used to examine the psychometric properties of the adapted English version of the Heart Disease Knowledge Questionnaire (HDKQ). Using proportionate cluster sampling, 788 undergraduate students at Universiti Sains Malaysia, Malaysia, were recruited and completed the HDKQ. Item analysis and confirmatory factor analysis (CFA) were used for the psychometric evaluation. Construct validity of the measurement model was included. Most of the students were Malay (48%), female (71%), and from the field of science (51%). An acceptable range was obtained with respect to both the difficulty and discrimination indices in the item analysis results. The difficulty index ranged from 0.12 to 0.91, and discrimination indices of ≥ 0.20 were reported for the final retained 23 items. The final CFA model showed an adequate fit to the data, yielding a 23-item, one-factor model [weighted least squares mean and variance adjusted scaled chi-square difference = 1.22, degrees of freedom = 2, P-value = 0.544, the root mean square error of approximation = 0.03 (90% confidence interval = 0.03, 0.04); close-fit P-value > 0.950]. Adequate psychometric values were obtained for Malaysian undergraduate university students using the 23-item, one-factor model of the adapted HDKQ.
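
The two item-analysis indices mentioned above can be sketched as follows. The abstract does not say how the discrimination index was computed, so this example assumes the common upper-group minus lower-group definition; the function name, the `fraction` parameter, and the response matrix are all hypothetical.

```python
def item_analysis(responses, fraction=0.27):
    """Return (difficulty, discrimination) per item for 0/1 scored responses.

    difficulty     = proportion of all respondents answering correctly
    discrimination = proportion correct in the top-scoring group minus the
                     proportion correct in the bottom-scoring group
                     (assumed definition, not confirmed by the abstract)
    """
    n = len(responses)
    n_items = len(responses[0])
    group = max(1, round(fraction * n))
    # Rank respondents by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    upper, lower = ranked[:group], ranked[-group:]
    results = []
    for j in range(n_items):
        difficulty = sum(r[j] for r in responses) / n
        p_upper = sum(r[j] for r in upper) / group
        p_lower = sum(r[j] for r in lower) / group
        results.append((difficulty, p_upper - p_lower))
    return results


# Hypothetical data: 4 respondents (rows) answering 3 items (columns).
example_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
example_indices = item_analysis(example_responses, fraction=0.5)
print(example_indices)
```

With a retention rule like the one reported (discrimination of at least 0.20), the third hypothetical item would be dropped because both groups answer it correctly at the same rate.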

  17. The Single-Item Math Anxiety Scale: An Alternative Way of Measuring Mathematical Anxiety

    Science.gov (United States)

    Núñez-Peña, M. Isabel; Guilera, Georgina; Suárez-Pellicioni, Macarena

    2014-01-01

    This study examined whether the Single-Item Math Anxiety Scale (SIMA), based on the item suggested by Ashcraft, provided valid and reliable scores of mathematical anxiety. A large sample of university students (n = 279) was administered the SIMA and the 25-item Shortened Math Anxiety Rating Scale (sMARS) to evaluate the relation between the scores…

  18. Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults

    Directory of Open Access Journals (Sweden)

    Shinichiro Tomitaka

    2017-02-01

    Background: Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods: Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The pattern of total score distribution and item responses were analyzed using graphical analysis and an exponential regression model. Results: The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while the “none of the time” response was not related to this exponential pattern. Discussion: The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
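
One simple way to estimate the rate parameter of an exponential score distribution is a least-squares line through the log-transformed frequencies. The sketch below is only an illustration of that idea, not the authors' procedure: the function name and the frequencies are hypothetical, and the lowest score is left out because the abstract notes that the lower end departs from the exponential pattern.

```python
import math


def fit_exponential(scores, counts):
    """Fit count ~ scale * exp(-rate * score) by least squares on log(count)."""
    points = [(x, math.log(y)) for x, y in zip(scores, counts) if y > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Hypothetical frequencies generated from an exact exponential, for illustration.
example_scores = list(range(1, 7))
example_counts = [1000.0 * math.exp(-0.5 * s) for s in example_scores]
example_rate, example_scale = fit_exponential(example_scores, example_counts)
print(example_rate, example_scale)
```

On a semi-log plot the same data fall on a straight line whose slope is the negative of the rate, which is how similar rate parameters across subsamples can be compared visually.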

  19. Market survey of level measurement equipment

    International Nuclear Information System (INIS)

    Anon.

    1993-01-01

    A market survey of level measurement equipment compiles data from 42 manufacturers. The equipment covered is based on different measurement principles and is used for different applications.

  20. Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory

    Directory of Open Access Journals (Sweden)

    Fajrianthi

    2017-11-01

    Fajrianthi (Department of Industrial and Organizational Psychology) and Rizqy Amelia Zein (Department of Personality and Social Psychology), Faculty of Psychology, Universitas Airlangga, Surabaya, East Java, Indonesia. Abstract: This study aimed to develop an emotional intelligence (EI) test that is suitable to the Indonesian workplace context. Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: 1) emotional appraisal, 2) emotional recognition, and 3) emotional regulation. TKEA consisted of 120 items with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 for subset 2 (ability level = -2), and 2.398 for subset 3 (level of ability = -2). It is concluded that TKEA performs very well to measure individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA’s item analysis and dimensionality test of each TKEA subset. Keywords: categorical confirmatory factor analysis, emotional intelligence, item response theory

  1. Comparison of Self-Reported Telephone Interviewing and Web-Based Survey Responses: Findings From the Second Australian Young and Well National Survey

    Science.gov (United States)

    Davenport, Tracey A; Burns, Jane M; Hickie, Ian B

    2017-01-01

    Background: Web-based self-report surveying has increased in popularity, as it can rapidly yield large samples at a low cost. Despite this increase in popularity, in the area of youth mental health, there is a distinct lack of research comparing the results of Web-based self-report surveys with the more traditional and widely accepted computer-assisted telephone interviewing (CATI). Objective: The Second Australian Young and Well National Survey 2014 sought to compare differences in respondent response patterns using matched items on CATI versus a Web-based self-report survey. The aim of this study was to examine whether responses varied as a result of item sensitivity, that is, the item’s susceptibility to exaggeration or underreporting, and to assess whether certain subgroups demonstrated this effect to a greater extent. Methods: A subsample of young people aged 16 to 25 years (N=101), recruited through the Second Australian Young and Well National Survey 2014, completed the identical items on two occasions: via CATI and via Web-based self-report survey. Respondents also rated perceived item sensitivity. Results: When comparing CATI with the Web-based self-report survey, a Wilcoxon signed-rank analysis showed that respondents answered 14 of the 42 matched items in a significantly different way. Significant variation in responses (CATI vs Web-based) was more frequent if the item was also rated by the respondents as highly sensitive in nature. Specifically, 63% (5/8) of the high sensitivity items, 43% (3/7) of the neutral sensitivity items, and 0% (0/4) of the low sensitivity items were answered in a significantly different manner by respondents when comparing their matched CATI and Web-based question responses. The items that were perceived as highly sensitive by respondents and demonstrated response variability included the following: sexting activities, body image concerns, experience of diagnosis, and suicidal ideation. For high sensitivity items, a regression

  2. Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

    Science.gov (United States)

    Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

    2017-06-01

    About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluated dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluated differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.

  3. Measuring health literacy in populations: illuminating the design and development process of the European Health Literacy Survey Questionnaire (HLS-EU-Q).

    Science.gov (United States)

    Sørensen, Kristine; Van den Broucke, Stephan; Pelikan, Jürgen M; Fullam, James; Doyle, Gerardine; Slonska, Zofia; Kondilis, Barbara; Stoffels, Vivian; Osborne, Richard H; Brand, Helmut

    2013-10-10

    Several measurement tools have been developed to measure health literacy. The tools vary in their approach and design, but few have focused on comprehensive health literacy in populations. This paper describes the design and development of the European Health Literacy Survey Questionnaire (HLS-EU-Q), an innovative, comprehensive tool to measure health literacy in populations. Based on a conceptual model and definition, the process involved item development, pre-testing, field-testing, external consultation, plain language check, and translation from English to Bulgarian, Dutch, German, Greek, Polish, and Spanish. The development process resulted in the HLS-EU-Q, which entailed two sections: a core health literacy section and a section on determinants and outcomes associated with health literacy. The health literacy section included 47 items addressing self-reported difficulties in accessing, understanding, appraising and applying information in tasks concerning decision making in healthcare, disease prevention, and health promotion. The second section included items related to health behaviour, health status, health service use, community participation, and socio-demographic and socio-economic factors. By illuminating the detailed steps in the design and development process of the HLS-EU-Q, the aim is to provide a deeper understanding of its purpose, its capability and its limitations for others using the tool. By stimulating a wide application, the vision is that the HLS-EU-Q will be validated in more countries to enhance the understanding of health literacy in different populations.

  4. Efficient Algorithms for Segmentation of Item-Set Time Series

    Science.gov (United States)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.

  5. Quality of life assessed with the medical outcomes study short form 36-item health survey of patients on renal replacement therapy: A systematic review and meta-analysis

    NARCIS (Netherlands)

    Y.S. Liem (Ylian Serina); J.L. Bosch (Johanna); L.R. Arends (Lidia); M.H. Heijenbrok-Kal (Majanka); M.G.M. Hunink (Myriam)

    2007-01-01

    Objectives: The Medical Outcomes Study Short Form 36-Item Health Survey (SF-36) is the most widely used generic instrument to estimate quality of life of patients on renal replacement therapy. The purpose of this study was to summarize and compare the published literature on quality of

  6. Assessing the Equivalence of Paper, Mobile Phone, and Tablet Survey Responses at a Community Mental Health Center Using Equivalent Halves of a 'Gold-Standard' Depression Item Bank.

    Science.gov (United States)

    Brodey, Benjamin B; Gonzalez, Nicole L; Elkin, Kathryn Ann; Sasiela, W Jordan; Brodey, Inger S

    2017-09-06

    The computerized administration of self-report psychiatric diagnostic and outcomes assessments has risen in popularity. If results are similar enough across different administration modalities, then new administration technologies can be used interchangeably and the choice of technology can be based on other factors, such as convenience in the study design. An assessment based on item response theory (IRT), such as the Patient-Reported Outcomes Measurement Information System (PROMIS) depression item bank, offers new possibilities for assessing the effect of technology choice upon results. The objective was to create equivalent halves of the PROMIS depression item bank and to use these halves to compare survey responses and user satisfaction among administration modalities (paper, mobile phone, or tablet) in a community mental health care population. The 28 PROMIS depression items were divided into 2 halves based on content and simulations with an established PROMIS response data set. A total of 129 participants were recruited from an outpatient public sector mental health clinic based in Memphis. All participants took both nonoverlapping halves of the PROMIS IRT-based depression items (Part A and Part B): once using paper and pencil, and once using either a mobile phone or tablet. An 8-cell randomization was done on technology used, order of technologies used, and order of PROMIS Parts A and B. Both Parts A and B were administered as fixed-length assessments and both were scored using published PROMIS IRT parameters and algorithms. All 129 participants received either Part A or B via paper assessment. Participants were also administered the opposite assessment, 63 using a mobile phone and 66 using a tablet. There was no significant difference in item response scores for Part A versus B. All 3 of the technologies yielded essentially identical assessment results and equivalent satisfaction levels. Our findings show that the PROMIS depression assessment can be divided into 2 equivalent

  7. Gender Invariance of the Gambling Behavior Scale for Adolescents (GBS-A): An Analysis of Differential Item Functioning Using Item Response Theory.

    Science.gov (United States)

    Donati, Maria Anna; Chiesi, Francesca; Izzo, Viola A; Primi, Caterina

    2017-01-01

    Because there is a lack of evidence attesting to equivalent item functioning across genders for the most widely used instruments measuring pathological gambling in adolescence, the present study aimed to test the gender invariance of the Gambling Behavior Scale for Adolescents (GBS-A), a new measurement tool to assess the severity of Gambling Disorder (GD) in adolescents. The equivalence of the items across genders was assessed by analyzing Differential Item Functioning within an Item Response Theory framework. The GBS-A was administered to 1,723 adolescents, and the graded response model was employed. The results attested to the measurement equivalence of the GBS-A when administered to male and female adolescent gamblers. Overall, findings provided evidence that the GBS-A is an effective measurement tool of the severity of GD in male and female adolescents and that the scale is unbiased and able to reveal true gender differences. As such, the GBS-A can be profitably used in educational interventions and clinical treatments with young people.

  8. Difference in method of administration did not significantly impact item response

    DEFF Research Database (Denmark)

    Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

    2014-01-01

    PURPOSE: To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS). METHODS: Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function...... assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods. RESULTS: Multigroup...... levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.

  9. Communicating Quantitative Literacy: An Examination of Open-Ended Assessment Items in TIMSS, NALS, IALS, and PISA

    Directory of Open Access Journals (Sweden)

    Karl W. Kosko

    2011-07-01

    Quantitative Literacy (QL) has been described as the skill set an individual uses when interacting with the world in a quantitative manner. A necessary component of this interaction is communication. To this end, assessments of QL have included open-ended items as a means of including communicative aspects of QL. The present study sought to examine whether such open-ended items typically measured aspects of quantitative communication, as compared to mathematical communication, or mathematical skills. We focused on publicly released items and rubrics from four of the most widely referenced assessments: the Third International Mathematics and Science Study (TIMSS-95); the National Adult Literacy Survey (NALS; now the National Assessment of Adult Literacy, NAAL) in 1985 and 1992; the International Adult Literacy Skills (IALS) beginning in 1994; and the Program for International Student Assessment (PISA) beginning in 2000. We found that open-ended item rubrics in these QL assessments showed a strong tendency to assess answer-only responses. Therefore, while some open-ended items may have required certain levels of quantitative reasoning to find a solution, it is the solution rather than the reasoning that was often assessed.

  10. Development and Validation of a Novel Generic Health-related Quality of Life Instrument With 20 Items (HINT-20)

    Directory of Open Access Journals (Sweden)

    Min-Woo Jo

    2017-01-01

    Objectives: Few attempts have been made to develop a generic health-related quality of life (HRQoL) instrument and to examine its validity and reliability in Korea. We aimed to do this in our present study. Methods: After a literature review of existing generic HRQoL instruments, a focus group discussion, in-depth interviews, and expert consultations, we selected 30 tentative items for a new HRQoL measure. These items were evaluated by assessing their ceiling effects, difficulty, and redundancy in the first survey. To validate the HRQoL instrument that was developed, known-groups validity and convergent/discriminant validity were evaluated and its test-retest reliability was examined in the second survey. Results: Of the 30 items originally assessed for the HRQoL instrument, four were excluded due to high ceiling effects and six were removed due to redundancy. We ultimately developed a HRQoL instrument with a reduced number of 20 items, known as the Health-related Quality of Life Instrument with 20 items (HINT-20), incorporating physical, mental, social, and positive health dimensions. The results of the HINT-20 for known-groups validity were poorer in women, the elderly, and those with a low income. For convergent/discriminant validity, the correlation coefficients of items (except vitality) in the physical health dimension with the physical component summary of the Short Form 36 version 2 (SF-36v2) were generally higher than the correlations of those items with the mental component summary of the SF-36v2, and vice versa. Regarding test-retest reliability, the intraclass correlation coefficient of the total HINT-20 score was 0.813 (p<0.001). Conclusions: A novel generic HRQoL instrument, the HINT-20, was developed for the Korean general population and showed acceptable validity and reliability.

  11. On Becoming Trauma-Informed: Role of the Adverse Childhood Experiences Survey in Tertiary Child and Adolescent Mental Health Services and the Association with Standard Measures of Impairment and Severity.

    Science.gov (United States)

    Rahman, Abdul; Perri, Andrea; Deegan, Avril; Kuntz, Jennifer; Cawthorpe, David

    2018-01-01

    There is a movement toward trauma-informed, trauma-focused psychiatric treatment. The objective was to examine Adverse Childhood Experiences (ACE) survey items by sex, and ACE total scores by sex against clinical measures of impairment, in order to assess the clinical utility of the ACE survey as an index of trauma in a child and adolescent mental health care setting. Descriptive, polychoric factor analysis and regression analyses were employed to analyze cross-sectional ACE surveys (N = 2833) and registration-linked data using past admissions (N = 10,400) collected from November 2016 to March 2017 related to clinical data (28 independent variables), taking into account multicollinearity. Distinct ACE items emerged for males, females, and those with self-identified sex and for ACE total scores in regression analysis. In hierarchical regression analysis, the final models consisting of standard clinical measures and demographic and system variables (e.g., repeated admissions) were associated with substantial ACE total score variance for females (44%) and males (38%). Inadequate sample size foreclosed on developing a reduced multivariable model for the self-identified sex group. The ACE scores relate to independent clinical measures and system and demographic variables. There are implications for clinical practice. For example, a child presenting with anxiety and a high ACE score likely requires treatment that is different from a child presenting with anxiety and an ACE score of zero. The ACE survey score is an important index of presenting clinical status that guides patient care planning and intervention in the progress toward a trauma-focused system of care.

  12. Face Validity of the Single Work Ability Item: Comparison with Objectively Measured Heart Rate Reserve over Several Days

    Science.gov (United States)

    Gupta, Nidhi; Jensen, Bjørn Søvsø; Søgaard, Karen; Carneiro, Isabella Gomes; Christiansen, Caroline Stordal; Hanisch, Christiana; Holtermann, Andreas

    2014-01-01

    Purpose: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. Methods: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18–65 years from the cross-sectional “New method for Objective Measurements of physical Activity in Daily living (NOMAD)” study. The workers reported their single item work ability and completed an aerobic capacity cycling test and objective measurements of heart rate reserve monitored with Actiheart for 3–4 days with a total of 5,810 h, including 2,640 working hours. Results: A significant moderate correlation between work ability and %HRR was observed among males (R = −0.33, P = 0.005), but not among females (R = 0.11, P = 0.431). In a gender-stratified multi-adjusted logistic regression analysis, males with high %HRR were more likely to report a reduced work ability compared to males with low %HRR [OR = 4.75, 95% confidence interval (95% CI) = 1.31 to 17.25]. However, this association was not found among females (OR = 0.26, 95% CI 0.03 to 2.16), and a significant interaction between work ability, %HRR and gender was observed (P = 0.03). Conclusions: The observed association between work ability and objectively measured %HRR over several days among male blue-collar workers supports the face validity of the single work ability item. It is a useful and valid measure of the relation between physical work demands and resources among male blue-collar workers. The contrasting association among females needs to be further investigated. PMID:24840350

  13. Face Validity of the Single Work Ability Item: Comparison with Objectively Measured Heart Rate Reserve over Several Days

    Directory of Open Access Journals (Sweden)

    Nidhi Gupta

    2014-05-01

    Purpose: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. Methods: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18–65 years from the cross-sectional “New method for Objective Measurements of physical Activity in Daily living (NOMAD)” study. The workers reported their single item work ability and completed an aerobic capacity cycling test and objective measurements of heart rate reserve monitored with Actiheart for 3–4 days with a total of 5,810 h, including 2,640 working hours. Results: A significant moderate correlation between work ability and %HRR was observed among males (R = −0.33, P = 0.005), but not among females (R = 0.11, P = 0.431). In a gender-stratified multi-adjusted logistic regression analysis, males with high %HRR were more likely to report a reduced work ability compared to males with low %HRR [OR = 4.75, 95% confidence interval (95% CI) = 1.31 to 17.25]. However, this association was not found among females (OR = 0.26, 95% CI 0.03 to 2.16), and a significant interaction between work ability, %HRR and gender was observed (P = 0.03). Conclusions: The observed association between work ability and objectively measured %HRR over several days among male blue-collar workers supports the face validity of the single work ability item. It is a useful and valid measure of the relation between physical work demands and resources among male blue-collar workers. The contrasting association among females needs to be further investigated.

  14. Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.

    Science.gov (United States)

    Eichenbaum, Alexander E; Marcus, David K; French, Brian F

    2017-06-01

    This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.

  15. Item Banks for Substance Use from the Patient-Reported Outcomes Measurement Information System (PROMIS®): Severity of Use and Positive Appeal of Use

    Science.gov (United States)

    Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis

    2015-01-01

    Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
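    The two-parameter graded response model named in this abstract can be sketched as follows. This is a minimal illustration, not the PROMIS calibration code: the function name `grm_category_probs` and every parameter value are hypothetical, chosen only to show how one discrimination parameter and four ordered thresholds yield probabilities for 5 response options.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a two-parameter graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: increasing list b_1 < ... < b_{K-1} for K ordered categories.
    """
    # Cumulative curves P(X >= k | theta), padded with P(X >= 0) = 1 and P(X >= K) = 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical item with 5 response options (4 thresholds); values are illustrative only.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

    Because the thresholds must be increasing, the cumulative curves are ordered and every category probability is non-negative; the probabilities sum to 1 by construction.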

  16. An empirical comparison of Item Response Theory and Classical Test Theory

    Directory of Open Access Journals (Sweden)

    Špela Progar

    2008-11-01

    Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as much invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regard to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.
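    For readers unfamiliar with the CTT item parameters compared in this record, the sketch below computes the two usual ones for dichotomously scored items: difficulty (proportion correct) and discrimination (correlation of the item with the total of the remaining items). The abstract does not say which CTT statistics the study used, so this choice is an assumption; the helper names and the response matrix are hypothetical and unrelated to the TIMSS data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ctt_item_stats(responses):
    """Return (difficulty, discrimination) per item for 0/1 scored responses.

    difficulty: proportion of respondents answering the item correctly.
    discrimination: correlation of the item with the total of the remaining items.
    """
    n_items = len(responses[0])
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        stats.append((sum(item) / len(item), pearson(item, rest)))
    return stats

# Hypothetical 6 respondents x 3 items (illustrative data, not from TIMSS).
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
item_stats = ctt_item_stats(responses)
for difficulty, discrimination in item_stats:
    print(round(difficulty, 3), round(discrimination, 3))
```

    Both statistics depend on the sample of respondents, which is why their invariance across groups is an empirical question rather than a model property.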

  17. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  18. Single-item measure for assessing quality of life in children with drug-resistant epilepsy.

    Science.gov (United States)

    Conway, Lauryn; Widjaja, Elysa; Smith, Mary Lou

    2018-03-01

    The current study investigated the psychometric properties of a single-item quality of life (QOL) measure, the Global Quality of Life in Childhood Epilepsy question (G-QOLCE), in children with drug-resistant epilepsy. Data came from the Impact of Pediatric Epilepsy Surgery on Health-Related Quality of Life Study (PESQOL), a multicenter prospective cohort study (n = 118) with observations collected at baseline and at 6 months of follow-up on children aged 4-18 years. QOL was measured with the QOLCE-76 and KIDSCREEN-27. The G-QOLCE was an overall QOL question derived from the QOLCE-76. Construct validity and reliability were assessed with Spearman's correlation and intraclass correlation coefficient (ICC). Responsiveness was examined through distribution-based and anchor-based methods. The G-QOLCE showed moderate (r ≥ 0.30) to strong (r ≥ 0.50) correlations with composite scores, and most subscales of the QOLCE-76 and KIDSCREEN-27 at baseline and 6-month follow-up. The G-QOLCE had moderate test-retest reliability (ICC range: 0.49-0.72) and was able to detect clinically important change in patients' QOL (standardized response mean: 0.38; probability of change: 0.65; Guyatt's responsiveness statistics: 0.62 and 0.78). Caregiver anxiety and family functioning contributed most strongly to G-QOLCE scores over time. Results offer promising preliminary evidence regarding the validity, reliability, and responsiveness of the proposed single-item QOL measure. The G-QOLCE is a potentially useful tool that can be feasibly administered in a busy clinical setting to evaluate clinical status and impact of treatment outcomes in pediatric epilepsy.

  19. Procurement Engineering Process for Commercial Grade Item Dedication

    International Nuclear Information System (INIS)

    Park, Jong-Hyuck; Park, Jong-Eun; Kwak, Tack-Hun; Yoo, Keun-Bae; Lee, Sang-Guk; Hong, Sung-Yull

    2006-01-01

    The Procurement Engineering Process for commercial grade item dedication plays an increasingly important role in the operation management of nuclear power plants in Korea. The purpose of the Procurement Engineering Process is the provision and assurance of a high quality and quantity of spare, replacement, retrofit and new parts and equipment while maximizing plant availability, minimizing downtime due to parts unavailability and providing reasonable overall program and inventory cost. In this paper, we review the overall requirements, responsibilities and process for demonstrating with reasonable assurance that a procured item for potential nuclear safety-related service or other essential plant service is adequate for its application. This paper does not cover the details of technical evaluation, selecting critical characteristics, selecting acceptance methods, performing failure modes and effects analysis, performing source surveillance, performing quality surveys, performing special tests and inspections, and the other aspects of effective Procurement Engineering and Commercial Grade Item Dedication. The main contribution of this paper is to provide an overview of the Procurement Engineering Process for commercial grade item

  20. Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

    Science.gov (United States)

    LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

    2015-04-01

    Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
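    The Mantel-Haenszel approach mentioned in this abstract can be sketched as follows for a dichotomously scored item. This is a simplified illustration under stated assumptions: the abstract does not say how the multi-option SHAI items were handled, and the function name, strata, and counts below are hypothetical. The common odds ratio is pooled across matched total-score strata and then expressed on the ETS delta scale.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across score strata.

    strata: list of (a, b, c, d) counts per matched score level, where
    a/b = reference group endorsing / not endorsing the item,
    c/d = focal group endorsing / not endorsing the item.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical counts for three total-score strata (illustrative only).
strata = [(20, 30, 10, 40), (35, 15, 25, 25), (45, 5, 40, 10)]
alpha_mh = mantel_haenszel_odds_ratio(strata)
# ETS delta metric: negative values indicate the item favors the reference group.
delta_mh = -2.35 * math.log(alpha_mh)
print(round(alpha_mh, 3), round(delta_mh, 3))
```

    An odds ratio near 1 (delta near 0) suggests that respondents matched on total score endorse the item at similar rates in both groups, that is, little or no uniform DIF.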

  1. 2012 Workplace and Gender Relations Survey of Active Duty Members. Survey Note and Briefing

    Science.gov (United States)

    2013-03-15

    items regarding unwanted attempts to establish a sexual relationship – Sexual Coercion – four items regarding classic quid pro quo instances of special...continues to emphasize sexual assault and sexual harassment response and prevention in the military. This survey note discusses findings from the... harassment in the active duty force. This survey note and accompanying briefing (Appendix) provide information on the prevalence rates of sexual

  2. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  3. Exploring differential item functioning in the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)

    Directory of Open Access Journals (Sweden)

    Pollard Beth

    2012-12-01

    Background: The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods: The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results: After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions: Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or especially age effects, are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the

  4. Validation of a mobility item bank for older patients in primary care.

    Science.gov (United States)

    Cabrero-García, Julio; Ramos-Pichardo, Juan Diego; Muñoz-Mendoza, Carmen Luz; Cabañero-Martínez, María José; González-Llopis, Lorena; Reig-Ferrer, Abilio

    2012-12-05

    To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

  5. A symptom profile of depression among Asian Americans: is there evidence for differential item functioning of depressive symptoms?

    Science.gov (United States)

    Kalibatseva, Z; Leong, F T L; Ham, E H

    2014-09-01

    Theoretical and clinical publications suggest the existence of cultural differences in the expression and experience of depression. Measurement non-equivalence remains a potential methodological explanation for the lower prevalence of depression among Asian Americans compared to European Americans. This study compared DSM-IV depressive symptoms among Asian Americans and European Americans using secondary data analysis of the Collaborative Psychiatric Epidemiology Surveys (CPES). The Composite International Diagnostic Interview (CIDI) was used for the assessment of depressive symptoms. Of the entire sample, 310 Asian Americans and 1974 European Americans reported depressive symptoms and were included in the analyses. Measurement variance was examined with an item response theory differential item functioning (IRT DIF) analysis. χ2 analyses indicated that, compared to Asian Americans, European American participants more frequently endorsed affective symptoms such as 'feeling depressed', 'feeling discouraged' and 'cried more often'. The IRT analysis detected DIF for four out of the 15 depression symptom items. At equal levels of depression, Asian Americans endorsed feeling worthless and appetite changes more easily than European Americans, and European Americans endorsed feeling nervous and crying more often than Asian Americans. Asian Americans did not seem to over-report somatic symptoms; however, European Americans seemed to report more affective symptoms than Asian Americans. The results suggest that there was measurement variance in a few of the depression items.

  6. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made. Opsomming (translated from the Afrikaans): The factor analysis of items often yields misleading results, particularly in that unidimensional scales appear multidimensional. These results can often be attributed to a failure to meet the assumptions of linearity and normality on which factor analysis rests. Item response theory, which is explicitly designed for modelling the non-linear relations between ordinal items, offers an attractive alternative to the factor analysis of items. Items can also be grouped into parcels that are more likely to satisfy the assumptions of factor analysis than individual items. The use of the Rasch rating scale model and the factor analysis of parcels is demonstrated using data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through a factor analysis of the individual items. The results indicate that the Rasch

  7. Uncontrolled Web-based administration of surveys on factual health-related knowledge: a randomized study of untimed versus timed quizzing.

    Science.gov (United States)

    Domnich, Alexander; Panatto, Donatella; Signori, Alessio; Bragazzi, Nicola Luigi; Cristina, Maria Luisa; Amicizia, Daniela; Gasparini, Roberto

    2015-04-13

    Health knowledge and literacy are among the main determinants of health. Assessment of these issues via Web-based surveys is growing continuously. Research has suggested that approximately one-fifth of respondents submit cribbed answers, or cheat, on factual knowledge items, which may lead to measurement error. However, little is known about methods of discouraging cheating in Web-based surveys on health knowledge. This study aimed at exploring the usefulness of imposing a survey time limit to prevent help-seeking and cheating. On the basis of sample size estimation, 94 undergraduate students were randomly assigned in a 1:1 ratio to complete a Web-based survey on nutrition knowledge, with or without a time limit of 15 minutes (30 seconds per item); the topic of nutrition was chosen because of its particular relevance to public health. The questionnaire consisted of two parts. The first was the validated consumer-oriented nutrition knowledge scale (CoNKS) consisting of 20 true/false items; the second was an ad hoc questionnaire (AHQ) containing 10 questions that would be very difficult for people without health care qualifications to answer correctly. It therefore aimed at measuring cribbing and not nutrition knowledge. AHQ items were somewhat encyclopedic and amenable to Web searching, while CoNKS items had more complex wording, so that simple copying/pasting of a question in a search string would not produce an immediate correct answer. A total of 72 of the 94 subjects started the survey. Dropout rates were similar in both groups (11%, 4/35 and 14%, 5/37 in the untimed and timed groups, respectively). Most participants completed the survey from portable devices, such as mobile phones and tablets. To complete the survey, participants in the untimed group took a median 2.3 minutes longer than those in the timed group; the effect size was small (Cohen's r=.29). 
Subjects in the untimed group scored significantly higher on CoNKS (mean difference of 1.2 points, P=.008

  8. The measurement of cyberbullying: dimensional structure and relative item severity and discrimination.

    Science.gov (United States)

    Menesini, Ersilia; Nocentini, Annalaura; Calussi, Pamela

    2011-05-01

    In relation to a sample of 1,092 Italian adolescents (50.9% females), the present study aims to: (a) analyze the most parsimonious structure of the cyberbullying and cybervictimization construct in male and female Italian adolescents through confirmatory factor analysis; and (b) analyze the severity and the discrimination parameters of each act using the item response theory. Results showed that the structure of the cyberbullying scale for perpetrated and received behaviors in both genders could best be represented by a monodimensional model where each item lies on a continuum of severity of aggressive acts. For both genders, the less severe acts are silent/prank calls and insults on instant messaging, and the most severe acts are unpleasant pictures/photos on Web sites, phone pictures/photos/videos of intimate scenes, and phone pictures/photos/videos of violent scenes. The items nasty text messages, nasty or rude e-mails, insults on Web sites, insults in chatrooms, and insults on blogs range from moderate to high levels of severity. Regarding the discrimination level of the acts, several items emerged as good indicators at various levels of cyberbullying and cybervictimization severity, with the exception of silent/prank calls. Furthermore, gender specificities underlined that the visual items can be considered good indicators of severe cyberbullies and cybervictims only in males. This information can help in understanding better the nature of the phenomenon, its severity in a given population, and to plan more specific prevention and intervention strategies.

  9. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (kernel smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to determine whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.

  10. A survey of anatomical items relevant to the practice of rheumatology: upper extremity, head, neck, spine, and general concepts.

    Science.gov (United States)

    Villaseñor-Ovies, Pablo; Navarro-Zarza, José Eduardo; Saavedra, Miguel Ángel; Hernández-Díaz, Cristina; Canoso, Juan J; Biundo, Joseph J; Kalish, Robert A; de Toro Santos, Francisco Javier; McGonagle, Dennis; Carette, Simon; Alvarez-Nemegyei, José

    2016-12-01

    This study aimed to identify the anatomical items of the upper extremity and spine that are potentially relevant to the practice of rheumatology. Ten rheumatologists interested in clinical anatomy who published, taught, and/or participated as active members of Clinical Anatomy Interest groups (six seniors, four juniors), participated in a one-round relevance Delphi exercise. An initial, 560-item list that included 45 (8.0 %) general concepts items; 138 (24.8 %) hand items; 100 (17.8 %) forearm and elbow items; 147 (26.2 %) shoulder items; and 130 (23.2 %) head, neck, and spine items was compiled by 5 of the participants. Each item was graded for importance with a Likert scale from 1 (not important) to 5 (very important). Thus, scores could range from 10 (1 × 10) to 50 (5 × 10). An item score of ≥40 was considered most relevant to competent practice as a rheumatologist. Mean item Likert scores ranged from 2.2 ± 0.5 to 4.6 ± 0.7. A total of 115 (20.5 %) of the 560 initial items reached relevance. Broken down by categories, this final relevant item list was composed by 7 (6.1 %) general concepts items; 32 (27.8 %) hand items; 20 (17.4 %) forearm and elbow items; 33 (28.7 %) shoulder items; and 23 (17.6 %) head, neck, and spine items. In this Delphi exercise, a group of practicing academic rheumatologists with an interest in clinical anatomy compiled a list of anatomical items that were deemed important to the practice of rheumatology. We suggest these items be considered curricular priorities when training rheumatology fellows in clinical anatomy skills and in programs of continuing rheumatology education.
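
    The scoring rule described in this abstract can be sketched as follows. Only the 1-5 Likert scale, the ten raters, and the cutoff of 40 come from the abstract; the function names and the two example items are hypothetical and are not data from the study.

```python
RELEVANCE_CUTOFF = 40  # from the abstract: an item score >= 40 counts as relevant

def item_score(ratings):
    """Sum the 1-5 Likert ratings given to one item by the ten raters."""
    if len(ratings) != 10 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("expected ten ratings on a 1-5 scale")
    return sum(ratings)

def is_relevant(ratings):
    """Apply the relevance cutoff to one item's summed score."""
    return item_score(ratings) >= RELEVANCE_CUTOFF

# Hypothetical ratings for two made-up items (not data from the study).
example_ratings = {
    "item A": [5, 4, 4, 5, 4, 4, 3, 5, 4, 4],  # sums to 42
    "item B": [3, 4, 2, 3, 4, 3, 3, 4, 2, 3],  # sums to 31
}
relevant = [name for name, r in example_ratings.items() if is_relevant(r)]
```

    With ten raters, the cutoff of 40 is equivalent to a mean rating of at least 4.0 per item.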

  11. Desenvolvimento de uma escala para medir o potencial empreendedor utilizando a Teoria da Resposta ao Item (TRI) / Development of a scale to measure the entrepreneurial potential using the Item Response Theory (IRT)

    Directory of Open Access Journals (Sweden)

    Luciano Ricardo Rath Alves

    2011-01-01

    Several variables are related to the development of entrepreneurial activities. An important one among them is the entrepreneurial agent. This study is one of many that contribute to the understanding of the entrepreneurial agent. In its line of thought, it upholds the idea that the entrepreneur has characteristics and personality traits that stand out from the general population and that are favorable to the success of the entrepreneurship. This study aims at developing a measurement scale for entrepreneurial potential using the Item Response Theory (IRT). The two-parameter logistic IRT model was used. Parameter estimates were obtained from a sample of 764 people who responded to an instrument composed of 103 items. The test information and standard error curves, together with the qualitative interpretation of the scale levels, made it possible to determine the most appropriate range for using the instrument. The results showed that the scale is best suited to assessing individuals with low to moderately high entrepreneurial potential. It is therefore suggested that new items be added to the instrument in order to measure and interpret even higher levels. IRT allows new items to be calibrated to measure entrepreneurs with high entrepreneurial potential while making use of the data already obtained. The items were generated by Santos (2008) based on a theoretical model...

  12. A Method for Individualizing the Prediction of Immunogenicity of Protein Vaccines and Biologic Therapeutics: Individualized T Cell Epitope Measure (iTEM)

    Directory of Open Access Journals (Sweden)

    Tobias Cohen

    2010-01-01

    The promise of pharmacogenomics depends on advancing predictive medicine. To address this need in the area of immunology, we developed the individualized T cell epitope measure (iTEM) tool to estimate an individual's T cell response to a protein antigen based on HLA binding predictions. In this study, we validated prospective iTEM predictions using data from in vitro and in vivo studies. We used a mathematical formula that converts DRB1* allele binding predictions generated by EpiMatrix, an epitope-mapping tool, into an allele-specific scoring system. We then demonstrated that iTEM can be used to define an HLA binding threshold above which immune response is likely and below which immune response is likely to be absent. iTEM's predictive power was strongest when the immune response is focused, such as in subunit vaccination and administration of protein therapeutics. iTEM may be a useful tool for clinical trial design and preclinical evaluation of vaccines and protein therapeutics.

  13. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
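
    The unimodal/bimodal condition can be checked numerically. The abstract is truncated and does not spell out its parameterisation, so the sketch below assumes the usual partial credit model with unit slope, scores 0, 1, 2 and step parameters delta1 and delta2; the function names and grid settings are illustrative choices.

```python
import math

def pcm_probs(theta, delta1, delta2):
    """Category probabilities (scores 0, 1, 2) of a trinary partial credit item."""
    # Cumulative sums of (theta - delta_k) give the unnormalised log-weights.
    logits = [0.0, theta - delta1, 2.0 * theta - delta1 - delta2]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def pcm_information(theta, delta1, delta2):
    """Item information, which here equals the variance of the item score."""
    p = pcm_probs(theta, delta1, delta2)
    mean = p[1] + 2.0 * p[2]
    return p[1] + 4.0 * p[2] - mean ** 2

def count_modes(delta1, delta2, lo=-10.0, hi=10.0, steps=2000):
    """Count strict interior local maxima of the information function on a grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    info = [pcm_information(t, delta1, delta2) for t in grid]
    return sum(1 for i in range(1, steps) if info[i - 1] < info[i] > info[i + 1])

threshold = 4.0 * math.log(2.0)       # about 2.77
modes_below = count_modes(-1.0, 1.0)  # delta2 - delta1 = 2.0, below the threshold
modes_above = count_modes(-2.0, 2.0)  # delta2 - delta1 = 4.0, above the threshold
```

    Under these assumptions the grid search finds one maximum for the narrower step spacing and two for the wider one, in line with the stated condition.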

  14. Examination of the PROMIS upper extremity item bank.

    Science.gov (United States)

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Study design: clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. Level of evidence: 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  15. MEASUREMENT OF FRICTIONAL PRESSURE DIFFERENTIALS DURING A VENTILATION SURVEY

    International Nuclear Information System (INIS)

    B.S. Prosser, PE; I.M. Loomis, PE, PhD

    2003-01-01

    During the course of a ventilation survey, both airflow quantity and frictional pressure losses are measured and quantified. The measurement of airflow has been extensively studied, as the vast majority of ventilation standards and regulations are tied to airflow quantity or velocity. However, during the conduct of a ventilation survey, the measurement of airflow represents only half of the parameters required to directly calculate the airway resistance. The measurement of frictional pressure loss is an often misunderstood and misapplied part of the ventilation survey. This paper compares the two basic methods of frictional pressure drop measurement: the barometer method and the gauge and tube method. Personal experiences with each method will be detailed along with the authors' opinions regarding the applicability and conditions favoring each method.
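
    To show why both measurements are needed, the sketch below computes an airway resistance from one pressure drop and one airflow reading. It assumes the square law p = R * Q^2 that is commonly used in mine ventilation; the abstract does not state which relation the authors use, and the readings are hypothetical.

```python
def airway_resistance(pressure_drop_pa, airflow_m3_per_s):
    """Resistance R (N*s^2/m^8) from the assumed square law p = R * Q**2."""
    if airflow_m3_per_s <= 0:
        raise ValueError("airflow must be positive")
    return pressure_drop_pa / airflow_m3_per_s ** 2

# Hypothetical survey readings for one airway (not taken from the paper).
r = airway_resistance(pressure_drop_pa=120.0, airflow_m3_per_s=40.0)
```

    Because the airflow enters squared, a relative error in the airflow reading affects the computed resistance about twice as strongly as the same relative error in the pressure reading.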

  16. Developing, testing, and implementing a survey of scientist mentoring teachers as part of an RET: The GABI RET mentor survey.

    Science.gov (United States)

    Davey, B.

    2017-12-01

    The impacts of mentoring in education have been well established. Mentors have a large impact on their mentees and have been shown to affect mentee attitudes towards learning, interest in subjects, future success, and more. While mentoring has a well-documented impact on the mentees, mentoring also has an impact on the mentors themselves. However, little has been studied empirically about these impacts. When we looked for a validated instrument that measured the impact of mentoring on the scientists working with the teachers, we found many anecdotal reports but no instruments that met our specific needs. To this end, we developed, tested, and implemented our own instrument for measuring the impacts of mentoring on our scientist mentors. Our instrument contained both quantitative and qualitative items designed to reveal the effects of mentoring in two areas: 1) cognitive domain (mentoring, teaching, understanding K-12) and 2) affective domain (professional, personal, participation). We first shared our survey with experts in survey development and mentoring, gathered their feedback, and incorporated their suggestions into our instrument. We then had a subsection of our mentors complete the survey and then complete it again three to four days later (test-retest). Our survey has a high correlation for the test-retest quantitative items (0.93) and a high correlation (0.90) between the three reviewers of the qualitative items. From our findings, we feel we have a validated instrument (face, content, and construct validity) that answers our research questions reliably. Our contribution to the study of mentoring of science teachers reveals a broad range of impacts on the mentors themselves, including an improved understanding of the challenges of classroom teaching, a recognition of the importance of scientists working with science teachers, an enhanced ability to communicate their research and findings, and an increased interest and excitement for their own work.

  17. Merit Principles Survey 2016 Data

    Data.gov (United States)

    Merit Systems Protection Board — MPS contains a combination of core items that MSPB tracks over time and special-purpose items developed to support a particular special study. This survey differs...

  18. 2012 Workplace and Gender Relations Survey of Reserve Component Members (Survey Note No. 2013-002)

    Science.gov (United States)

    2013-01-18

    items regarding unwanted attempts to establish a sexual relationship – Sexual Coercion – four items regarding classic quid pro quo instances of...Department of Defense (DoD) continues to emphasize sexual assault and sexual harassment response and prevention in the Reserve components. This survey...survey assesses the prevalence of sexual assault and sexual harassment and other gender-related issues in the National Guard and Reserves. This

  19. 2012 Workplace and Gender Relations Survey of Active Duty Members (Survey Note No. 2013-002)

    Science.gov (United States)

    2013-01-18

    Attention – four items regarding unwanted attempts to establish a sexual relationship – Sexual Coercion – four items regarding classic quid pro quo ...of Defense (DoD) continues to emphasize sexual assault and sexual harassment response and prevention in the military. This survey note discusses...assault and sexual harassment in the active duty force. This survey note and accompanying briefing (Appendix) provide information on the prevalence

  20. Utilising a multi-item questionnaire to assess household food security in Australia.

    Science.gov (United States)

    Butcher, Lucy M; O'Sullivan, Therese A; Ryan, Maria M; Lo, Johnny; Devine, Amanda

    2018-03-15

    Currently, two food sufficiency questions are utilised as a proxy measure of national food security status in Australia. These questions do not capture all dimensions of food security and have been attributed to underreporting of the problem. The purpose of this study was to investigate food security using the short form of the US Household Food Security Survey Module (HFSSM) within an Australian context; and explore the relationship between food security status and multiple socio-demographic variables. Two online surveys were completed by 2334 Australian participants from November 2014 to February 2015. Surveys contained the short form of the HFSSM and twelve socio-demographic questions. Cross-tabulations chi-square tests and a multinomial logistic regression model were employed to analyse the survey data. Food security status of the respondents was classified accordingly: High or Marginal (64%, n = 1495), Low (20%, n = 460) or Very Low (16%, n = 379). Significant independent predictors of food security were age (P important issue across Australia and that certain groups, regardless of income, are particularly vulnerable. Government policy and health promotion interventions that specifically target "at risk" groups may assist to more effectively address the problem. Additionally, the use of a multi-item measure is worth considering as a national indicator of food security in Australia. © 2018 Australian Health Promotion Association.

  1. Development and evaluation of a new survey instrument to measure the quality of colorectal cancer screening decisions.

    Science.gov (United States)

    Sepucha, Karen R; Feibelmann, Sandra; Cosenza, Carol; Levin, Carrie A; Pignone, Michael

    2014-08-20

    Guidelines for colorectal cancer screening recommend that patients be informed about options and be able to select preferred method of screening; however, there are no existing measures available to assess whether this happens. Colorectal Cancer Screening Decision Quality Instrument (CRC-DQI) includes knowledge items and patients' goals and concerns. Items were generated through literature review and qualitative work with patients and providers. Hypotheses relating to the acceptability, feasibility, discriminant validity and retest reliability of the survey were examined using data from three studies: (1) 2X2 randomized study of participants recruited online, (2) cross-sectional sample of patients recruited in community health clinics, and (3) cross-sectional sample of providers recruited from American Medical Association Master file. 338 participants were recruited online, 94 participants were recruited from community health centers, and 115 physicians were recruited. The CRC-DQI was feasible and acceptable with low missing data and high response rates for both online and paper-based administrations. The knowledge score was able to discriminate between those who had seen a decision aid or not (84% vs. 64%, p quality decisions about colon cancer screening.

  2. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  3. Multiple sensitive estimation and optimal sample size allocation in the item sum technique.

    Science.gov (United States)

    Perri, Pier Francesco; Rueda García, María Del Mar; Cobo Rodríguez, Beatriz

    2018-01-01

    For surveys of sensitive issues in life sciences, statistical procedures can be used to reduce nonresponse and social desirability response bias. Both of these phenomena provoke nonsampling errors that are difficult to deal with and can seriously flaw the validity of the analyses. The item sum technique (IST) is a very recent indirect questioning method derived from the item count technique that seeks to procure more reliable responses on quantitative items than direct questioning while preserving respondents' anonymity. This article addresses two important questions concerning the IST: (i) its implementation when two or more sensitive variables are investigated and efficient estimates of their unknown population means are required; (ii) the determination of the optimal sample size to achieve minimum variance estimates. These aspects are of great relevance for survey practitioners engaged in sensitive research and, to the best of our knowledge, were not studied so far. In this article, theoretical results for multiple estimation and optimal allocation are obtained under a generic sampling design and then particularized to simple random sampling and stratified sampling designs. Theoretical considerations are integrated with a number of simulation studies based on data from two real surveys and conducted to ascertain the efficiency gain derived from optimal allocation in different situations. One of the surveys concerns cannabis consumption among university students. Our findings highlight some methodological advances that can be obtained in life sciences IST surveys when optimal allocation is achieved. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
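
    A minimal sketch of the basic single-variable item sum estimator follows. In the usual design one sample reports the total of a sensitive quantity plus innocuous ones (long list) and a second sample reports the total of the innocuous ones only (short list). The allocation rule is the textbook result for a difference of two independent means under simple random sampling without finite population correction, not the multi-variable result derived in the article. All data and function names are hypothetical.

```python
from statistics import mean

def ist_estimate(long_list_totals, short_list_totals):
    """Difference-in-means estimate of the sensitive variable's mean."""
    return mean(long_list_totals) - mean(short_list_totals)

def allocate(n_total, sd_long, sd_short):
    """Split n_total so that n_long / n_short is proportional to sd_long / sd_short."""
    n_long = round(n_total * sd_long / (sd_long + sd_short))
    return n_long, n_total - n_long

# Hypothetical reported totals (sensitive + innocuous vs. innocuous only).
long_totals = [12, 9, 15, 11, 13]
short_totals = [8, 7, 10, 9, 6]
estimate = ist_estimate(long_totals, short_totals)

# Hypothetical standard deviations of the two kinds of reported totals.
n_long, n_short = allocate(300, sd_long=6.0, sd_short=4.0)
```

    No respondent reveals the sensitive value directly; it is recovered only at the level of the two sample means.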

  4. Measurement and control of bias in patient reported outcomes using multidimensional item response theory.

    Science.gov (United States)

    Dowling, N Maritza; Bolt, Daniel M; Deng, Sien; Li, Chenxi

    2016-05-26

    Patient-reported outcome (PRO) measures play a key role in the advancement of patient-centered care research. The accuracy of inferences, relevance of predictions, and the true nature of the associations made with PRO data depend on the validity of these measures. Errors inherent to self-report measures can seriously bias the estimation of constructs assessed by the scale. A well-documented disadvantage of self-report measures is their sensitivity to response style (RS) effects such as the respondent's tendency to select the extremes of a rating scale. Although the biasing effect of extreme responding on constructs measured by self-reported tools has been widely acknowledged and studied across disciplines, little attention has been given to the development and systematic application of methodologies to assess and control for this effect in PRO measures. We review the methodological approaches that have been proposed to study extreme RS effects (ERS). We applied a multidimensional item response theory model to simultaneously estimate and correct for the impact of ERS on trait estimation in a PRO instrument. Model estimates were used to study the biasing effects of ERS on sum scores for individuals with the same amount of the targeted trait but different levels of ERS. We evaluated the effect of joint estimation of multiple scales and ERS on trait estimates and demonstrated the biasing effects of ERS on these trait estimates when used as explanatory variables. A four-dimensional model accounting for ERS bias provided a better fit to the response data. Increasing levels of ERS showed bias in total scores as a function of trait estimates. The effect of ERS was greater when the pattern of extreme responding was the same across multiple scales modeled jointly. The estimated item category intercepts provided evidence of content independent category selection. Uncorrected trait estimates used as explanatory variables in prediction models showed downward bias. A

  5. Validation of the alcohol use item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS).

    Science.gov (United States)

    Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Daley, Dennis C

    2016-04-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) includes five item banks for alcohol use. There are limited data, however, regarding their validity (e.g., convergent validity, responsiveness to change). To provide such data, we conducted a prospective study with 225 outpatients being treated for substance abuse. Assessments were completed shortly after intake and at 1-month and 3-month follow-ups. The alcohol item banks were administered as computerized adaptive tests (CATs). Fourteen CATs and one six-item short form were also administered from eight other PROMIS domains to generate a comprehensive health status profile. After modeling treatment outcome for the sample as a whole, correlates of outcome from the PROMIS health status profile were examined. For convergent validity, the largest correlation emerged between the PROMIS alcohol use score and the Alcohol Use Disorders Identification Test (r=.79 at intake). Regarding treatment outcome, there were modest changes across the target problem of alcohol use and other domains of the PROMIS health status profile. However, significant heterogeneity was found in initial severity of drinking and in rates of change for both abstinence and severity of drinking during follow-up. This heterogeneity was associated with demographic (e.g., gender) and health-profile (e.g., emotional support, social participation) variables. The results demonstrated the validity of PROMIS CATs, which require only 4-6 items in each domain. This efficiency makes it feasible to use a comprehensive health status profile within the substance use treatment setting, providing important prognostic information regarding abstinence and severity of drinking. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Clinically important deterioration in patients undergoing lumbar spine surgery: a choice of evaluation methods using the Oswestry Disability Index, 36-Item Short Form Health Survey, and pain scales: clinical article.

    Science.gov (United States)

    Gum, Jeffrey L; Glassman, Steven D; Carreon, Leah Y

    2013-11-01

    Health-related quality of life (HRQOL) measures have become the mainstay for outcome appraisal in spine surgery. Clinically meaningful interpretation of HRQOL improvement has centered on the minimum clinically important difference (MCID). The purpose of this study was to calculate clinically important deterioration (CIDET) thresholds and determine a CIDET value for each HRQOL measure for patients undergoing lumbar fusion. Seven hundred twenty-two patients (248 males, 127 smokers, mean age 60.8 years) were identified with complete preoperative and 1-year postoperative HRQOLs including the Oswestry Disability Index (ODI), 36-Item Short Form Health Survey (SF-36), and numeric rating scales (0-10) for back and leg pain following primary, instrumented, posterior lumbar fusion. Anchor-based and distribution-based methods were used to calculate CIDET for each HRQOL. Anchor-based methods included change score, change difference, and receiver operating characteristic curve analysis. The Health Transition Item, an independent item of the SF-36, was used as the external anchor. Patients who responded "somewhat worse" and "much worse" were combined and compared with patients responding "about the same." Distribution-based methods were minimum detectable change and effect size. Diagnoses included spondylolisthesis (n = 332), scoliosis (n = 54), instability (n = 37), disc pathology (n = 146), and stenosis (n = 153). There was a statistically significant change (p < 0.0001) for each HRQOL measure from preoperatively to 1-year postoperatively. Only 107 patients (15%) reported being "somewhat worse" (n = 81) or "much worse" (n = 26). Calculation methods yielded a range of CIDET values for ODI (0.17-9.06), SF-36 physical component summary (-0.32 to 4.43), back pain (0.02-1.50), and leg pain (0.02-1.50). A threshold for clinical deterioration was difficult to identify. This may be due to the small number of patients reporting being worse after surgery and the variability across
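
    The two distribution-based quantities named in this abstract are often computed with the textbook formulas sketched below. The abstract does not say which variants the authors used, so the formulas, the 95% confidence multiplier, and the input values are assumptions for illustration only.

```python
import math

def standard_error_of_measurement(sd_baseline, reliability):
    """SEM = SD * sqrt(1 - reliability), a common textbook definition."""
    return sd_baseline * math.sqrt(1.0 - reliability)

def minimum_detectable_change(sd_baseline, reliability, z=1.96):
    """Smallest change exceeding measurement error at the chosen confidence level."""
    sem = standard_error_of_measurement(sd_baseline, reliability)
    return z * math.sqrt(2.0) * sem

def effect_size(mean_change, sd_baseline):
    """Mean change expressed in units of the baseline standard deviation."""
    return mean_change / sd_baseline

# Hypothetical inputs (not values from the study).
mdc = minimum_detectable_change(sd_baseline=15.0, reliability=0.90)
es = effect_size(mean_change=-3.0, sd_baseline=15.0)
```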

  7. A critical review of survey-based research in supply chain integration

    NARCIS (Netherlands)

    van der Vaart, Taco; van Donk, Dirk Pieter

    Supply chain (SC) integration is considered one of the major factors in improving performance. Based upon some concerns regarding the constructs, measurements and items used, this paper analyses survey-based research with respect to the relationship between SC integration and performance. The review

  8. The Blood Donor Anxiety Scale: a six-item state anxiety measure based on the Spielberger State-Trait Anxiety Inventory.

    Science.gov (United States)

    Chell, Kathleen; Waller, Daniel; Masser, Barbara

    2016-06-01

    Research demonstrates that anxiety elevates the risk of blood donors experiencing adverse events, which in turn deters the performance of repeat blood donations. Identifying donors suffering from heightened state anxiety is important to assess the impact of evidence-based interventions. This study analyzed the appropriateness of a shortened version of the state subscale of the State-Trait Anxiety Inventory (STAI) in a blood donation context. STAI-State questionnaire data were collected from two separate samples of Australian blood donors (n = 919 and n = 824 after cleaning). Responses to demographic, donation history, and adverse reaction questions were also obtained. Identification of items and analysis was performed systematically to assess and compare internal reliability and content, construct, convergent, and criterion validity of three potential short-form state anxiety scales. Of the three short-form scales tested, STAI-State six-item scale demonstrated the best metric properties with the least number of items across both sample groups. Cronbach's alpha was acceptable (α = 0.844 and α = 0.820), correlated positively with the original measure (r = 0.927 and r = 0.931) and criterion-related variables, and maintained the two-dimension factorial structure of the original measure. The six-item short version of the STAI-State subscale presented the most reliable and valid scale for use with blood donors. A validated donor anxiety tool provides a standardized assessment and record of donor anxiety to gauge the effectiveness of ongoing efforts to enhance the donation experience. © 2016 AABB.
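
    Cronbach's alpha, the internal reliability statistic reported in this abstract, can be computed as in the sketch below. The response matrix is hypothetical (five respondents, six items scored 1-4) and is not data from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses to a six-item scale (not data from the study).
responses = [
    [1, 2, 1, 2, 1, 1],
    [2, 2, 3, 2, 2, 3],
    [3, 3, 3, 4, 3, 3],
    [4, 3, 4, 4, 4, 4],
    [2, 1, 2, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
```

    Alpha rises as the items covary more strongly relative to their individual variances, which is why a short form can keep acceptable reliability if the retained items are highly intercorrelated.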

  9. Item bias in self-reported functional ability among 75-year-old men and women in three Nordic localities

    DEFF Research Database (Denmark)

    Avlund, K; Era, P; Davidsen, M

    1996-01-01

    The purpose of this article is to analyse item bias in a measure of self-reported functional ability among 75-year-old people in three Nordic localities. The present item bias analysis examines whether the construction of a functional ability index from several variables results in bias in relation to geographical locality and gender. Information about self-reported functional ability was gathered from surveys on 75-year-old men and women in Glostrup (Denmark), Göteborg (Sweden) and Jyväskylä (Finland). The data were collected by structured home interviews about mobility and physical activities of daily living (PADL) in relation to tiredness, reduced speed and dependency and combined into three tiredness-scales, three reduced speed-scales and two dependency-scales. The analysis revealed item bias regarding geographical locality in seven out of eight of the functional ability scales, but nearly no bias...

  10. Polytomous latent scales for the investigation of the ordering of items

    NARCIS (Netherlands)

    Ligtvoet, R.; van der Ark, L.A.; Bergsma, W. P.; Sijtsma, K.

    2011-01-01

    We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering

  11. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning.

    Science.gov (United States)

    Watt, Torquil; Groenvold, Mogens; Hegedüs, Laszlo; Bonnema, Steen Joop; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

    2014-02-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis. A total of 838 patients with benign thyroid diseases completed the ThyPRO questionnaire (84 five-point items, 13 scales). Uniform and nonuniform DIF were investigated using ordinal logistic regression, testing for both statistical significance and magnitude (ΔR² > 0.02). Scale level was estimated by the sum score, after purification. Twenty instances of DIF in 17 of the 84 items were found. Eight according to diagnosis, where the goiter scale was the one most affected, possibly due to differing perceptions in patients with auto-immune thyroid diseases compared to patients with simple goiter. Eight DIFs according to age were found, of which 5 were in positively worded items, which younger patients were more likely to endorse; one according to gender: women were more likely to report crying, and three according to educational level. The vast majority of DIF had only minor influence on the scale scores (0.1-2.3 points on the 0-100 scales), but two DIF corresponded to a difference of 4.6 and 9.8, respectively. Ordinal logistic regression identified DIF in 17 of 84 items. The potential impact of this on the present scales was low, but items displaying DIF could be avoided when developing abbreviated scales, where the potential impact of DIF (due to fewer items) will be larger.

  12. 10 CFR 74.55 - Item monitoring.

    Science.gov (United States)

    2010-01-01

    ... Quantities of Strategic Special Nuclear Material § 74.55 Item monitoring. (a) Licensees subject to § 74.51... quantitatively measured, the validity of that measurement independently confirmed, and that additionally have..., except for reactor components measuring at least one meter in length and weighing in excess of 30...

  13. Eating Well While Dining Out: Collaborating with Local Restaurants to Promote Heart Healthy Menu Items

    Science.gov (United States)

    Thayer, Linden M.; Pimentel, Daniela C.; Smith, Janice C.; Garcia, Beverly A.; Lee Sylvester, Laura; Kelly, Tammy; Johnston, Larry F.; Ammerman, Alice S.; Keyserling, Thomas C.

    2017-01-01

    Background: As Americans commonly consume restaurant foods with poor dietary quality, effective interventions are needed to improve food choices at restaurants. Purpose: To design and evaluate a restaurant-based intervention to help customers select and restaurants promote heart healthy menu items with healthful fats and high quality carbohydrates. Methods: The intervention included table tents outlining 10 heart healthy eating tips, coupons promoting healthy menu items, an information brochure, and a link to the study website. Pre- and post-intervention surveys were completed by restaurant managers, and customers completed a brief “intercept” survey. Results: Managers (n = 10) reported the table tents and coupons were well received, and several noted improved personal nutrition knowledge. Overall, 4214 coupons were distributed with 1244 (30%) redeemed. Of 300 customers surveyed, 126 (42%) noticed the table tents and of these, 115 (91%) considered the nutrition information helpful, 42 (33%) indicated the information influenced menu items purchased, and 91 (72%) reported the information will influence what they order in the future. Discussion: The intervention was well-received by restaurant managers and positively influenced menu item selection by many customers. Translation to Health Education Practice: Further research is needed to assess effective strategies for scaling up and sustaining this intervention approach. PMID:28947925

  14. Lateral Violence in Nursing Survey: Instrument Development and Validation

    Directory of Open Access Journals (Sweden)

    Lynne S. Nemeth

    2017-07-01

    An examination of the psychometric properties of the Lateral Violence in Nursing Survey (LVNS), an instrument previously developed to measure the perceived incidence and severity of lateral violence (LV) in the nursing workplace, was carried out. Conceptual clustering and principal components analysis were used with survey responses from 663 registered nurses and ancillary nursing staff in a southeastern tertiary care medical center. Where appropriate, Cronbach’s alpha (α) evaluated internal consistency. The prevalence/severity of lateral violence items constitute two distinct subscales (LV by self and others) with Cronbach’s alpha of 0.74 and 0.86, respectively. The items asking about potential causes of LV are unidimensional and internally consistent (alpha = 0.77) but there is no conceptually coherent theme underlying the various causes. Respondents rating a potential LV cause as “major” scored higher on both prevalence/severity subscales than those rating it a “minor” cause or not a cause. Subsets of items on the LVNS are internally reliable, supporting construct validity. Revisions of the original LVNS instrument will improve its use in future work.

  15. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    Science.gov (United States)

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  16. Developing and testing items for the South African Personality Inventory (SAPI)

    Directory of Open Access Journals (Sweden)

    Carin Hill

    2013-11-01

    Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for the study: The study intended to develop an indigenous and psychometrically sound personality instrument that adheres to the requirements of South African legislation and excludes cultural bias. Research design, approach and method: The authors used a cross-sectional design. They measured the nine SAPI clusters identified in the qualitative stage of the SAPI project in 11 separate quantitative studies. Convenience sampling yielded 6735 participants. Statistical analysis focused on the construct validity and reliability of items. The authors eliminated items that showed poor performance, based on common psychometric criteria, and selected the best performing items to form part of the final version of the SAPI. Main findings: The authors developed 2573 items from the nine SAPI clusters. Of these, 2268 items were valid and reliable representations of the SAPI facets. Practical/managerial implications: The authors developed a large item pool. It measures personality in South Africa. Researchers can refine it for the SAPI. Furthermore, the project illustrates an approach that researchers can use in projects that aim to develop culturally-informed psychological measures. Contribution/value-add: Personality assessment is important for recruiting, selecting and developing employees. This study contributes to the current knowledge about the early processes researchers follow when they develop a personality instrument that measures personality fairly in different cultural groups, as the SAPI does.

  17. Mathematical-programming approaches to test item pool design

    NARCIS (Netherlands)

    Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

    2002-01-01

    This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming

  18. Determining the Measurement Quality of a Montessori High School Teacher Evaluation Survey

    Directory of Open Access Journals (Sweden)

    Anthony Philip Setari

    2017-05-01

    The purpose of this study was to conduct a psychometric validation of a course evaluation instrument, known as a student evaluation of teaching (SET), implemented in a Montessori high school. The authors demonstrate to the Montessori community how to rigorously examine the measurement and assessment quality of instruments used within Montessori schools. The Montessori high school community needs an SET that has been rigorously examined for measurement issues. The examined SET was developed by a Montessori high school, and the sample data were collected from Montessori high school students. Using a Rasch partial credit model, the results of the analysis identified several measurement issues, including multidimensionality, misfit items, and inappropriate item difficulty levels. A revised version of the SET underwent the same analysis procedure, and the results indicated that measurement issues persisted. The authors suggest several ways to improve the overall measurement quality of the instrument while keeping the Montessori foundation. Additional validation studies with a revised version of the SET will be needed before the instrument can be endorsed for full implementation in a Montessori setting.

  19. Cosmological measurements with forthcoming radio continuum surveys

    CSIR Research Space (South Africa)

    Raccanelli, A

    2012-08-01

    ..., while the best measurements of dark energy models will come from galaxy autocorrelation function analyses. Using a combination of the Evolutionary Map of the Universe (EMU) and WODAN to provide a full-sky survey, it will be possible to measure the dark...

  20. The Body Appreciation Scale-2: item refinement and psychometric evaluation.

    Science.gov (United States)

    Tylka, Tracy L; Wood-Barcalow, Nichole L

    2015-01-01

    Considered a positive body image measure, the 13-item Body Appreciation Scale (BAS; Avalos, Tylka, & Wood-Barcalow, 2005) assesses individuals' acceptance of, favorable opinions toward, and respect for their bodies. While the BAS has accrued psychometric support, we improved it by rewording certain BAS items (to eliminate sex-specific versions and body dissatisfaction-based language) and developing additional items based on positive body image research. In three studies, we examined the reworded, newly developed, and retained items to determine their psychometric properties among college and online community (Amazon Mechanical Turk) samples of 820 women and 767 men. After exploratory factor analysis, we retained 10 items (five original BAS items). Confirmatory factor analysis upheld the BAS-2's unidimensionality and invariance across sex and sample type. Its internal consistency, test-retest reliability, and construct (convergent, incremental, and discriminant) validity were supported. The BAS-2 is a psychometrically sound positive body image measure applicable for research and clinical settings. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Health Information National Trends Survey in American Sign Language (HINTS-ASL): Protocol for the Cultural Adaptation and Linguistic Validation of a National Survey.

    Science.gov (United States)

    Kushalnagar, Poorna; Harris, Raychelle; Paludneviciene, Raylene; Hoglind, TraciAnn

    2017-09-13

    The Health Information National Trends Survey (HINTS) collects nationally representative data about the American public's use of health-related information. This survey is available in English and Spanish, but not in American Sign Language (ASL). Thus, the exclusion of ASL users from these national health information survey studies has led to a significant gap in knowledge of Internet usage for health information access in this underserved and understudied population. The objectives of this study are (1) to culturally adapt and linguistically translate the HINTS items to ASL (HINTS-ASL); and (2) to gather information about deaf people's health information seeking behaviors across technology-mediated platforms. We modified the standard procedures developed at the US National Center for Health Statistics Cognitive Survey Laboratory to culturally adapt and translate HINTS items to ASL. Cognitive interviews were conducted to assess clarity and delivery of these HINTS-ASL items. Final ASL video items were uploaded to a protected online survey website. The HINTS-ASL online survey has been administered to over 1350 deaf adults (ages 18 to 90 and up) who use ASL. Data collection is ongoing and includes deaf adult signers across the United States. Some items from the HINTS item bank required cultural adaptation for use with deaf people who use accessible services or technology. A separate item bank for deaf-related experiences was created, reflecting deaf-specific technology such as sharing health-related ASL videos through social network sites and using video remote interpreting services in health settings. After data collection is complete, we will conduct a series of analyses on deaf people's health information seeking behaviors across technology-mediated platforms. HINTS-ASL is an accessible health information national trends survey, which includes a culturally appropriate set of items that are relevant to the experiences of deaf people who use ASL. The final HINTS

  2. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  3. Development and implementation of a local government survey to measure community supports for healthy eating and active living

    Directory of Open Access Journals (Sweden)

    Latetia V Moore

    2017-06-01

    The ability to make healthy choices is influenced by where one lives, works, shops, and plays. Locally enacted policies and standards can influence these surroundings but little is known about the prevalence of such policies and standards that support healthier behaviors. In this paper, we describe the development of a survey questionnaire designed to capture local level policy supports for healthy eating and active living and findings and lessons learned from a 2012 pilot in two states, Minnesota and California, including respondent burden, survey sampling and administration methods, and survey item feasibility issues. A 38-item, web-based, self-administered survey and sampling frame were developed to assess the prevalence of 22 types of healthy eating and active living policies in a representative sample of local governments in the two states. The majority of respondents indicated the survey required minimal effort to complete with half taking <20 min to complete the survey. A non-response follow-up plan including emails and phone calls was required to achieve a 68% response rate (versus a 37% response rate for email only reminders). Local governments with larger residential populations reported having healthy eating and active living policies and standards more often than smaller governments. Policies that support active living were more common than those that support healthy eating and varied within the two states. The methods we developed are a feasible data collection tool for estimating the prevalence of municipal healthy eating and active living policies and standards at the state and national level.

  4. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  5. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Science.gov (United States)

    Lix, Lisa M; Wu, Xiuyun; Hopman, Wilma; Mayo, Nancy; Sajobi, Tolulope T; Liu, Juxin; Prior, Jerilynn C; Papaioannou, Alexandra; Josse, Robert G; Towheed, Tanveer E; Davison, K Shawn; Sawatzky, Richard

    2016-01-01

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  6. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Directory of Open Access Journals (Sweden)

    Lisa M Lix

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  7. Development of a survey tool to assess and monitor the influence of food budget restraint on healthy eating, food related climate impact and quality of life

    DEFF Research Database (Denmark)

    Nielsen, Annemette Ljungdalh; Holm, Lotte; Lund, Thomas Bøker

    This documentation describes the development of a survey tool designed to: 1) measure how different levels of constraints on food budgets are associated with outcomes of healthy eating, environmental sustainability and life quality for individuals in Denmark, and 2) explore how these different outcomes are related to strategies people employ to cope with restricted food budgets. The resulting survey consists of a total of 63 question items. The paper lays out the various steps involved in the process of developing the survey tool, presents the final survey items included in the tool...

  8. Development and validation of a new survey: Perceptions of Teaching as a Profession (PTaP)

    Science.gov (United States)

    Adams, Wendy

    2017-01-01

    To better understand the impact of efforts to train more science teachers such as the PhysTEC Project and to help with early identification of future teachers, we are developing the survey of Perceptions of Teaching as a Profession (PTaP) to measure students' views of teaching as a career, their interest in teaching and the perceived climate of physics departments towards teaching as a profession. The instrument consists of a series of statements which require a response using a 5-point Likert-scale and can be easily administered online. The survey items were drafted by a team of researchers and physics teacher candidates and then reviewed by an advisory committee of 20 physics teacher educators and practicing teachers. We conducted 27 interviews with both teacher candidates and non-teaching STEM majors. The survey was refined through an iterative process of student interviews and item clarification until all items were interpreted consistently and answered for consistent reasons. In this presentation the preliminary results from the student interviews as well as the results of item analysis and a factor analysis on 900 student responses will be shared.

  9. The Servant Leadership Survey: Development and Validation of a Multidimensional Measure

    NARCIS (Netherlands)

    D. van Dierendonck (Dirk); I.A.P.M. Nuijten (Inge)

    2011-01-01

    Purpose: The purpose of this paper is to describe the development and validation of a multi-dimensional instrument to measure servant leadership. Design/Methodology/Approach: Based on an extensive literature review and expert judgment, 99 items were formulated. In three steps, using

  10. Measurement equivalence of the KINDL questionnaire across child self-reports and parent proxy-reports: a comparison between item response theory and ordinal logistic regression.

    Science.gov (United States)

    Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara

    2014-06-01

    Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50%. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
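
    The 50% agreement rate above is the share of the 24 items on which the two methods reach the same verdict: (7 flagged by both + 5 flagged by neither) / 24. The short sketch below reproduces that arithmetic. The item numbers are hypothetical and chosen only so that the counts match those reported (12 GRM flags, 14 OLR flags, 7 shared); the abstract does not say which items were flagged.

```python
def dif_agreement(flagged_a, flagged_b, n_items):
    """Return (flagged by both, flagged by neither, agreement rate) for two DIF methods."""
    all_items = set(range(1, n_items + 1))
    both = flagged_a & flagged_b
    neither = all_items - (flagged_a | flagged_b)
    rate = (len(both) + len(neither)) / n_items
    return len(both), len(neither), rate


# Hypothetical item numbers, picked only to match the reported counts.
grm_flagged = set(range(1, 13))   # items 1-12: 12 items flagged by the GRM
olr_flagged = set(range(6, 20))   # items 6-19: 14 items flagged by the OLR

both, neither, rate = dif_agreement(grm_flagged, olr_flagged, 24)
print(both, neither, rate)  # 7 5 0.5
```

    The counts are internally consistent: 12 + 14 - 7 = 19 items flagged by at least one method, which leaves 24 - 19 = 5 flagged by neither.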

  11. Nurse Religiosity and Spiritual Care: An Online Survey.

    Science.gov (United States)

    Taylor, Elizabeth Johnston; Gober-Park, Carla; Schoonover-Shoffner, Kathy; Mamier, Iris; Somaiya, Chintan K; Bahjri, Khaled

    2017-08-01

    This study measured the frequency of nurse-provided spiritual care and how it is associated with various facets of nurse religiosity. Data were collected using an online survey accessed from the home page of the Journal of Christian Nursing. The survey included the Nurse Spiritual Care Therapeutics Scale, six scales quantifying facets of religiosity, and demographic and work-related items. Respondents (N = 358) indicated high religiosity yet reported neutral responses to items about sharing personal beliefs and tentativeness of belief. Findings suggested spiritual care was infrequent. Multivariate analysis showed prayer frequency, employer support of spiritual care, and non-White ethnicity were significantly associated with spiritual care frequency (adjusted R² = .10). Results not only provide an indication of spiritual care frequency but empirical encouragement for nurse managers to provide a supportive environment for spiritual care. Findings expose the reality that nurse religiosity is directly related, albeit weakly, to spiritual care frequency.

  12. Hospital safety climate surveys: measurement issues.

    Science.gov (United States)

    Jackson, Jeanette; Sarac, Cakil; Flin, Rhona

    2010-12-01

    Organizational safety culture relates to behavioural norms in the workplace and is usually assessed by safety climate surveys. These can be a diagnostic indicator on the state of safety in a hospital. This review examines recent studies using staff surveys of hospital safety climate, focussing on measurement issues. Four questionnaires (hospital survey on patient safety culture, safety attitudes questionnaire, patient safety climate in healthcare organizations, hospital safety climate scale), with acceptable psychometric properties, are now applied across countries and clinical settings. Comparisons for benchmarking must be made with caution in case of questionnaire modifications. Increasing attention is being paid to the unit and hospital level wherein distinct cultures may be located, as well as to associated measurement and study design issues. Predictive validity of safety climate is tested against safety behaviours/outcomes, with some relationships reported, although effects may be specific to professional groups/units. Few studies test the role of intervening variables that could influence the effect of climate on outcomes. Hospital climate studies are becoming a key component of healthcare safety management systems. Large datasets have established more reliable instruments that allow a more focussed investigation of the role of culture in the improvement and maintenance of staff's safety perceptions within units, as well as within hospitals.

  13. Methods for Assessing Item, Step, and Threshold Invariance in Polytomous Items Following the Partial Credit Model

    Science.gov (United States)

    Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W.

    2008-01-01

    Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…
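
    As a rough illustration of the model this abstract refers to, the sketch below computes category probabilities for one polytomous item under the standard partial credit model, in which the step parameters govern the transitions between adjacent score categories. The function name `pcm_probs` and the step values are hypothetical and are not taken from the article.

```python
import math

def pcm_probs(theta, deltas):
    """Category probabilities 0..m for one item under the partial credit model.

    theta  : person location on the latent scale
    deltas : step parameters delta_1..delta_m (hypothetical values in this sketch)
    """
    # Cumulative sums of (theta - delta_j); the empty sum for category 0 is 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (theta - d))
    top = max(cum)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - top) for v in cum]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-category item with steps at -1.0 and 0.5
print(pcm_probs(0.0, [-1.0, 0.5]))
```

    When theta equals a step parameter, the two adjacent categories are equally likely, which is one way to read what a step location means.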

  14. An Investigation of Item Type in a Standards-Based Assessment.

    Directory of Open Access Journals (Sweden)

    Liz Hollingworth

    2007-12-01

    Full Text Available Large-scale state assessment programs use both multiple-choice and open-ended items on tests for accountability purposes. Certainly, there is an intuitive belief among some educators and policy makers that open-ended items measure something different than multiple-choice items. This study examined two item formats in custom-built, standards-based tests of achievement in Reading and Mathematics at grades 3-8. In this paper, we raise questions about the value of including open-ended items, given scoring costs, time constraints, and the higher probability of missing data from test-takers.

  15. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    Science.gov (United States)

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.

  16. Item-Level Psychometrics of the Glasgow Outcome Scale: Extended Structured Interviews.

    Science.gov (United States)

    Hong, Ickpyo; Li, Chih-Ying; Velozo, Craig A

    2016-04-01

    The Glasgow Outcome Scale-Extended (GOSE) structured interview captures critical components of activities and participation, including home, shopping, work, leisure, and family/friend relationships. Eighty-nine community-dwelling adults with mild-moderate traumatic brain injury (TBI) were recruited (average = 2.7 years post injury). Nine of the 19 items were used for the psychometric analysis. Factor analysis and item-level psychometrics were investigated using the Rasch partial-credit model. Although the principal components analysis of residuals suggests that a single measurement factor dominates the measure, the instrument did not meet the factor analysis criteria. Five items met the rating scale criteria. Eight items fit the Rasch model. The instrument demonstrated low person reliability (0.63), low person strata (2.07), and a slight ceiling effect. The GOSE demonstrated limitations in precisely measuring activities/participation for individuals after TBI. Future studies should examine the impact of the low precision of the GOSE on effect size. © The Author(s) 2016.

  17. Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults

    Directory of Open Access Journals (Sweden)

    García-Campayo Javier

    2011-08-01

    Full Text Available Abstract Background The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Findings Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. Conclusions The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.
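
    The internal-consistency coefficient reported above can be computed from a respondents-by-items score table. The sketch below uses the usual Cronbach's α formula with sample variances; the response data are made up for illustration and have no connection to the study sample.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents, 4 items scored 0-4
responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 4],
    [2, 2, 2, 3],
    [0, 1, 1, 0],
]
print(round(cronbach_alpha(responses), 3))
```

    With perfectly parallel items (every item giving the same score for a given respondent) the coefficient reaches its maximum of 1.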

  18. The basics of item response theory using R

    CERN Document Server

    Baker, Frank B

    2017-01-01

    This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...

  19. A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents.

    Science.gov (United States)

    Erhart, M; Hagquist, C; Auquier, P; Rajmil, L; Power, M; Ravens-Sieberer, U

    2010-07-01

    This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.

  20. Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

    Science.gov (United States)

    Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

    2018-02-02

    In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.

  1. Comparison of quality of life measures in a depressed population.

    Science.gov (United States)

    Wisniewski, Stephen R; Rush, A John; Bryan, Charlene; Shelton, Richard; Trivedi, Madhukar H; Marcus, Sheila; Husain, Mustafa M; Hollon, Steven D; Fava, Maurizio

    2007-03-01

    Measures of quality of life have been increasingly used in clinical trials. When designing a study, researchers must decide which quality of life measure to use. Some literature provides guidance through general recommendations, though lacks quantitative comparisons. In this report, 2 general quality of life measures, the 12-Item Short Form Health Survey (SF-12) and the Quality of Life Enjoyment and Satisfaction Questionnaire (Q-LES-Q), are compared in a depressed population. STAR*D data were used to analyze the associations among the SF-12 and the Q-LES-Q. Each measure covers 6 domains, overlapping on 5 (health, self-esteem/well-being, community/productivity, social/love relationships, leisure/creativity), with the SF-12 addressing family and the Q-LES-Q addressing living situations. Strong item-by-item associations exist only between the Q-LES-Q and the SF-12 physical health items. The 2 measures overlap on the domains covered while the lack of correlation between the 2 measures may be attributed to the perspective of each question as the Q-LES-Q measures satisfaction while the SF-12 measures the patient's perception of function.

  2. Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. Common person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.

  3. The anticipated costs analysis and benefit items survey against performing the maintenance rule

    International Nuclear Information System (INIS)

    Hwang, M. J.; Kim, K. Y.; Yang, Z. A.

    2002-01-01

    In this paper, we surveyed the cost and benefit items and evaluated the costs of performing the Maintenance Rule (MR). In the past, a single electric power company supplied electricity in Korea without free competition. Recently, however, that company was divided into two parts by generation source: nuclear and hydroelectric generation, and thermal generation. Generation sources that are not price-competitive will therefore be weeded out of the electric power market. Although safe operation of nuclear power plants (NPPs) is the primary goal, a licensee may hesitate to adopt a safety-related program, even a good one, if maintaining or improving NPP safety requires too much money. Because Risk-Informed Applications (RIA) have recently come into use in plant operation, plant conditions may change. Considering the effects of RIA, a method has therefore been proposed to preserve capability by monitoring maintenance effectiveness. Performing this monitoring, however, requires a number of preceding tasks: continuously collecting data, monitoring maintenance effectiveness, and understanding the reasons for degraded capability. Developing and managing the application method thus takes many man-hours, and the licensee must bear the costs. In the domestic context, it is therefore necessary to evaluate the cost of monitoring maintenance effectiveness. Hence, we examine the cost of performing the MR and list its anticipated benefits.

  4. Development of an instrument to measure behavioral health function for work disability: item pool construction and factor analysis.

    Science.gov (United States)

    Marfeo, Elizabeth E; Ni, Pengsheng; Haley, Stephen M; Jette, Alan M; Bogusz, Kara; Meterko, Mark; McDonough, Christine M; Chan, Leighton; Brandt, Diane E; Rasch, Elizabeth K

    2013-09-01

    To develop a broad set of claimant-reported items to assess behavioral health functioning relevant to the Social Security disability determination processes, and to evaluate the underlying structure of behavioral health functioning for use in development of a new functional assessment instrument. Cross-sectional. Community. Item pools of behavioral health functioning were developed, refined, and field tested in a sample of persons applying for Social Security disability benefits (N=1015) who reported difficulties working because of mental or both mental and physical conditions. None. Social Security Administration Behavioral Health (SSA-BH) measurement instrument. Confirmatory factor analysis (CFA) specified that a 4-factor model (self-efficacy, mood and emotions, behavioral control, social interactions) had the optimal fit with the data and was also consistent with our hypothesized conceptual framework for characterizing behavioral health functioning. When the items within each of the 4 scales were tested in CFA, the fit statistics indicated adequate support for characterizing behavioral health as a unidimensional construct along these 4 distinct scales of function. This work represents a significant advance both conceptually and psychometrically in assessment methodologies for work-related behavioral health. The measurement of behavioral health functioning relevant to the context of work requires the assessment of multiple dimensions of behavioral health functioning. Specifically, we identified a 4-factor model solution that represented key domains of work-related behavioral health functioning. These results guided the development and scale formation of a new SSA-BH instrument. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  5. Method of locating related items in a geometric space for data mining

    Science.gov (United States)

    Hendrickson, Bruce A.

    1999-01-01

    A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method is especially beneficial for communicating databases with many items, and with non-regular relationship patterns. Examples of such databases include databases containing items such as scientific papers or patents, related by citations or keywords. A computer system adapted for practice of the present invention can include a processor, a storage subsystem, a display device, and computer software to direct the location and display of the entities. The method comprises assigning numeric values as a measure of similarity between each pairing of items. A matrix is constructed, based on the numeric values. The eigenvectors and eigenvalues of the matrix are determined. Each item is located in the geometric space at coordinates determined from the eigenvectors and eigenvalues. Proper construction of the matrix and proper determination of coordinates from eigenvectors can ensure that distance between items in the geometric space is representative of the numeric value measure of the items' similarity.
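
    The steps listed in this abstract (pairwise similarity values, a matrix, its eigenvectors, coordinates) can be sketched in one dimension as follows. The abstract does not say which matrix is built, so the graph Laplacian used here is an assumption, as are the function name `fiedler_coordinates`, the power-iteration solver, and the toy similarity values.

```python
import math

def fiedler_coordinates(W, iters=500):
    """One coordinate per item from the second-smallest Laplacian eigenvector.

    W is a symmetric matrix of non-negative pairwise similarities. Using the
    Laplacian L = D - W is an assumption of this sketch, not the patent's text.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    c = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue
    # Deterministic start vector with zero mean
    x = [float(i) - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        # y = (c*I - L) x, so small Laplacian eigenvalues dominate the iteration
        y = [(c - deg[i]) * x[i] + sum(W[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]  # project out the constant eigenvector
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

# Hypothetical data: two tight groups of three items joined by one weak link
W = [
    [0, 1, 1, 0,   0, 0],
    [1, 0, 1, 0,   0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0,   1, 0, 1],
    [0, 0, 0,   1, 1, 0],
]
print([round(v, 3) for v in fiedler_coordinates(W)])
```

    In this toy case, strongly related items land close together on the line and the two groups fall on opposite sides of zero; further eigenvectors would supply additional coordinates.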

  6. Item-level factor analysis of the Self-Efficacy Scale.

    Science.gov (United States)

    Bunketorp Käll, Lina

    2014-03-01

    This study explores the internal structure of the Self-Efficacy Scale (SES) using item response analysis. The SES was previously translated into Swedish and modified to encompass all types of pain, not exclusively back pain. Data on perceived self-efficacy in 47 patients with subacute whiplash-associated disorders were derived from a previously conducted randomized-controlled trial. The item-level factor analysis was carried out using a six-step procedure. To further study the item inter-relationships and to determine the underlying structure empirically, the 20 items of the SES were also subjected to principal component analysis with varimax rotation. The analyses showed two underlying factors, named 'social activities' and 'physical activities', with seven items loading on each factor. The remaining six items of the SES appeared to measure somewhat different constructs and need to be analysed further.

  7. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

    Directory of Open Access Journals (Sweden)

    Knol Dirk L

    2011-09-01

    Full Text Available Abstract Background For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.

  8. 36-Item Short Form Survey (SF-36) Versus Gait Speed As Predictor of Preclinical Mobility Disability in Older Women: The Women's Health Initiative.

    Science.gov (United States)

    Laddu, Deepika R; Wertheim, Betsy C; Garcia, David O; Woods, Nancy F; LaMonte, Michael J; Chen, Bertha; Anton-Culver, Hoda; Zaslavsky, Oleg; Cauley, Jane A; Chlebowski, Rowan; Manson, JoAnn E; Thomson, Cynthia A; Stefanick, Marcia L

    2018-04-01

    To compare the value of clinically measured gait speed with that of the self-reported Medical Outcomes Study 36-item Short-Form Survey Physical Function Index (SF-36 PF) in predicting future preclinical mobility disability (PCMD) in older women. Prospective cohort study. Forty clinical centers in the United States. Women aged 65 to 79 enrolled in the Women's Health Initiative Clinical Trials with gait speed and SF-36 assessed at baseline (1993-1998) and follow-up Years 1, 3, and 6 (N = 3,587). Women were categorized as nondecliners or decliners based on changes (from baseline to Year 1) in gait speed and SF-36 PF scores. Logistic regression models were used to estimate incident PCMD (gait speed 36 PF with that of measured gait speed. Slower baseline gait speed and lower SF-36 PF scores were associated with higher adjusted odds of PCMD at Years 3 and 6 (all P 36, decliners were 1.42 times as likely to have developed PCMD by Year 3 and 1.49 times as likely by Year 6. Baseline gait speed (AUC = 0.713) was nonsignificantly better than SF-36 (AUC = 0.705) at predicting PCMD over 6 years (P = .21); including measures at a second time point significantly improved model discrimination for predicting PCMD (all P 36 PF did, although the results may be limited given that gait speed served as a predictor and to define the PCMD outcome. Nonetheless, monitoring trajectories of change in mobility are better predictors of future mobility disability than single measures. © 2018, Copyright the Authors Journal compilation © 2018, The American Geriatrics Society.

  9. Three controversies over item disclosure in medical licensure examinations

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2015-09-01

    Full Text Available In response to views on the public's right to know, there is growing attention to item disclosure (release of items, answer keys, and performance data to the public) in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back more than three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations, namely (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure, by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large-scale testing. These issues require ongoing and careful consideration.

  10. Automatic item generation implemented for measuring artistic judgment aptitude.

    Science.gov (United States)

    Bezruczko, Nikolaus

    2014-01-01

    Automatic item generation (AIG) is a broad class of methods that are being developed to address psychometric issues arising from internet and computer-based testing. In general, issues emphasize efficiency, validity, and diagnostic usefulness of large scale mental testing. Rapid prominence of AIG methods and their implicit perspective on mental testing is bringing painful scrutiny to many sacred psychometric assumptions. This report reviews basic AIG ideas, then presents conceptual foundations, image model development, and operational application to artistic judgment aptitude testing.

  11. Development and validation of an item response theory-based Social Responsiveness Scale short form.

    Science.gov (United States)

    Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T

    2017-09-01

    Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.

  12. Short assessment of the Big Five: robust across survey methods except telephone interviewing

    OpenAIRE

    Lang, Frieder R.; John, Dennis; Lüdtke, Oliver; Schupp, Jürgen; Wagner, Gert G.

    2011-01-01

    We examined measurement invariance and age-related robustness of a short 15-item Big Five Inventory (BFI–S) of personality dimensions, which is well suited for applications in large-scale multidisciplinary surveys. The BFI–S was assessed in three different interviewing conditions: computer-assisted or paper-assisted face-to-face interviewing, computer-assisted telephone interviewing, and a self-administered questionnaire. Randomized probability samples from a large-scale German panel survey a...

  13. Predicting survey responses: how and why semantics shape survey statistics on organizational behaviour.

    Directory of Open Access Journals (Sweden)

    Jan Ketil Arnulf

    Full Text Available Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60-86% of the variance in the response patterns and allowed remarkably precise prediction of survey responses from humans, except in a personality test. Even the relationships between independent and their purported dependent variables were accurately predicted. This raises concern about the empirical nature of data collected through some surveys if results are already given a priori through the way subjects are being asked. Survey response patterns seem heavily determined by semantics. Language algorithms may suggest these prior to administering a survey. This study suggests that semantic algorithms are becoming new tools for the social sciences, opening perspectives on survey responses that prevalent psychometric theory cannot explain.

  14. Measuring Sexual Violence on Campus: Climate Surveys and Vulnerable Groups

    Science.gov (United States)

    de Heer, Brooke; Jones, Lynn

    2017-01-01

    Since the 2014 "Not Alone" report on campus sexual assault, the use of climate surveys to measure sexual violence on campuses across the United States has increased considerably. The current study utilizes a quasi meta-analysis approach to examine the utility of general campus climate surveys, which include a measure of sexual violence,…

  15. MEASURING CHILDREN'S FOOD SECURITY IN U.S. HOUSEHOLDS, 1995-99

    OpenAIRE

    Nord, Mark; Bickel, Gary

    2002-01-01

    The capacity to accurately measure the food security status of children in household surveys is an essential tool for monitoring food insecurity and hunger at the most severe levels in U.S. households and for assessing programs designed to prevent or ameliorate these conditions. USDA has developed a children's food security scale to meet this measurement need. The scale is calculated from 8 questions in the 18-item food security survey module that ask specifically about food-related experienc...

  16. Reliability measures in item response theory: manifest versus latent correlation functions.

    Science.gov (United States)

    Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Verbeke, Geert; De Boeck, Paul

    2015-02-01

    For item response theory (IRT) models, which belong to the class of generalized linear or non-linear mixed models, reliability at the scale of observed scores (i.e., manifest correlation) is more difficult to calculate than latent correlation based reliability, but usually of greater scientific interest. This is not least because it cannot be calculated explicitly when the logit link is used in conjunction with normal random effects. As such, approximations such as Fisher's information coefficient, Cronbach's α, or the latent correlation are calculated, allegedly because it is easy to do so. Cronbach's α has well-known and serious drawbacks, Fisher's information is not meaningful under certain circumstances, and there is an important but often overlooked difference between latent and manifest correlations. Here, manifest correlation refers to correlation between observed scores, while latent correlation refers to correlation between scores at the latent (e.g., logit or probit) scale. Thus, using one in place of the other can lead to erroneous conclusions. Taylor series based reliability measures, which are based on manifest correlation functions, are derived and a careful comparison of reliability measures based on latent correlations, Fisher's information, and exact reliability is carried out. The latent correlations are virtually always considerably higher than their manifest counterparts, Fisher's information measure shows no coherent behaviour (it is even negative in some cases), while the newly introduced Taylor series based approximations reflect the exact reliability very closely. Comparisons among the various types of correlations, for various IRT models, are made using algebraic expressions, Monte Carlo simulations, and data analysis. Given the light computational burden and the performance of Taylor series based reliability measures, their use is recommended. © 2014 The British Psychological Society.

  17. Qualitative Development and Content Validation of the PROMIS Pediatric Sleep Health Items.

    Science.gov (United States)

    Bevans, Katherine B; Meltzer, Lisa J; De La Motte, Anna; Kratchman, Amy; Viél, Dominique; Forrest, Christopher B

    2018-04-25

    To develop the Patient Reported Outcome Measurement Information System (PROMIS) Pediatric Sleep Health item pool and evaluate its content validity. Participants included 8 expert sleep clinician-researchers, 64 children ages 8-17 years, and 54 parents of children ages 5-17 years. We started with item concepts and expressions from the PROMIS Sleep Disturbance and Sleep Related Impairment adult measures. Additional pediatric sleep health concepts were generated by expert (n = 8), child (n = 28), and parent (n = 33) concept elicitation interviews and a systematic review of existing pediatric sleep health questionnaires. Content validity of the item pool was evaluated with item translatability review, readability analysis, and child (n = 36) and parent (n = 21) cognitive interviews. The final pediatric Sleep Health item pool includes 43 items that assess sleep disturbance (children's capacity to fall and stay asleep, sleep quality, dreams, and parasomnias) and sleep-related impairments (daytime sleepiness, low energy, difficulty waking up, and the impact of sleep and sleepiness on cognition, affect, behavior, and daily activities). Items are translatable and relevant and well understood by children ages 8-17 and parents of children ages 5-17. Rigorous qualitative procedures were used to develop and evaluate the content validity of the PROMIS Pediatric Sleep Health item pool. Once the item pool's psychometric properties are established, the scales will be useful for measuring children's subjective experiences of sleep.

  18. Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

    Science.gov (United States)

    Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

    2015-12-01

    The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity. © The Author(s) 2014.

  19. Development of a measure of knowledge use by stakeholders in rehabilitation technology

    Directory of Open Access Journals (Sweden)

    Vathsala I Stone

    2014-11-01

    Objectives: Uptake of new knowledge by diverse and diffuse stakeholders of health-care technology innovations has been a persistent challenge, as has been measurement of this uptake. This article describes the development of the Level of Knowledge Use Survey instrument, a web-based measure of self-reported knowledge use. Methods: The Level of Knowledge Use Survey instrument was developed in the context of assessing effectiveness of knowledge communication strategies in rehabilitation technology. It was validated on samples representing five stakeholder types: researchers, manufacturers, clinician–practitioners, knowledge brokers, and consumers. Its structure is broadly based on Rogers’ stages of innovation adoption. Its item generation was initially guided by Hall et al’s Levels of Use framework. Item selection was based on content validity indices computed from expert ratings (n1 = 4; n2 = 3). Five representative stakeholders established usability of the web version. The version included 47 items (content validity index for individual items >0.78; content validity index for a scale or set of items >0.90) in self-reporting format. Psychometrics were then established for the version. Results: Analyses of data from small (n = 69) and large (n = 215) samples using the Level of Knowledge Use Survey instrument suggested a conceptual model of four levels of knowledge use: Non-awareness, Awareness, Interest, and Use. The levels covered eight dimensions and six user action categories. The sequential nature of levels was inconclusive due to low cell frequencies. The Level of Knowledge Use Survey instrument showed adequate content validity (≈ 0.88; n = 3) and excellent test–retest reliability (1.0; n = 69). It also demonstrated good construct validity (n = 215) for differentiating among new knowledge outputs (p < 0.001) and among stakeholder types (0.001 < p ≤ 0.013). It showed strong responsiveness to change

  20. The Trauma Center Organizational Culture Survey: development and conduction.

    Science.gov (United States)

    Davis, Matthew L; Wehbe-Janek, Hania; Subacius, Haris; Pinto, Ruxandra; Nathens, Avery B

    2015-01-01

    The Trauma Center Organizational Culture Survey (TRACCS) instrument was developed to assess organizational culture of trauma centers enrolled in the American College of Surgeons Trauma Quality Program (ACS TQIP). The objective is to provide evidence on the psychometric properties of the factors of TRACCS and describe the current organizational culture of TQIP-enrolled trauma centers. A cross-sectional study was conducted by surveying a sampling of employees at 174 TQIP-enrolled trauma centers. Data collection was preceded by multistep survey development. Psychometric properties were assessed by an exploratory factor analysis (construct validity), and item-total correlations and Cronbach alpha were calculated (internal reliability). Statistical outcomes of the survey responses were measured by descriptive statistics and mixed effect models. The response rate for trauma center participation in the study was 78.7% (n = 137). The factor analysis resulted in 16 items clustered into three factors: opportunity, pride, and diversity; trauma center leadership; and employee respect and recognition. TRACCS was found to be highly reliable, with a Cronbach alpha of 0.90 overall and 0.91, 0.90, and 0.85 for the three factors. Considerable variability of TRACCS overall and factor scores among hospitals was measured, with the largest interhospital deviations among trauma center leadership. More than 80% of the variability in the responses occurred within rather than between hospitals. TRACCS was developed as a reliable tool for measuring trauma center organizational culture. Relationships between TQIP outcomes and measured organizational culture are under investigation. Trauma centers could apply TRACCS to better understand current organizational culture and how change tools can impact culture and subsequent patient and process outcomes. Copyright © 2015 Elsevier Inc. All rights reserved.
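
    The two internal-reliability statistics named in this abstract, Cronbach alpha and item-total correlations, can be illustrated with a minimal sketch. The response data, the function names, and the choice of the corrected (item versus rest-score) form of the item-total correlation are assumptions made for illustration; none of them come from the TRACCS study.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # sample variance of each item
    total_var = variance([sum(r) for r in rows])       # sample variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows):
    """Correlation of each item with the sum of the remaining items."""
    result = []
    for j in range(len(rows[0])):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]  # total score excluding item j
        result.append(pearson(item, rest))
    return result


# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale.
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
print(round(cronbach_alpha(responses), 3))
print([round(r, 3) for r in corrected_item_total(responses)])
```

    Sample variances are used for both the items and the sum score; using population variances throughout gives the same alpha, because the factor cancels in the ratio.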

  1. An Introduction to Item Response Theory for Health Behavior Researchers

    Science.gov (United States)

    Warne, Russell T.; McKyer, E. J. Lisako; Smith, Matthew L.

    2012-01-01

    Objective: To introduce item response theory (IRT) to health behavior researchers by contrasting it with classical test theory and providing an example of IRT in health behavior. Method: Demonstrate IRT by fitting the 2PL model to substance-use survey data from the Adolescent Health Risk Behavior questionnaire (n = 1343 adolescents). Results: An…
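
    For readers unfamiliar with the 2PL model mentioned in this abstract, the sketch below shows its item response function and a crude grid-search estimate of a respondent's trait level. The item parameters, the grid bounds, and the function names are hypothetical and are not taken from the study.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model.

    theta: respondent trait level, a: discrimination, b: difficulty (location).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of 0/1 responses given theta and (a, b) item parameters."""
    ll = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll


def estimate_theta(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Crude maximum-likelihood estimate of theta by grid search on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


# Hypothetical item parameters (discrimination a, difficulty b).
items = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]
print(estimate_theta([1, 1, 0, 0], items))
```

    A respondent whose trait level equals an item's difficulty has a 0.5 probability of endorsing it, and a larger discrimination makes the curve steeper around that point. Response patterns that are all 0 or all 1 have no finite maximum-likelihood estimate, so the grid search simply returns a bound in those cases.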

  2. Random Item Generation Is Affected by Age

    Science.gov (United States)

    Multani, Namita; Rudzicz, Frank; Wong, Wing Yiu Stephanie; Namasivayam, Aravind Kumar; van Lieshout, Pascal

    2016-01-01

    Purpose: Random item generation (RIG) involves central executive functioning. Measuring aspects of random sequences can therefore provide a simple method to complement other tools for cognitive assessment. We examine the extent to which RIG relates to specific measures of cognitive function, and whether those measures can be estimated using RIG…

  3. Development of a Comprehensive Assessment of Food Parenting Practices: The Home Self-Administered Tool for Environmental Assessment of Activity and Diet Family Food Practices Survey.

    Science.gov (United States)

    Vaughn, Amber E; Dearth-Wesley, Tracy; Tabak, Rachel G; Bryant, Maria; Ward, Dianne S

    2017-02-01

    Parents' food parenting practices influence children's dietary intake and risk for obesity and chronic disease. Understanding the influence and interactions between parents' practices and children's behavior is limited by a lack of development and psychometric testing and/or limited scope of current measures. The Home Self-Administered Tool for Environmental Assessment of Activity and Diet (HomeSTEAD) was created to address this gap. This article describes development and psychometric testing of the HomeSTEAD family food practices survey. Between August 2010 and May 2011, a convenience sample of 129 parents of children aged 3 to 12 years were recruited from central North Carolina and completed the self-administered HomeSTEAD survey on three occasions during a 12- to 18-day window. Demographic characteristics and child diet were assessed at Time 1. Child height and weight were measured during the in-home observations (following Time 1 survey). Exploratory factor analysis with Time 1 data was used to identify potential scales. Scales with more than three items were examined for scale reduction. Following this, mean scores were calculated at each time point. Construct validity was assessed by examining Spearman rank correlations between mean scores (Time 1) and children's diet (fruits and vegetables, sugar-sweetened beverages, snacks, sweets) and body mass index (BMI) z scores. Repeated measures analysis of variance was used to examine differences in mean scores between time points, and single-measure intraclass correlations were calculated to examine test-retest reliability between time points. Exploratory factor analysis identified 24 factors and retained 124 items; however, scale reduction narrowed items to 86. The final instrument captures five coercive control practices (16 items), seven autonomy support practices (24 items), and 12 structure practices (46 items). All scales demonstrated good internal reliability (α>.62), 18 factors demonstrated construct

  4. Development and psychometric evaluation of an information literacy self-efficacy survey and an information literacy knowledge test.

    Science.gov (United States)

    Tepe, Rodger; Tepe, Chabha

    2015-03-01

    To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. In this test-retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. The IL self-efficacy survey demonstrated good reliability (test-retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test-retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments.
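
    KR20 and the point-biserial correlation reported for the knowledge test can be computed as in the following sketch for right/wrong (1/0) scored items. The response matrix is hypothetical, population variances are assumed for KR20, and the point-biserial shown is the uncorrected form (the item is included in the total score); the abstract does not say which variants the authors used.

```python
from statistics import mean, pvariance


def kr20(rows):
    """Kuder-Richardson formula 20 for 0/1 item scores (population variances)."""
    k = len(rows[0])
    p = [mean(col) for col in zip(*rows)]            # proportion correct per item
    total_var = pvariance([sum(r) for r in rows])    # variance of total scores
    return k / (k - 1) * (1 - sum(pj * (1 - pj) for pj in p) / total_var)


def point_biserial(rows, j):
    """Correlation between 0/1 item j and the total test score."""
    item = [r[j] for r in rows]
    total = [sum(r) for r in rows]
    mi, mt = mean(item), mean(total)
    cov = mean([(x - mi) * (t - mt) for x, t in zip(item, total)])
    return cov / (pvariance(item) * pvariance(total)) ** 0.5


# Hypothetical answers: 5 examinees x 4 items, 1 = correct, 0 = incorrect.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(answers), 3))
print(round(point_biserial(answers, 0), 3))
```

    Both functions assume that total scores vary and, for the point-biserial, that the item is not answered identically by everyone; otherwise the denominators are zero.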

  5. Behavioral Health Needs Assessment Survey (BHNAS): Overview of Survey Items and Measures

    Science.gov (United States)

    2013-02-12

    medication use • Personal and unit morale • Unit cohesion • Attitudes toward leadership • Positive effects of deployment • Navy support during deployment...to select any of the following: • Over-the-counter drugs (including Aspirin, Tylenol, Motrin, Ibuprofen, Aleve) • Prescription painkillers that...are not opioids (including Celebrex, Vioxx, Bextra, topical lidocaine) • Prescription opioid/narcotic painkiller (including OxyContin, Percocet

  6. Assessment of health surveys: fitting a multidimensional graded response model.

    Science.gov (United States)

    Depaoli, Sarah; Tiemensma, Jitske; Felt, John M

    The multidimensional graded response model, an item response theory (IRT) model, can be used to improve the assessment of surveys, even when sample sizes are restricted. Typically, health-based survey development utilizes classical statistical techniques (e.g. reliability and factor analysis). In a review of four prominent journals within the field of Health Psychology, we found that IRT-based models were used in less than 10% of the studies examining scale development or assessment. However, implementing IRT-based methods can provide more details about individual survey items, which is useful when determining the final item content of surveys. An example using a quality of life survey for Cushing's syndrome (CushingQoL) highlights the main components for implementing the multidimensional graded response model. Patients with Cushing's syndrome (n = 397) completed the CushingQoL. Results from the multidimensional graded response model supported a 2-subscale scoring process for the survey. All items were deemed as worthy contributors to the survey. The graded response model can accommodate unidimensional or multidimensional scales, be used with relatively lower sample sizes, and is implemented in free software (example code provided in online Appendix). Use of this model can help to improve the quality of health-based scales being developed within the Health Sciences.
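
    As a rough illustration of the graded response model discussed in this abstract, the sketch below computes response-category probabilities for a single item in the unidimensional case. The parameter values and function names are hypothetical, and this is not the example code from the article's online Appendix.

```python
import math


def _logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: trait level, a: discrimination,
    thresholds: strictly increasing b_1 < ... < b_{K-1} for K ordered categories.
    """
    if any(t1 >= t2 for t1, t2 in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0] + [_logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def expected_score(theta, a, thresholds):
    """Expected item score when categories are scored 0, 1, ..., K-1."""
    return sum(k * p for k, p in enumerate(grm_category_probs(theta, a, thresholds)))


# Hypothetical 4-category item.
print([round(p, 3) for p in grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])])
```

    In the multidimensional version, the single product of discrimination and trait level is replaced by a weighted sum over several latent traits, while the category probabilities are still obtained as differences of adjacent cumulative curves.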

  7. Development of a Survey Instrument to Measure TEFL Academics' Perceptions about, Individual and Workplace Characteristics for Conducting Research

    Science.gov (United States)

    Bai, Li; Hudson, Peter; Millwater, Jan; Tones, Megan

    2013-01-01

    A 30-item survey was devised to determine Chinese TEFL (Teaching English as a Foreign Language) academics' potential for conducting research. A five-part Likert scale was used to gather data from 182 academics on four factors: (1) perceptions on teaching-research nexus, (2) personal perspectives for conducting research, (3) predispositions for…

  8. A method for additive bias correction in cross-cultural surveys

    DEFF Research Database (Denmark)

    Scholderer, Joachim; Grunert, Klaus G.; Brunsø, Karen

    2001-01-01

    Measurement bias in cross-cultural surveys can seriously threaten the validity of hypothesis tests. Direct comparisons of means depend on the assumption that differences in observed variables reflect differences in the underlying constructs, and not an additive bias that may be caused by cultural differences in the understanding of item wording or response category labels. However, experience suggests that additive bias can be found more often than not. Based on the concept of partial measurement invariance (Byrne, Shavelson and Muthén, 1989), the present paper develops a procedure for eliminating additive bias from cross-cultural data. The procedure involves four steps: (1) embed a potentially biased item in a factor-analytic measurement model, (2) test for the existence of additive bias between populations, (3) use the factor-analytic model to estimate the magnitude of the bias, and (4) replace...

  9. Item response theory scoring and the detection of curvilinear relationships.

    Science.gov (United States)

    Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

    2017-03-01

    Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only to items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  10. Veterinary students' perceptions of their learning environment as measured by the Dundee Ready Education Environment Measure.

    Science.gov (United States)

    Pelzer, Jacquelyn M; Hodgson, Jennifer L; Werre, Stephen R

    2014-03-24

    The Dundee Ready Education Environment Measure (DREEM) has been widely used to evaluate the learning environment within health sciences education; however, this tool has not been applied in veterinary medical education. The aim of this study was to evaluate the reliability and validity of the DREEM tool in a veterinary medical program and to determine veterinary students' perceptions of their learning environment. The DREEM is a survey tool which quantitatively measures students' perceptions of their learning environment. The survey consists of 50 items, each scored 0-4 on a Likert Scale. The 50 items are subsequently analysed within five subscales related to students' perceptions of learning, faculty (teachers), academic atmosphere, and self-perceptions (academic and social). An overall score is obtained by summing the mean score for each subscale, with an overall possible score of 200. All students in the program were asked to complete the DREEM. Means and standard deviations were calculated for the 50 items, the five subscale scores and the overall score. Cronbach's alpha was determined for the five subscales and overall score to evaluate reliability. Confirmatory factor analysis was used to evaluate construct validity. 224 responses (53%) were received. The Cronbach's alpha for the overall score was 0.93, and for the five subscales it was: perceptions of learning 0.85, perceptions of faculty 0.79, perceptions of atmosphere 0.81, academic self-perceptions 0.68, and social self-perceptions 0.72. Construct validity was determined to be acceptable (p education programs. Four individual items of concern were identified by students. In this setting the DREEM was a reliable and valid tool to measure veterinary students' perceptions of their learning environment. The four items identified as concerning originated from four of the five subscales, but all related to workload. Negative perceptions regarding workload are a common concern of students in health education

  11. Report on achievements in fiscal 1998. Surveys on development of an at-home welfare device system to rationalize energy use. (Imaichi City); 1998 nendo energy shiyo gorika zaitaku fukushi kiki system kaihatsu chosa (Imaichi) saiitaku kenkyu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-01

    Energy demand in residential houses arranged with considerations for elderly people was evaluated based on district features for at-home welfare devices, together with changes in energy use at care-taking sites. Surveys and studies were performed on the structural characteristics of such houses and on the structuring of at-home welfare device systems. The study items are as follows: (1) identification of the usage patterns of the at-home welfare devices, and measurement and evaluation of electric power consumption, (2) measurement and evaluation of household energy consumption, (3) estimation of energy demand in residential houses for elderly people, (4) surveys on thermal sensitivity, and (5) estimation of the allowance for steps in rooms. In Item 1, electric power consumption is measured during standard operation of at-home welfare devices by using power measuring devices in the Welfare Techno-House (WTH) Imaichi. In Item 2, a questionnaire survey is performed on the ways of living in elderly people's households and general households. In Item 3, energy demand is estimated in houses for elderly people incorporating welfare devices, based on the results of Items 1 and 2. In Item 4, measurement and evaluation are made through experiments on physiological changes in the bodies of elderly people in the toilet space of the WTH, and on a room heating system that gives a comfortable feeling. In Item 5, the air tightness of doors in barrier-free sections and the allowable limit of steps inside rooms are discussed. (NEDO)

  12. Item Construction and Psychometric Models Appropriate for Constructed Responses

    Science.gov (United States)

    1991-08-01

    which involve only one attribute per item. This is especially true when we are dealing with constructed-response items, we have to measure much more...

  13. Item Response Theory Modeling of the Philadelphia Naming Test

    Science.gov (United States)

    Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D.

    2015-01-01

    Purpose: In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating…

  14. The role of attention in item-item binding in visual working memory.

    Science.gov (United States)

    Peterson, Dwight J; Naveh-Benjamin, Moshe

    2017-09-01

    An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  15. Developing measures of food and nutrition security within an Australian context.

    Science.gov (United States)

    Archer, Claire; Gallegos, Danielle; McKechnie, Rebecca

    2017-10-01

    To develop a measure of food and nutrition security for use among an Australian population that measures all pillars of food security and to establish its content validity. The study consisted of two phases. Phase 1 involved focus groups with experts working in the area of food security. Data were assessed using content analysis and results informed the development of a draft tool. Phase 2 consisted of a series of three online surveys using the Delphi technique. Findings from each survey were used to establish content validity and progressively modify the tool until consensus was reached for all items. Australia. Phase 1 focus groups involved twenty-five experts working in the field of food security, who were attending the Dietitians Association of Australia National Conference, 2013. Phase 2 included twenty-five experts working in food security, who were recruited via email. Findings from Phase 1 supported the need for an Australian-specific tool and highlighted the failure of current tools to measure across all pillars of food security. Participants encouraged the inclusion of items to measure barriers to food acquisition and the previous single item to enable comparisons with previous data. Phase 2 findings informed the selection and modification of items for inclusion in the final tool. The results led to the development of a draft tool to measure food and nutrition security, and supported its content validity. Further research is needed to validate the tool among the Australian population and to establish inter- and intra-rater reliability.

  16. Psychometric properties of the Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis

    Directory of Open Access Journals (Sweden)

    DeVellis Robert

    2006-09-01

    Background: Measuring health-related quality of life (HRQOL) is important in arthritis and the SF-36v2 is the current state-of-the-art. It is only emerging how well the Centers for Disease Control and Prevention (CDC) HRQOL measures HRQOL for people with arthritis. This study's purpose is to assess the psychometric properties of the 9-item CDC HRQOL (4-item Healthy Days Core Module and 5-item Healthy Days Symptoms Module) in an arthritis sample using the SF-36v2 as a comparison. Methods: In Fall 2002, a cross-sectional study acquired survey data including the CDC HRQOL and SF-36v2 from 2 North Carolina populations of adult patients reporting osteoarthritis, rheumatoid arthritis, and fibromyalgia; 2182 (52%) responded. The first item of both the CDC HRQOL and the SF-36v2 was general health (GEN). All 8 other CDC HRQOL items ask for the number of days in the past 30 days that respondents experienced various aspects of HRQOL. Exploratory principal components analyses (PCA) were conducted on each sample and the combined samples of the CDC HRQOL. The multitrait-multimethod matrix (MTMM) was used to compute correlations between each trait (physical health and mental health) and between each method of measurement (CDC HRQOL and SF36v2). The relative contribution of the CDC HRQOL in predicting the physical component summary (PCS) and the mental component summary (MCS) was determined by regressing the CDC HRQOL items on the PCS and MCS scales. Results: All 9 CDC HRQOL items loaded primarily onto 1 factor (explaining 57% of the item variance) representing a reasonable solution for capturing overall HRQOL. After rotation a 2 factor interpretation for the 9 items was clear, with 4 items capturing physical health (physical, activity, pain, and energy days) and 3 items capturing mental health (mental, depression, and anxiety days). All of the loadings for these two factors were greater than 0.70. The CDC HRQOL physical health factor correlated with PCS (r = -.78, p 2

  17. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    Science.gov (United States)

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N =1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, less than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  18. Relationship between Future Time Orientation and Item Nonresponse on Subjective Probability Questions: A Cross-Cultural Analysis.

    Science.gov (United States)

    Lee, Sunghee; Liu, Mingnan; Hu, Mengyao

    2017-06-01

    Time orientation is an unconscious yet fundamental cognitive process that provides a framework for organizing personal experiences in temporal categories of past, present and future, reflecting the relative emphasis given to these categories. Culture lies central to individuals' time orientation, leading to cultural variations in time orientation. For example, people from future-oriented cultures tend to emphasize the future and store information relevant for the future more than those from present- or past-oriented cultures. For survey questions that ask respondents to report expected probabilities of future events, this may translate into culture-specific question difficulties, manifested through systematically varying "I don't know" item nonresponse rates. This study drew on the time orientation theory and examined culture-specific nonresponse patterns on subjective probability questions using methodologically comparable population-based surveys from multiple countries. The results supported our hypothesis. Item nonresponse rates on these questions varied significantly in the way that future-orientation at the group as well as individual level was associated with lower nonresponse rates. This pattern did not apply to non-probability questions. Our study also suggested potential nonresponse bias. Examining culture-specific constructs, such as time orientation, as a framework for measurement mechanisms may contribute to improving cross-cultural research.

  19. Evaluating Job Demands and Control Measures for Use in Farm Worker Health Surveillance

    Science.gov (United States)

    Alterman, Toni; Gabbard, Susan; Grzywacz, Joseph G.; Shen, Rui; Li, Jia; Nakamoto, Jorge; Carroll, Daniel J.; Muntaner, Carles

    2015-01-01

    Workplace stress likely plays a role in health disparities; however, applying standard measures to studies of immigrants requires thoughtful consideration. The goal of this study was to determine the appropriateness of two measures of occupational stressors (‘decision latitude’ and ‘job demands’) for use with mostly immigrant Latino farm workers. Cross-sectional data from a pilot module containing a four-item measure of decision latitude and a two-item measure of job demands were obtained from a subsample (N = 409) of farm workers participating in the National Agricultural Workers Survey. Responses to items for both constructs were clustered toward the low end of the structured response-set. Percentages of responses of ‘very often’ and ‘always’ for each of the items were examined by educational attainment, birth country, dominant language spoken, task, and crop. Cronbach’s α, when stratified by subgroups of workers, for the decision latitude items were (0.65–0.90), but were less robust for the job demands items (0.25–0.72). The four-item decision latitude scale can be applied to occupational stress research with immigrant farm workers, and potentially other immigrant Latino worker groups. The short job demands scale requires further investigation and evaluation before suggesting widespread use. PMID:25138138

  20. Identification and Development of Items Comprising Organizational Citizenship Behaviors Among Pharmacy Faculty.

    Science.gov (United States)

    Desselle, Shane P; Semsick, Gretchen R

    2016-12-25

    Objective. Identify behaviors that can compose a measure of organizational citizenship by pharmacy faculty. Methods. A four-round, modified Delphi procedure using open-ended questions (Round 1) was conducted with 13 panelists from pharmacy academia. The items generated were evaluated and refined for inclusion in subsequent rounds. A consensus was reached after completing four rounds. Results. The panel produced a set of 26 items indicative of extra-role behaviors by faculty colleagues considered to compose a measure of citizenship, which is an expressed manifestation of collegiality. Conclusions. The items generated require testing for validation and reliability in a large sample to create a measure of organizational citizenship. Even prior to doing so, the list of items can serve as a resource for mentorship of junior and senior faculty alike.

  1. Identification and Development of Items Comprising Organizational Citizenship Behaviors Among Pharmacy Faculty

    Science.gov (United States)

    Semsick, Gretchen R.

    2016-01-01

    Objective. Identify behaviors that can compose a measure of organizational citizenship by pharmacy faculty. Methods. A four-round, modified Delphi procedure using open-ended questions (Round 1) was conducted with 13 panelists from pharmacy academia. The items generated were evaluated and refined for inclusion in subsequent rounds. A consensus was reached after completing four rounds. Results. The panel produced a set of 26 items indicative of extra-role behaviors by faculty colleagues considered to compose a measure of citizenship, which is an expressed manifestation of collegiality. Conclusions. The items generated require testing for validation and reliability in a large sample to create a measure of organizational citizenship. Even prior to doing so, the list of items can serve as a resource for mentorship of junior and senior faculty alike. PMID:28179717

  2. Factor structure and longitudinal measurement invariance of the demand control support model: an evidence from the Swedish Longitudinal Occupational Survey of Health (SLOSH).

    Science.gov (United States)

    Chungkham, Holendro Singh; Ingre, Michael; Karasek, Robert; Westerlund, Hugo; Theorell, Töres

    2013-01-01

    To examine the factor structure and to evaluate the longitudinal measurement invariance of the demand-control-support questionnaire (DCSQ), using the Swedish Longitudinal Occupational Survey of Health (SLOSH). A confirmatory factor analysis (CFA) and multi-group confirmatory factor analysis (MGCFA) models within the framework of structural equation modeling (SEM) have been used to examine the factor structure and invariance across time. Four factors: psychological demand, skill discretion, decision authority and social support, were confirmed by CFA at baseline, with the best fit obtained by removing the item repetitive work of skill discretion. A measurement error correlation (0.42) between work fast and work intensively for psychological demands was also detected. Acceptable composite reliability measures were obtained except for skill discretion (0.68). The invariance of the same factor structure was established, but caution in comparing mean levels of factors over time is warranted as lack of intercept invariance was evident. However, partial intercept invariance was established for work intensively. Our findings indicate that skill discretion and decision authority represent two distinct constructs in the retained model. However removing the item repetitive work along with either work fast or work intensively would improve model fit. Care should also be taken while making comparisons in the constructs across time. Further research should investigate invariance across occupations or socio-economic classes.

  3. Factor structure and longitudinal measurement invariance of the demand control support model: an evidence from the Swedish Longitudinal Occupational Survey of Health (SLOSH).

    Directory of Open Access Journals (Sweden)

    Holendro Singh Chungkham

    OBJECTIVES: To examine the factor structure and to evaluate the longitudinal measurement invariance of the demand-control-support questionnaire (DCSQ), using the Swedish Longitudinal Occupational Survey of Health (SLOSH). METHODS: A confirmatory factor analysis (CFA) and multi-group confirmatory factor analysis (MGCFA) models within the framework of structural equation modeling (SEM) have been used to examine the factor structure and invariance across time. RESULTS: Four factors: psychological demand, skill discretion, decision authority and social support, were confirmed by CFA at baseline, with the best fit obtained by removing the item repetitive work of skill discretion. A measurement error correlation (0.42) between work fast and work intensively for psychological demands was also detected. Acceptable composite reliability measures were obtained except for skill discretion (0.68). The invariance of the same factor structure was established, but caution in comparing mean levels of factors over time is warranted as lack of intercept invariance was evident. However, partial intercept invariance was established for work intensively. CONCLUSION: Our findings indicate that skill discretion and decision authority represent two distinct constructs in the retained model. However removing the item repetitive work along with either work fast or work intensively would improve model fit. Care should also be taken while making comparisons in the constructs across time. Further research should investigate invariance across occupations or socio-economic classes.

  4. Statistical power as a function of Cronbach alpha of instrument questionnaire items.

    Science.gov (United States)

    Heo, Moonseong; Kim, Namhee; Faith, Myles S

    2015-10-14

    In countless clinical trials, outcome measurement relies on instrument questionnaire items, which often suffer from measurement error that in turn affects the statistical power of study designs. The Cronbach alpha or coefficient alpha, here denoted by C(α), can be used as a measure of internal consistency of parallel instrument items that are developed to measure a target unidimensional outcome construct. The scale score for the target construct is often represented by the sum of the item scores. However, power functions based on C(α) have been lacking for various study designs. We formulate a statistical model for parallel items to derive power functions as a function of C(α) under several study designs. To this end, we assume a fixed true score variance as opposed to the usual fixed total variance. That assumption is critical and practically relevant to show that measurement error is inversely associated with inter-item correlations, and thus that greater C(α) is associated with greater statistical power. We compare the derived theoretical statistical power with empirical power obtained through Monte Carlo simulations for the following comparisons: one-sample comparison of pre- and post-treatment mean differences, two-sample comparison of pre-post mean differences between groups, and two-sample comparison of mean differences between groups. It is shown that C(α) is the same as a test-retest correlation of the scale scores of parallel items, which enables testing significance of C(α). Closed-form power functions and sample size determination formulas are derived in terms of C(α) for all of the aforementioned comparisons. Power functions are shown to be an increasing function of C(α), regardless of comparison of interest. The derived power functions are well validated by simulation studies that show that the magnitudes of theoretical power are virtually identical to those of the empirical power. Regardless
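
    For readers unfamiliar with the coefficient alpha discussed above, the following is a minimal sketch of the standard textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). It is not taken from the paper and does not reproduce the authors' power functions; the function name `cronbach_alpha` and the toy data are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1 denominator).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the scale score (sum of the item scores).
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (assumed): four respondents answering two items.
toy = [[2, 3], [4, 4], [3, 5], [5, 5]]
alpha = cronbach_alpha(toy)
```

    When every item gives an identical score for each respondent, the formula returns exactly 1, which matches the intuition that error-free parallel items are perfectly consistent.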

  5. [Sampling and measurement methods of the protocol design of the China Nine-Province Survey for blindness, visual impairment and cataract surgery].

    Science.gov (United States)

    Zhao, Jia-liang; Wang, Yu; Gao, Xue-cheng; Ellwein, Leon B; Liu, Hu

    2011-09-01

    To design the protocol of the China nine-province survey for blindness, visual impairment and cataract surgery, which evaluates the prevalence and main causes of blindness and visual impairment and the prevalence and outcomes of cataract surgery. Protocol design began after the task for the national survey for blindness, visual impairment and cataract surgery was accepted from the Department of Medicine, Ministry of Health, China, in November 2005. The protocols of the Beijing Shunyi Eye Study in 1996 and the Guangdong Doumen County Eye Study in 1997, both supported by the World Health Organization, were taken as the basis for the protocol design. Relevant experts were invited to discuss and evaluate the draft protocol. An international advisory committee was established to examine and approve the draft protocol. Finally, the survey protocol was checked and approved by the Department of Medicine, Ministry of Health, China and the Prevention Program of Blindness and Deafness, WHO. The survey protocol was designed according to the characteristics and the scale of the survey. The contents of the protocol included determination of the target population and survey sites, calculation of the sample size, design of the random sampling, composition and organization of the survey teams, determination of the examinees, the flowchart of the field work, survey items and methods, diagnostic criteria for blindness and moderate and severe visual impairment, quality control measures, and data management methods. The designed protocol became the standard and practical protocol for the survey to evaluate the prevalence and main causes of blindness and visual impairment, and the prevalence and outcomes of cataract surgery.

  6. Development of a questionnaire to assess patient satisfaction with allergen-specific immunotherapy in adults: item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Justícia JL

    2011-05-01

    Jose Luis Justícia (1), Eva Baró (2), Victoria Cardona (3), Pedro Guardia (4), Pedro Ojeda (5), José Maria Olaguíbel (6), José Maria Vega (7), Carmen Vidal (8). (1) Medical Department, Stallergenes Ibérica, Barcelona, Spain; (2) Health Outcomes Research Department, 3D Health Research, Barcelona, Spain; (3) Hospital Vall d'Hebron, Barcelona, Spain; (4) Hospital Virgen Macarena, Sevilla, Spain; (5) Clínica de Asma y Alergia Dres. Ojeda, Madrid, Spain; (6) Complejo Hospitalario de Navarra, Pamplona, Spain; (7) Hospital Regional Universitario Carlos Haya Málaga, Spain; (8) Complejo Hospitalario Universitario de Santiago, Santiago de Compostela, Spain.

    Background: Allergen-specific immunotherapy (SIT) is a treatment capable of modifying the natural course of allergy, so ensuring good adherence to SIT is fundamental. Until now, no instrument has been developed specifically to measure patient satisfaction with SIT, although its assessment could help us to better understand and improve treatment adherence and effectiveness. The aim of this study was to develop an instrument to measure adult patient satisfaction with SIT. Methods: Items were generated from a literature review, focus groups with allergic adult patients undergoing SIT, and a meeting with experts. Potential items were administered to allergic patients undergoing SIT in an observational, cross-sectional, multicenter study. Item reduction was based on quantitative and qualitative criteria. A preliminary assessment of feasibility, reliability, and validity of the retained items was performed. Results: An initial pool of 70 items was administered to 257 patients undergoing SIT. Fifty-four items were eliminated, resulting in a provisional instrument with 16 items. Factor analysis yielded four factors that were identified as perceived efficacy, activities and environment, cost-benefit balance, and overall satisfaction, explaining 74.8% of variance. Ceiling and floor effects were negligible for overall score. Overall score was

  7. Method using a density field for locating related items for data mining

    Science.gov (United States)

    Wylie, Brian N.

    2002-01-01

    A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method makes use of numeric values as a measure of similarity between each pairing of items. The items are given initial coordinates in the space. An energy is then determined for each item from the item's distance and similarity to other items, and from the density of items assigned coordinates near the item. The distance and similarity component can act to draw items with high similarities close together, while the density component can act to force all items apart. If a terminal condition is not yet reached, then new coordinates can be determined for one or more items, and the energy determination repeated. The iteration can terminate, for example, when the total energy reaches a threshold, when each item's energy is below a threshold, after a certain amount of time or iterations.
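
    The iterative procedure described above can be sketched roughly as follows. This is an illustrative toy, not the patented method: the specific energy terms (similarity times squared distance for attraction, a Gaussian kernel for the density penalty), the random-move proposal, the fixed iteration count as the terminal condition, and all names (`item_energy`, `layout`, `total_energy`) are assumptions made for this example. The similarity matrix is assumed to be symmetric.

```python
import math
import random


def item_energy(i, coords, sim, density_weight=1.0, radius=1.0):
    """Energy of item i: similar items far away cost more, crowded items cost more."""
    xi, yi = coords[i]
    energy = 0.0
    for j, (xj, yj) in enumerate(coords):
        if j == i:
            continue
        d2 = (xi - xj) ** 2 + (yi - yj) ** 2
        energy += sim[i][j] * d2                                # attraction term
        energy += density_weight * math.exp(-d2 / radius ** 2)  # density term
    return energy


def total_energy(coords, sim):
    return sum(item_energy(i, coords, sim) for i in range(len(coords)))


def layout(sim, coords=None, iterations=2000, step=0.5, seed=0):
    """Greedy layout: propose a random move for one item, keep it if its energy drops."""
    rng = random.Random(seed)
    n = len(sim)
    if coords is None:
        coords = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    coords = list(coords)  # work on a copy so the caller's list is untouched
    for _ in range(iterations):  # terminal condition here: fixed iteration budget
        i = rng.randrange(n)
        old = coords[i]
        before = item_energy(i, coords, sim)
        coords[i] = (old[0] + rng.uniform(-step, step),
                     old[1] + rng.uniform(-step, step))
        if item_energy(i, coords, sim) >= before:
            coords[i] = old  # reject moves that do not lower the item's energy
    return coords


# Two pairs of mutually similar items (assumed toy similarities).
sim = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
positions = layout(sim)
```

    Because both energy terms are symmetric in each pair of items, lowering one item's energy also lowers the total, so this greedy variant never increases the total energy. An energy threshold or time limit could replace the iteration budget as the stopping rule.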

  8. Item Response Theory in the context of Improving Student Reasoning

    Science.gov (United States)

    Goddard, Chase; Davis, Jeremy; Pyper, Brian

    2011-10-01

    We are interested to see if Item Response Theory can help to better inform the development of reasoning ability in introductory physics. A first pass through our latest batch of data from the Heat and Temperature Conceptual Evaluation, the Lawson Classroom Test of Scientific Reasoning, and the Epistemological Beliefs About Physics Survey may help in this effort.

  9. The variety, popularity and nutritional quality of tuck shop items ...

    African Journals Online (AJOL)

    Method: A cross-sectional tuck shop survey. Nutritional analyses were conducted using the ... Results: Savoury pies were the most popular lunch item for all learners for both breaks (n = 5, 45%, and n = 3, 27.3%), selling the most number of units (43) per day at eight schools (72.7%). Iced popsicles were sold at almost every ...

  10. Comparing Two Inferential Approaches to Handling Measurement Error in Mixed-Mode Surveys

    Directory of Open Access Journals (Sweden)

    Buelens Bart

    2017-06-01

    Nowadays sample survey data collection strategies combine web, telephone, face-to-face, or other modes of interviewing in a sequential fashion. Measurement bias of survey estimates of means and totals is composed of different mode-dependent measurement errors, as each data collection mode has its own associated measurement error. This article contains an appraisal of two recently proposed methods of inference in this setting. The first is a calibration adjustment to the survey weights so as to balance the survey response to a prespecified distribution of the respondents over the modes. The second is a prediction method that seeks to correct measurements towards a benchmark mode. The two methods are motivated differently but at the same time coincide in some circumstances and agree in terms of required assumptions. The methods are applied to the Labour Force Survey in the Netherlands and are found to provide almost identical estimates of the number of unemployed. Each method has its own specific merits. Both can be applied easily in practice as they do not require additional data collection beyond the regular sequential mixed-mode survey, an attractive element for national statistical institutes and other survey organisations.

  11. The four Es of problem gambling: a psychological measure of risk.

    Science.gov (United States)

    Rockloff, Matthew J; Dyer, Victoria

    2006-01-01

    A focus group of Reno area Gamblers Anonymous members identified four psychological traits contributing to risk for problem gambling: Escape, Esteem, Excess and Excitement. A panel of four experts authored 240 Likert-type items to measure these traits. By design, none of the items explicitly referred to gambling activities. Study 1 narrowed the field of useful items by employing a quasi-experimental design which compared the answers of Reno area Gamblers Anonymous members (N = 39) to a control sample (N = 34). Study 2 submitted successful items, plus new items authored with the knowledge gained from Study 1, to validation in a random sample telephone survey across Queensland, Australia (N = 2577). The final 40-item Four Es scale (4Es) was reliable (alpha = .90); predicted gambling problems as measured by the Canadian Problem Gambling Index of Severity (PGSI; Ferris & Wynne, 2001, The Canadian Problem Gambling Index: Final Report, Canadian Centre on Substance Abuse); and distinguished problem gamblers from persons with alcohol abuse problems. The new scale can provide a basis for further study in harm minimization, treatment, and theory development.

  12. A confirmative clinimetric analysis of the 36-item Family Assessment Device.

    Science.gov (United States)

    Timmerby, Nina; Cosci, Fiammetta; Watson, Maggie; Csillag, Claudio; Schmitt, Florence; Steck, Barbara; Bech, Per; Thastum, Mikael

    2018-02-07

    The Family Assessment Device (FAD) is a 60-item questionnaire widely used to evaluate self-reported family functioning. However, the factor structure as well as the number of items has been questioned. A shorter and more user-friendly version of the original FAD-scale, the 36-item FAD, has therefore previously been proposed, based on findings in a nonclinical population of adults. We aimed in this study to evaluate the brief 36-item version of the FAD in a clinical population. Data from a European multinational study, examining factors associated with levels of family functioning in adult cancer patients' families, were used. Both healthy and ill parents completed the 60-item version FAD. The psychometric analyses conducted were Principal Component Analysis and Mokken-analysis. A total of 564 participants were included. Based on the psychometric analysis we confirmed that the 36-item version of the FAD has robust psychometric properties and can be used in clinical populations. The present analysis confirmed that the 36-item version of the FAD (18 items assessing 'well-being' and 18 items assessing 'dysfunctional' family function) is a brief scale where the summed total score is a valid measure of the dimensions of family functioning. This shorter version of the FAD is, in accordance with the concept of 'measurement-based care', an easy to use scale that could be considered when the aim is to evaluate self-reported family functioning.

  13. Differential item functioning of the UWES-17 in South Africa

    Directory of Open Access Journals (Sweden)

    Leanne Goliath-Yarde

    2011-11-01

    Research purpose: This study assesses the Differential Item Functioning (DIF) of the Utrecht Work Engagement Scale (UWES-17) for different South African cultural groups in a South African company. Motivation for the study: Organisations are using the UWES-17 more and more in South Africa to assess work engagement. Therefore, research evidence from psychologists or assessment practitioners on its DIF across different cultural groups is necessary. Research design, approach and method: The researchers conducted a Secondary Data Analysis (SDA) on the UWES-17 sample (n = 2429) that they obtained from a cross-sectional survey undertaken in a South African Information and Communication Technology (ICT) sector company (n = 24 134). Quantitative item data on the UWES-17 scale enabled the authors to address the research question. Main findings: The researchers found uniform and/or non-uniform DIF on five of the vigour items, four of the dedication items and two of the absorption items. This also showed possible Differential Test Functioning (DTF) on the vigour and dedication dimensions. Practical/managerial implications: Based on the DIF, the researchers suggested that organisations should not use the UWES-17 comparatively for different cultural groups or employment decisions in South Africa. Contribution/value add: The study provides evidence on DIF and possible DTF for the UWES-17. However, it also raises questions about possible interaction effects that need further investigation.

  14. A Proposal of a Method to Measure and Evaluate the Effect to Apply External Support Measures for Owners by Construction Management Method, etc

    Science.gov (United States)

    Tada, Hiroshi; Miyatake, Ichiro; Mouri, Junji; Ajiki, Norihiko; Fueta, Toshiharu

    In Japan, various approaches that draw on external resources, such as procurement support services and the construction management (CM) method, have been taken to ensure the quality of public works and to support the procurement regimes of governmental agencies. These measures for utilizing external resources (hereinafter referred to as external support measures) have been widely discussed, and follow-up surveys have shown their positive effects. However, the surveys address only the overall effect of an external support measure; the effect on each task item has not been examined, and the extent to which the measure met the client's expectations is unknown. External support measures cannot be used effectively in the future without knowing why the measure was introduced, what effect was expected on each task item, and to what extent that expectation was fulfilled. Furthermore, it is important to clarify not only the effect relative to the client's expectation (performance) but also the public benefit of the measure (value improvement). At present, there is no established method for determining the effect of a client's use of external resources. Against this background, this study takes the CM method as an example of an external support measure, proposes a method to measure and evaluate its effect for each task item, and suggests future issues and possible responses, with the aim of contributing to the promotion, improvement, and proper implementation of external support measures.

  15. The development and initial assessment of the strategy and leadership systems capability evaluation survey.

    Science.gov (United States)

    Coon, Cheryl D; Bokowy, Kay L; Horblyuk, Ruslan; Zisman, Robert S; McLeod, Lori D; Brown, T Michelle

    2012-01-01

    Hospital management and leadership systems are associated with organizational success and quality care. The Strategy and Leadership Systems Capability Evaluation (CE) survey was developed by GE Healthcare to assess management and leadership systems at health care institutions, serve as a benchmark for improvement, and measure progress. To assess the psychometric properties of the 29-item CE survey, including the factor structure, scoring algorithm, reliability, and discriminant validity, an online survey was completed by 3450 employees at 15 US hospitals. Of these employees, 609 worked at a hospital where a leadership and management intervention occurred after the initial survey administration. Data were also collected on job level, number of hospital beds, hospital ownership, location, community type, and the implementation of hospital interventions. Item response frequencies showed no floor or ceiling effects and limited missing data. Interitem correlations were strong without obvious redundancies, and factor analysis suggested a unidimensional scale. The resulting scale had strong internal consistency and was able to discriminate among known groups. The CE survey was developed to evaluate management and leadership systems at health care institutions. This study provides psychometric evidence in support of the reliability, validity, and scoring structure of this survey.

  16. Survey Development to Assess College Students' Perceptions of the Campus Environment.

    Science.gov (United States)

    Sowers, Morgan F; Colby, Sarah; Greene, Geoffrey W; Pickett, Mackenzie; Franzen-Castle, Lisa; Olfert, Melissa D; Shelnutt, Karla; Brown, Onikia; Horacek, Tanya M; Kidd, Tandalayo; Kattelmann, Kendra K; White, Adrienne A; Zhou, Wenjun; Riggsbee, Kristin; Yan, Wangcheng; Byrd-Bredbenner, Carol

    2017-11-01

    We developed and tested a College Environmental Perceptions Survey (CEPS) to assess college students' perceptions of the healthfulness of their campus. CEPS was developed in 3 stages: questionnaire development, validity testing, and reliability testing. Questionnaire development was based on an extensive literature review and input from an expert panel to establish content validity. Face validity was established with the target population using cognitive interviews with 100 college students. Concurrent-criterion validity was established with in-depth interviews (N = 30) of college students compared to surveys completed by the same 30 students. Surveys completed by college students from 8 universities (N = 1147) were used to test internal structure (factor analysis) and internal consistency (Cronbach's alpha). After development and testing, 15 items remained from the original 48 items. A 5-factor solution emerged: physical activity (4 items, α = .635), water (3 items, α = .773), vending (2 items, α = .680), healthy food (2 items, α = .631), and policy (2 items, α = .573). The mean total score for all universities was 62.71 (±11.16) on a 100-point scale. CEPS appears to be a valid and reliable tool for assessing college students' perceptions of their health-related campus environment.

  17. Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation.

    Science.gov (United States)

    Liegl, Gregor; Wahl, Inka; Berghöfer, Anne; Nolte, Sandra; Pieh, Christoph; Rose, Matthias; Fischer, Felix

    2016-03-01

    To investigate the validity of a common depression metric in independent samples. We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that were derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple, for example, using a Web application (http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures. Copyright © 2016 Elsevier Inc. All rights reserved.
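
    For readers unfamiliar with the generalized partial credit model mentioned above, the following is a minimal sketch of its category probabilities for a single item. It is not the authors' code, and the discrimination and threshold values are invented for illustration rather than taken from the common metric. The probability of category k is proportional to exp of the cumulative sum of a * (theta - b_v) over the first k thresholds, with the lowest category fixed at 0:

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Category probabilities under a generalized partial credit model.

    theta: latent trait value; a: item discrimination;
    thresholds: step parameters b_1..b_m (m + 1 response categories).
    Hypothetical helper; parameter values are supplied by the caller.
    """
    # Cumulative logits: category 0 has logit 0 by convention.
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative (made-up) parameters for a four-category item such as a PHQ-9 item.
print(gpcm_probs(theta=0.5, a=1.2, thresholds=[-1.0, 0.0, 1.0]))
```

    Using fixed item parameters from a common metric amounts to plugging published values of a and the thresholds into such a function instead of reestimating them in each sample.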

  18. Use of indicator items to monitor marine debris on a New Jersey beach from 1991 to 1996

    Science.gov (United States)

    Ribic, C.A.

    1998-01-01

    The US National Marine Debris Monitoring Program is using indicator items from beach surveys to identify whether amounts of marine debris are changing over time. Indicator items were selected through expert opinion and assumed to reflect the trend of all debris. We used monthly data from a 1991-1996 study of debris on a New Jersey beach to determine if indicator and non-indicator items showed similar trends. Total indicator debris levels did not change; this was true regardless of probable source. Non-indicator debris increased about 40% annually. Plastic non-indicator items increased regardless of whether items were whole items, cigarette filters, or pieces. Of the whole items, almost 50% were plastic lids, cups, and utensils, and about 25% were drug-related paraphernalia, tobacco-related products, plastic stirrers, pull rings, and fireworks. When indicator items are used in a monitoring programme to reflect total debris patterns, concordance of trends in indicator and non-indicator debris should be checked.

  19. General mixture item response models with different item response structures: Exposition with an application to Likert scales.

    Science.gov (United States)

    Tijmstra, Jesper; Bolsinova, Maria; Jeon, Minjeong

    2018-01-10

    This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person's responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category ("neither agree nor disagree") is taken to be qualitatively similar to the other categories, and is taken to provide information about the person's endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person's willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.

  20. A procedure for eliminating additive bias from cross-cultural survey data

    DEFF Research Database (Denmark)

    Scholderer, Joachim; Grunert, Klaus G.; Brunsø, Karen

    2005-01-01

    Measurement bias in cross-cultural surveys can seriously threaten the validity of hypothesis tests. Direct comparisons of means depend on the assumption that differences in observed variables reflect differences in the underlying constructs, and not an additive bias that may be caused by cultural differences in the understanding of item wording or response category labels. However, experience suggests that additive bias can be found more often than not. Based on the concept of partial measurement invariance (Byrne, Shavelson and Muthén 1989), the present paper develops a procedure for eliminating additive bias from cross-cultural data. The procedure involves four steps: (1) embed a potentially biased item in a factor-analytic measurement model, (2) test for the existence of additive bias between populations, (3) use the factor-analytic model to estimate the magnitude of the bias, and (4) replace...

  1. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats, such as Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all

  2. Development of a self-report physical function instrument for disability assessment: item pool construction and factor analysis.

    Science.gov (United States)

    McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M; Rasch, Elizabeth K

    2013-09-01

    To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. In-person and semistructured interviews and Internet and telephone surveys. Sample of SSA claimants (n=1017) and a normative sample of adults from the U.S. general population (n=999). Not applicable. Model fit statistics. The final item pool consisted of 139 items. Within the claimant sample, 58.7% were white; 31.8% were black; 46.6% were women; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution, which included more items and allowed separate characterization of: (1) changing and maintaining body position, (2) whole body mobility, (3) upper body function, and (4) upper extremity fine motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples, respectively, were: Comparative Fit Index=.93 and .98; Tucker-Lewis Index=.92 and .98; and root mean square error approximation=.05 and .04. The factor structure of the physical function item pool closely resembled the hypothesized content model. The 4 scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  3. Using Likert-type and ipsative/forced choice items in sequence to generate a preference.

    Science.gov (United States)

    Ried, L Douglas

    2014-01-01

    Collaboration and implementation of a minimum, standardized set of core global educational and professional competencies seems appropriate given the expanding international evolution of pharmacy practice. However, winnowing down hundreds of competencies from a plethora of local, national and international competency frameworks to select the most highly preferred to be included in the core set is a daunting task. The objective of this paper is to describe a combination of strategies used to ascertain the most highly preferred items among a large number of disparate items. In this case, the items were >100 educational and professional competencies that might be incorporated as the core components of new and existing competency frameworks. Panelists (n = 30) from the European Union (EU) and United States (USA) were chosen to reflect a variety of practice settings. Each panelist completed two electronic surveys. The first survey presented competencies in a Likert-type format and the second survey presented many of the same competencies in an ipsative/forced choice format. Item mean scores were calculated for each competency, the competencies were ranked, and non-parametric statistical tests were used to ascertain the consistency in the rankings achieved by the two strategies. This exploratory study presented over 100 competencies to the panelists in the beginning. The two methods provided similar results, as indicated by the significant correlation between the rankings (Spearman's rho = 0.30, P < 0.09). A two-step strategy using Likert-type and ipsative/forced choice formats in sequence, appears to be useful in a situation where a clear preference is required from among a large number of choices. The ipsative/forced choice format resulted in some differences in the competency preferences because the panelists could not rate them equally by design. While this strategy was used for the selection of professional educational competencies in this exploratory study, it is

  4. A Practitioner's Instrument for Measuring Secondary Mathematics Teachers' Beliefs Surrounding Learner-Centered Classroom Practice.

    Science.gov (United States)

    Lischka, Alyson E; Garner, Mary

    In this paper we present the development and validation of a Mathematics Teaching Pedagogical and Discourse Beliefs Instrument (MTPDBI), a 20-item partial-credit survey designed and analyzed using Rasch measurement theory. Items on the MTPDBI address beliefs about the nature of mathematics, teaching and learning mathematics, and classroom discourse practices. A Rasch partial credit model (Masters, 1982) was estimated from the pilot study data. Results show that item separation reliability is .96 and person separation reliability is .71. Other analyses indicate the instrument is a viable measure of secondary teachers' beliefs about reform-oriented mathematics teaching and learning. This instrument is proposed as a useful measure of teacher beliefs for those working with pre-service and in-service teacher development.

  5. What Does a Verbal Test Measure? A New Approach to Understanding Sources of Item Difficulty.

    Science.gov (United States)

    Berk, Eric J. Vanden; Lohman, David F.; Cassata, Jennifer Coyne

    Assessing the construct relevance of mental test results continues to present many challenges, and it has proven to be particularly difficult to assess the construct relevance of verbal items. This study was conducted to gain a better understanding of the conceptual sources of verbal item difficulty using a unique approach that integrates…

  6. Identifying the ‘red flags’ for unhealthy weight control among adolescents: Findings from an item response theory analysis of a national survey

    Directory of Open Access Journals (Sweden)

    Utter Jennifer

    2012-08-01

    Background: Weight control behaviors are common among young people and are associated with poor health outcomes. Yet clinicians rarely ask young people about their weight control; this may be due to uncertainty about which questions to ask, specifically around whether certain weight loss strategies are healthy or unhealthy or about which weight loss behaviors are more likely to lead to adverse outcomes. Thus, the aims of the current study are: to confirm, using item response theory analysis, that the underlying latent constructs of healthy and unhealthy weight control exist; to determine the ‘red flag’ weight loss behaviors that may discriminate unhealthy from healthy weight loss; to determine the relationships between healthy and unhealthy weight loss and mental health; and to examine how weight control may vary among demographic groups. Methods: Data were collected as part of a national health and wellbeing survey of secondary school students in New Zealand (n = 9,107) in 2007. Item response theory analyses were conducted to determine the underlying constructs of weight control behaviors and the behaviors that discriminate unhealthy from healthy weight control. Results: The current study confirms that there are two underlying constructs of weight loss behaviors which can be described as healthy and unhealthy weight control. Unhealthy weight control was positively correlated with depressive mood. Fasting and skipping meals for weight loss had the lowest item thresholds on the unhealthy weight control continuum, indicating that they act as ‘red flags’ and warrant further discussion in routine clinical assessments. Conclusions: Routine assessments of weight control strategies by clinicians are warranted, particularly for screening for meal skipping and fasting for weight loss as these behaviors appear to ‘flag’ behaviors that are associated with poor mental wellbeing.

  7. A Survey of Binary Similarity and Distance Measures

    Directory of Open Access Journals (Sweden)

    Seung-Seok Choi

    2010-02-01

    The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering, classification, etc. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Notwithstanding, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and revealed their correlations through the hierarchical clustering technique.
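
    As a small illustration of the kind of measure surveyed here, the sketch below computes the Jaccard similarity and the corresponding distance for two binary feature vectors. It covers only this one classic measure, not the 76 collected in the paper; the function names are hypothetical, and returning 1.0 for two all-zero vectors is an assumed convention rather than something the abstract specifies.

```python
def jaccard_similarity(x, y):
    """Jaccard similarity of two equal-length binary vectors.

    Counts positions where both are 1 (a) and positions where they differ
    (b + c); joint absences (0, 0) are ignored. Hypothetical helper.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    both = sum(1 for i, j in zip(x, y) if i and j)
    differ = sum(1 for i, j in zip(x, y) if bool(i) != bool(j))
    if both + differ == 0:
        # Assumed convention: two all-zero vectors are treated as identical.
        return 1.0
    return both / (both + differ)


def jaccard_distance(x, y):
    """Jaccard distance, defined as one minus the similarity."""
    return 1.0 - jaccard_similarity(x, y)


# One shared presence, two mismatches, one joint absence.
print(jaccard_similarity([1, 1, 0, 0], [1, 0, 1, 0]))
```

    Measures in this family differ mainly in how they weight joint presences, joint absences, and mismatches, which is why their results can be compared by clustering.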

  8. Psychometric evaluation and design of patient-centered communication measures for cancer care settings.

    Science.gov (United States)

    Reeve, Bryce B; Thissen, David M; Bann, Carla M; Mack, Nicole; Treiman, Katherine; Sanoff, Hanna K; Roach, Nancy; Magnus, Brooke E; He, Jason; Wagner, Laura K; Moultrie, Rebecca; Jackson, Kathryn D; Mann, Courtney; McCormack, Lauren A

    2017-07-01

    To evaluate the psychometric properties of questions that assess patient perceptions of patient-provider communication and design measures of patient-centered communication (PCC). Participants (adults with colon or rectal cancer living in North Carolina) completed a survey at 2 to 3 months post-diagnosis. The survey included 87 questions in six PCC Functions: Exchanging Information, Fostering Healing Relationships, Making Decisions, Responding to Emotions, Enabling Patient Self-Management, and Managing Uncertainty. For each Function we conducted factor analyses, item response theory modeling, and tests for differential item functioning, and assessed reliability and construct validity. Participants included 501 respondents; 46% had a high school education or less. Reliability within each Function ranged from 0.90 to 0.96. The PCC-Ca-36 (36-question survey; reliability=0.94) and PCC-Ca-6 (6-question survey; reliability=0.92) measures differentiated between individuals with poor and good health (i.e., known-groups validity) and were highly correlated with the HINTS communication scale (i.e., convergent validity). This study provides theory-grounded PCC measures found to be reliable and valid in colorectal cancer patients in North Carolina. Future work should evaluate measure validity over time and in other cancer populations. The PCC-Ca-36 and PCC-Ca-6 measures may be used for surveillance, intervention research, and quality improvement initiatives. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  9. Perception that "everything requires a lot of effort": transcultural SCL-25 item validation.

    Science.gov (United States)

    Moreau, Nicolas; Hassan, Ghayda; Rousseau, Cécile; Chenguiti, Khalid

    2009-09-01

    This brief report illustrates how the migration context can affect specific item validity of mental health measures. The SCL-25 was administered to 432 recently settled immigrants (220 Haitian and 212 Arabs). We performed descriptive analyses, as well as Infit and Outfit statistics analyses using WINSTEPS Rasch Measurement Software based on Item Response Theory. The participants' comments about the item "You feel everything requires a lot of effort" in the SCL-25 were also qualitatively analyzed. Results revealed that the item "You feel everything requires a lot of effort" is an outlier and does not adjust in an expected and valid fashion with its cluster items, as it is over-endorsed by Haitian and Arab healthy participants. Our study thus shows that, in transcultural mental health research, the cultural and migratory contexts may interact and significantly influence the meaning of some symptom items and consequently, the validity of symptom scales.

  10. Examining the Psychometric Quality of Multiple-Choice Assessment Items using Mokken Scale Analysis.

    Science.gov (United States)

    Wind, Stefanie A

    The concept of invariant measurement is typically associated with Rasch measurement theory (Engelhard, 2013). Concerned with the appropriateness of the parametric transformation upon which the Rasch model is based, Mokken (1971) proposed a nonparametric procedure for evaluating the quality of social science measurement that is theoretically and empirically related to the Rasch model. Mokken's nonparametric procedure can be used to evaluate the quality of dichotomous and polytomous items in terms of the requirements for invariant measurement. Despite these potential benefits, the use of Mokken scaling to examine the properties of multiple-choice (MC) items in education has not yet been fully explored. A nonparametric approach to evaluating MC items is promising in that this approach facilitates the evaluation of assessments in terms of invariant measurement without imposing potentially inappropriate transformations. Using Rasch-based indices of measurement quality as a frame of reference, data from an eighth-grade physical science assessment are used to illustrate and explore Mokken-based techniques for evaluating the quality of MC items. Implications for research and practice are discussed.

  11. Quantum partial search for uneven distribution of multiple target items

    Science.gov (United States)

    Zhang, Kun; Korepin, Vladimir

    2018-06-01

    The quantum partial search algorithm is an approximate search. It aims to find a target block (which contains the target items). It runs a little faster than full Grover search. In this paper, we consider the quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different numbers of target items). The algorithm we describe can locate one of the target blocks. The efficiency of the algorithm is measured by the number of queries to the oracle. We optimize the algorithm in order to improve efficiency. Using a perturbation method, we find that the algorithm runs fastest when target items are evenly distributed in the database.

  12. Measuring organisational readiness for patient engagement (MORE): an international online Delphi consensus study.

    Science.gov (United States)

    Oostendorp, Linda J M; Durand, Marie-Anne; Lloyd, Amy; Elwyn, Glyn

    2015-02-14

    Widespread implementation of patient engagement by organisations and clinical teams is not a reality yet. The aim of this study is to develop a measure of organisational readiness for patient engagement designed to monitor and facilitate a healthcare organisation's willingness and ability to effectively implement patient engagement in healthcare. The development of the MORE (Measuring Organisational Readiness for patient Engagement) scale was guided by Weiner's theory of organisational readiness for change. Weiner postulates that an organisation's readiness is determined by both the willingness and ability to implement the change (i.e. in this context: patient engagement). A first version of the scale was developed based on a literature search and evaluation of pre-existing tools. We invited multi-disciplinary stakeholders to participate in a two-round online Delphi survey. Respondents were asked to rate the importance of each proposed item, and to comment on the proposed domains and items. Second round participants received feedback from the first round and were asked to re-rate the importance of the revised, new and unchanged items, and to provide comments. The first version of the scale contained 51 items divided into three domains: (1) Respondents' characteristics; (2) the organisation's willingness to implement patient engagement; and (3) the organisation's ability to implement patient engagement. 131 respondents from 16 countries (health care managers, policy makers, clinicians, patients and patient representatives, researchers, and other stakeholders) completed the first survey, and 72 of them also completed the second survey. During the Delphi process, 34 items were reworded, 8 new items were added, 5 items were removed, and 18 were combined. The scale's instructions were revised. The final version of MORE totalled 38 items; 5 on stakeholders, 13 on an organisation's willingness to implement, and 20 on an organisation's ability to implement patient

  13. ARABIC TRANSLATION AND ADAPTATION OF THE HOSPITAL CONSUMER ASSESSMENT OF HEALTHCARE PROVIDERS AND SYSTEMS (HCAHPS) PATIENT SATISFACTION SURVEY INSTRUMENT.

    Science.gov (United States)

    Dockins, James; Abuzahrieh, Ramzi; Stack, Martin

    2015-01-01

    To translate and adapt an effective, validated, benchmarked, and widely used patient satisfaction measurement tool for use with an Arabic-speaking population. Translation of the survey's items, survey administration process development, evaluation of reliability, and international benchmarking. Three hundred-bed tertiary care hospital in Jeddah, Saudi Arabia. 645 patients discharged during 2011 from the hospital's inpatient care units. Interventions: The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) instrument was translated into Arabic, a randomized weekly sample of patients was selected, and the survey was administered via telephone during 2011 to patients or their relatives. Scores were compiled for each of the HCAHPS questions and then for each of the six HCAHPS clinical composites, two non-clinical items, and two global items. Clinical composite scores, as well as the two non-clinical and two global items, were analyzed for the 645 respondents. Clinical composites were analyzed using Spearman's correlation coefficient and Cronbach's alpha, and these items and scales demonstrated acceptable internal consistency (Spearman's correlation coefficient = 0.327 - 0.750, P quarterly to US national averages with results that closely paralleled the US benchmarks. The Arabic translation and adaptation of the HCAHPS is a valid, reliable, and feasible tool for evaluation and benchmarking of inpatient satisfaction in Arabic-speaking populations.

  14. Communicative Access Measures for Stroke: Development and Evaluation of a Quality Improvement Tool.

    Science.gov (United States)

    Kagan, Aura; Simmons-Mackie, Nina; Victor, J Charles; Chan, Melodie T

    2017-11-01

    To (1) develop a systems-level quality improvement tool targeting communicative access to information and decision-making for stroke patients with language disorders; and (2) evaluate the resulting tool, the Communicative Access Measures for Stroke (CAMS). Survey development and evaluation was in line with accepted guidelines and included item generation and reduction, survey formatting and composition, pretesting, pilot testing, and reliability assessment. Development and evaluation were carried out in hospital and community agency settings. The project used a convenience sample of 31 participants for the survey development, and 63 participants for the CAMS reliability study (broken down into 6 administrators/managers, 32 frontline staff, 25 participants with aphasia). Eligible participants invited to the reliability study included individuals from 45 community-based organizations in Ontario as well as 4400 individuals from communities of practice. Not applicable. Data were analyzed using kappa statistics and intraclass correlations for each item score on all surveys. A tool, the CAMS, comprising 3 surveys, was developed for health facilities from the perspectives of (1) administrators/policymakers, (2) staff/frontline health care providers, and (3) patients with aphasia (using a communicatively accessible version). Reliability for items on the CAMS-Administrator and CAMS-Staff surveys was moderate to high (kappa/intraclass correlation coefficients [ICCs], .54-1.00). As expected, reliability was lower for the CAMS-Patient survey, with most items having ICCs between 0.4 and 0.6. These findings suggest that CAMS may provide useful quality improvement information for health care facilities with an interest in improving care for patients with stroke and aphasia. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  15. Design of Web Questionnaires : A Test for Number of Items per Screen

    NARCIS (Netherlands)

    Toepoel, V.; Das, J.W.M.; van Soest, A.H.O.

    2005-01-01

    This paper presents results from an experimental manipulation of one versus multiple items per screen format in a Web survey. The purpose of the experiment was to find out if a questionnaire's format influences how respondents provide answers in online questionnaires and if this is depending on

  16. Nonparametric Bounds in the Presence of Item Nonresponse, Unfolding Brackets and Anchoring

    NARCIS (Netherlands)

    Vazquez-Alvarez, R.; Melenberg, B.; van Soest, A.H.O.

    2001-01-01

    Household surveys often suffer from nonresponse on variables such as income, savings or wealth. Recent work by Manski shows how bounds on conditional quantiles of the variable of interest can be derived, allowing for any type of nonrandom item nonresponse. The width between these bounds can be reduced

  17. Measuring single constructs by single items: Constructing an even shorter version of the "Short Five" personality inventory.

    Directory of Open Access Journals (Sweden)

    Kenn Konstabel

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item "Short Five" (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for the full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the

  18. Measuring the Success of a Pipeline Program to Increase Nursing Workforce Diversity.

    Science.gov (United States)

    Katz, Janet R; Barbosa-Leiker, Celestina; Benavides-Vaello, Sandra

    2016-01-01

    The purpose of this study was to understand changes in knowledge and opinions of underserved American Indian and Hispanic high school students after attending a 2-week summer pipeline program using and testing a pre/postsurvey. The research aims were to (a) psychometrically analyze the survey to determine if scale items could be summed to create a total scale score or subscale scores; (b) assess change in scores pre/postprogram; and (c) examine the survey to make suggestions for modifications and further testing to develop a valid tool to measure changes in student perceptions about going to college and nursing as a result of pipeline programs. Psychometric analysis indicated poor model fit for a 1-factor model for the total scale and the majority of subscales. Nonparametric tests indicated statistically significant increases in 13 items and decreases in 2 items. Therefore, while total scores or subscale scores cannot be used to assess changes in perceptions from pre- to postprogram, the survey can be used to examine changes over time in each item. Students did not have an accurate view of nursing and college and underestimated the support needed to attend college. However, students realized that nursing was a profession with autonomy, respect, and honor. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Dissociation between source and item memory in Parkinson's disease

    Institute of Scientific and Technical Information of China (English)

    Hu Panpan; Li Youhai; Ma Huijuan; Xi Chunhua; Chen Xianwen; Wang Kai

    2014-01-01

    Background Episodic memory includes information about item memory and source memory. Many studies support the hypothesis that these two memory systems are implemented by different brain structures. The aim of this study was to investigate the characteristics of item memory and source memory processing in patients with Parkinson's disease (PD), and to further verify the hypothesis of a dual-process model of source and item memory. Methods We established a neuropsychological battery to measure the performance of item memory and source memory. A total of 35 PD individuals and 35 matched healthy controls (HC) were administered the battery. The item memory task consists of the learning and recognition of high-frequency national Chinese characters; the source memory task consists of the learning and recognition of three modes (character, picture, and image) of objects. Results Compared with the controls, the idiopathic PD patients had impaired source memory (PD vs. HC: 0.65±0.06 vs. 0.72±0.09, P=0.001), but were not impaired in item memory (PD vs. HC: 0.65±0.07 vs. 0.67±0.08, P=0.240). Conclusions The present experiment provides evidence for dissociation between item and source memory in PD patients, thereby strengthening the claim that item and source memory rely on different brain structures. PD patients show poor source memory, in which dopamine plays a critical role.

  20. Bayesian galaxy shape measurement for weak lensing surveys - III. Application to the Canada-France-Hawaii Telescope Lensing Survey

    Science.gov (United States)

    Miller, L.; Heymans, C.; Kitching, T. D.; van Waerbeke, L.; Erben, T.; Hildebrandt, H.; Hoekstra, H.; Mellier, Y.; Rowe, B. T. P.; Coupon, J.; Dietrich, J. P.; Fu, L.; Harnois-Déraps, J.; Hudson, M. J.; Kilbinger, M.; Kuijken, K.; Schrabback, T.; Semboloni, E.; Vafaei, S.; Velander, M.

    2013-03-01

    A likelihood-based method for measuring weak gravitational lensing shear in deep galaxy surveys is described and applied to the Canada-France-Hawaii Telescope (CFHT) Lensing Survey (CFHTLenS). CFHTLenS comprises 154 deg² of multi-colour optical data from the CFHT Legacy Survey, with lensing measurements being made in the i' band to a depth i'AB < 24.7, for galaxies with signal-to-noise ratio νSN ≳ 10. The method is based on the lensfit algorithm described in earlier papers, but here we describe a full analysis pipeline that takes into account the properties of real surveys. The method creates pixel-based models of the varying point spread function (PSF) in individual image exposures. It fits PSF-convolved two-component (disc plus bulge) models to measure the ellipticity of each galaxy, with Bayesian marginalization over model nuisance parameters of galaxy position, size, brightness and bulge fraction. The method allows optimal joint measurement of multiple, dithered image exposures, taking into account imaging distortion and the alignment of the multiple measurements. We discuss the effects of noise bias on the likelihood distribution of galaxy ellipticity. Two sets of image simulations that mirror the observed properties of CFHTLenS have been created to establish the method's accuracy and to derive an empirical correction for the effects of noise bias.

  1. Developing a theoretical model and questionnaire survey instrument to measure the success of electronic health records in residential aged care.

    Science.gov (United States)

    Yu, Ping; Qian, Siyu

    2018-01-01

    Electronic health records (EHR) are introduced into healthcare organizations worldwide to improve patient safety, healthcare quality and efficiency. A rigorous evaluation of this technology is important to reduce potential negative effects on patients and staff, to provide decision makers with accurate information for system improvement and to ensure return on investment. Therefore, this study develops a theoretical model and questionnaire survey instrument to assess the success of organizational EHR in routine use from the viewpoint of nursing staff in residential aged care homes. The proposed research model incorporates six variables in the reformulated DeLone and McLean information systems success model: system quality, information quality, service quality, use, user satisfaction and net benefits. Two further variables, training and self-efficacy, were also incorporated into the model. A questionnaire survey instrument was designed to measure the eight variables in the model. After a pilot test, the measurement scale was used to collect data from 243 nursing staff members in 10 residential aged care homes belonging to three management groups in Australia. Partial least squares path modeling was conducted to validate the model. The validated EHR systems success model predicts the impact of the four antecedent variables (training, self-efficacy, system quality and information quality) on the net benefits, the indicator of EHR systems success, through the intermediate variables use and user satisfaction. A 24-item measurement scale was developed to quantitatively evaluate the performance of an EHR system. The parsimonious EHR systems success model and the measurement scale can be used to benchmark EHR systems success across organizations and units and over time.

  2. School nutritional capacity, resources and practices are associated with availability of food/beverage items in schools.

    Science.gov (United States)

    Mâsse, Louise C; de Niet, Judith E

    2013-02-19

    The school food environment is important to target as less healthful food and beverages are widely available at schools. This study examined whether the availability of specific food/beverage items was associated with a number of school environmental factors. Principals from elementary (n=369) and middle/high schools (n=118) in British Columbia (BC), Canada completed a survey measuring characteristics of the school environment. Our measurement framework integrated constructs from the Theories of Organizational Change and elements from Stillman's Tobacco Policy Framework adapted for obesity prevention. Our measurement framework included assessment of policy institutionalization of nutritional guidelines at the district and school levels, climate, nutritional capacity and resources (nutritional resources and participation in nutritional programs), nutritional practices, and school community support for enacting stricter nutritional guidelines. We used hierarchical mixed-effects logistic regression analyses to examine associations with the availability of fruit, vegetables, pizza/hamburgers/hot dogs, chocolate candy, sugar-sweetened beverages, and french fried potatoes. In elementary schools, fruit and vegetable availability was more likely among schools that have more nutritional resources (OR=6.74 and 5.23, respectively). In addition, fruit availability in elementary schools was highest in schools that participated in the BC School Fruit and Vegetable Nutritional Program and the BC Milk program (OR=4.54 and OR=3.05, respectively). In middle/high schools, having more nutritional resources was associated with vegetable availability only (OR=5.78). Finally, middle/high schools that have healthier nutritional practices (i.e., which align with upcoming provincial/state guidelines) were less likely to have the following food/beverage items available at school: chocolate candy (OR= .80) and sugar-sweetened beverages (OR= .76). School nutritional capacity, resources

  3. Random selection of items. Selection of n1 samples among N items composing a stratum

    International Nuclear Information System (INIS)

    Jaech, J.L.; Lemaire, R.J.

    1987-02-01

    STR-224 provides generalized procedures to determine required sample sizes, for instance in the course of a Physical Inventory Verification at Bulk Handling Facilities. The present report describes procedures to generate random numbers and select groups of items to be verified in a given stratum through each of the measurement methods involved in the verification. (author). 3 refs
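
    The selection step summarized in this record can be illustrated with a minimal sketch. It is not the procedure from the report itself: the function name `select_items`, the 1-based item numbering and the optional seed are assumptions made for illustration, and Python's standard pseudo-random generator stands in for whatever random-number procedure the report specifies.

```python
import random
from typing import List, Optional


def select_items(stratum_size: int, sample_size: int,
                 seed: Optional[int] = None) -> List[int]:
    """Draw sample_size distinct item numbers from a stratum of stratum_size items.

    Items are numbered 1..stratum_size (a hypothetical convention).
    Sampling is without replacement, so no item is selected twice.
    """
    if not 0 <= sample_size <= stratum_size:
        raise ValueError("sample_size must lie between 0 and stratum_size")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    chosen = rng.sample(range(1, stratum_size + 1), sample_size)
    return sorted(chosen)


if __name__ == "__main__":
    # Example: select n1 = 5 items among N = 40 items in one stratum.
    print(select_items(40, 5, seed=2024))
```

    Recording the seed alongside the selected item numbers would let a later reviewer reproduce the same draw.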

  4. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  5. Behaviors in Advance Care Planning and ACtions Survey (BACPACS): development and validation part 1.

    Science.gov (United States)

    Kassam, Aliya; Douglas, Maureen L; Simon, Jessica; Cunningham, Shannon; Fassbender, Konrad; Shaw, Marta; Davison, Sara N

    2017-11-22

    Although advance care planning (ACP) is fairly well understood, significant barriers to patient participation remain. As a result, tools to assess patient behaviour are required. The objective of this study was to improve the measurement of patient engagement in ACP by detecting existing survey design issues and establishing content and response process validity for a new survey entitled Behaviours in Advance Care Planning and ACtions Survey (BACPACS). We based our new tool on that of an existing ACP engagement survey. Initial item reduction was carried out using behavior change theories by content and design experts to help reduce response burden and clarify questions. Thirty-two patients with chronic diseases (cancer, heart failure or renal failure) were recruited for the think aloud cognitive interviewing with the new, shortened survey evaluating patient engagement with ACP. Of these, n = 27 had data eligible for analysis (n = 8 in round 1 and n = 19 in rounds 2 and 3). Interviews were audio-recorded and analyzed using the constant comparison method. Three reviewers independently listened to the interviews, summarized findings and discussed discrepancies until consensus was achieved. Item reduction from key content expert review and conversation analysis helped decrease number of items from 116 in the original ACP Engagement Survey to 24-38 in the new BACPACS depending on branching of responses. For the think aloud study, three rounds of interviews were needed until saturation for patient clarity was achieved. The understanding of ACP as a construct, survey response options, instructions and terminology pertaining to patient engagement in ACP warranted further clarification. Conversation analysis, content expert review and think aloud cognitive interviewing were useful in refining the new survey instrument entitled BACPACS. We found evidence for both content and response process validity for this new tool.

  6. Monitoring the health of transgender and other gender minority populations: validity of natal sex and gender identity survey items in a U.S. national cohort of young adults.

    Science.gov (United States)

    Reisner, Sari L; Conron, Kerith J; Tardiff, Laura Anatale; Jarvi, Stephanie; Gordon, Allegra R; Austin, S Bryn

    2014-11-26

    A barrier to monitoring the health of gender minority (transgender) populations is the lack of brief, validated tools with which to identify participants in surveillance systems. We used the Growing Up Today Study (GUTS), a prospective cohort study of U.S. young adults (mean age = 20.7 years in 2005), to assess the validity of self-report measures and implement a two-step method to measure gender minority status (step 1: assigned sex at birth, step 2: current gender identity). A mixed-methods study was conducted in 2013. Construct validity was evaluated in secondary data analysis of the 2010 wave (n = 7,831). Cognitive testing interviews of close-ended measures were conducted with a subsample of participants (n = 39). Compared to cisgender (non-transgender) participants, transgender participants had higher levels of recalled childhood gender nonconformity age gender nonconformity and were more likely to have ever identified as not completely heterosexual (p gender minority participants. Assigned sex at birth was interpreted as sex designated on a birth certificate; transgender was understood to be a difference between a person's natal sex and gender identity. Participants were correctly classified as male, female, or transgender. The survey items performed well in this sample and are recommended for further evaluation in languages other than English and with diverse samples in terms of age, race/ethnicity, and socioeconomic status.

  7. Turkish Version of the Survey of Attitudes toward Statistics: Factorial Structure Invariance by Gender

    Science.gov (United States)

    Sarikaya, Esma Emmioglu; Ok, Ahmet; Aydin, Yesim Capa; Schau, Candace

    2018-01-01

    This study examines factorial structure and the gender invariance of the Turkish version of the Survey of Attitudes toward Statistics (SATS-36). The SATS-36 has 36 items measuring six components: affect, cognitive competence, value, difficulty, effort, and interest. Data were collected from 347 university students. Results showed that the Turkish…

  8. Using Reversed MFCC and IT-EM for Automatic Speaker Verification

    Directory of Open Access Journals (Sweden)

    Sheeraz Memon

    2012-01-01

    This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Coefficients) and IT-EM (Information Theoretic Expectation Maximization). To perform speaker verification, feature extraction using the Mel scale has been widely applied and has established better results. The IMFCC is based on the inverse Mel scale. The IMFCC effectively captures information available at the high-frequency formants, which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at input level is proposed. GMMs (Gaussian Mixture Models) based on EM (Expectation Maximization) have been widely used for classification in text-independent verification. However, EM suffers from a convergence issue. In this paper we use our proposed IT-EM, which has faster convergence, to train speaker models. IT-EM uses information theory principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the weights, means and covariances, like EM. However, the IT-EM process is not performed on feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM process reduces the divergence measure between PDE estimates of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and IT-EM are tested for the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC has better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector size is also much smaller than the delta MFCC, thus reducing the computational burden as well. The IT-EM method also showed faster convergence than the conventional EM method, and thus it leads to higher speaker recognition scores.
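
    As background for the KL (Kullback-Leibler) divergence mentioned in this record, the following minimal sketch computes the divergence between two discrete probability distributions. It is not the authors' IT-EM procedure, which operates on Parzen density estimates; the function name `kl_divergence` and the list-based inputs are assumptions made for illustration.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Return D_KL(p || q) in nats for two discrete distributions.

    p and q are assumed to be probability vectors of equal length
    (non-negative entries summing to 1); this is not checked here.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q_i) is taken as 0
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


if __name__ == "__main__":
    print(kl_divergence([0.5, 0.5], [0.25, 0.75]))
```

    The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments, so the order of `p` and `q` matters.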

  9. Measuring teamwork in health care settings: a review of survey instruments.

    Science.gov (United States)

    Valentine, Melissa A; Nembhard, Ingrid M; Edmondson, Amy C

    2015-04-01

    Teamwork in health care settings is widely recognized as an important factor in providing high-quality patient care. However, the behaviors that comprise effective teamwork, the organizational factors that support teamwork, and the relationship between teamwork and patient outcomes remain empirical questions in need of rigorous study. To identify and review survey instruments used to assess dimensions of teamwork so as to facilitate high-quality research on this topic. We conducted a systematic review of articles published before September 2012 to identify survey instruments used to measure teamwork and to assess their conceptual content, psychometric validity, and relationships to outcomes of interest. We searched the ISI Web of Knowledge database, and identified relevant articles using the search terms team, teamwork, or collaboration in combination with survey, scale, measure, or questionnaire. We found 39 surveys that measured teamwork. Surveys assessed different dimensions of teamwork. The most commonly assessed dimensions were communication, coordination, and respect. Of the 39 surveys, 10 met all of the criteria for psychometric validity, and 14 showed significant relationships to nonself-report outcomes. Evidence of psychometric validity is lacking for many teamwork survey instruments. However, several psychometrically valid instruments are available. Researchers aiming to advance research on teamwork in health care should consider using or adapting one of these instruments before creating a new one. Because instruments vary considerably in the behavioral processes and emergent states of teamwork that they capture, researchers must carefully evaluate the conceptual consistency between instrument, research question, and context.

  10. Measuring teamwork and conflict among Emergency Medical Technician personnel

    Science.gov (United States)

    Patterson, P. Daniel; Weaver, Matthew D.; Weaver, Sallie J.; Rosen, Michael A.; Todorova, Gergana; Weingart, Laurie R.; Krackhardt, David; Lave, Judith R.; Arnold, Robert M.; Yealy, Donald M.; Salas, Eduardo

    2011-01-01

    Objective We sought to develop a reliable and valid tool for measuring teamwork among Emergency Medical Technician (EMT) partnerships. Methods We adapted existing scales and developed new items to measure components of teamwork. After recruiting a convenience sample of 39 agencies, we tested a 122-item draft survey tool. We performed a series of Exploratory Factor Analyses (EFA) and Confirmatory Factor Analysis (CFA) to test reliability and construct validity, describing variation in domain and global scores using descriptive statistics. Results We received 687 completed surveys. The EFA analyses identified a 9-factor solution. We labeled these factors [1] Team Orientation, [2] Team Structure & Leadership, [3] Partner Communication, Team Support, & Monitoring, [4] Partner Trust and Shared Mental Models, [5] Partner Adaptability & Back-Up Behavior, [6] Process Conflict, [7] Strong Task Conflict, [8] Mild Task Conflict, and [9] Interpersonal Conflict. We tested a short form (30-item SF) and long form (45-item LF) version. The CFA analyses determined that both the SF and LF versions possess positive psychometric properties of reliability and construct validity. The EMT-TEAMWORK-SF has positive internal consistency properties with a mean Cronbach’s alpha coefficient ≥0.70 across all 9-factors (mean=0.84; min=0.78, max=0.94). The mean Cronbach’s alpha coefficient for the EMT-TEAMWORK-LF version was 0.87 (min=0.79, max=0.94). There was wide variation in weighted scores across all 9 factors and the global score for the SF and LF versions. Mean scores were lowest for the Team Orientation factor (48.1, SD 21.5 SF; 49.3 SD 19.8 LF) and highest (more positive) for the Interpersonal Conflict factor (87.7 SD 18.1 for both SF and LF). Conclusions We developed a reliable and valid survey to evaluate teamwork between EMT partners. PMID:22128909

  11. Measuring teamwork and conflict among emergency medical technician personnel.

    Science.gov (United States)

    Patterson, P Daniel; Weaver, Matthew D; Weaver, Sallie J; Rosen, Michael A; Todorova, Gergana; Weingart, Laurie R; Krackhardt, David; Lave, Judith R; Arnold, Robert M; Yealy, Donald M; Salas, Eduardo

    2012-01-01

    We sought to develop a reliable and valid tool for measuring teamwork among emergency medical technician (EMT) partnerships. We adapted existing scales and developed new items to measure components of teamwork. After recruiting a convenience sample of 39 agencies, we tested a 122-item draft survey tool (EMT-TEAMWORK). We performed a series of exploratory factor analyses (EFAs) and confirmatory factor analysis (CFA) to test reliability and construct validity, describing variation in domain and global scores using descriptive statistics. We received 687 completed surveys. The EFAs identified a nine-factor solution. We labeled these factors 1) Team Orientation, 2) Team Structure & Leadership, 3) Partner Communication, Team Support, & Monitoring, 4) Partner Trust and Shared Mental Models, 5) Partner Adaptability & Back-Up Behavior, 6) Process Conflict, 7) Strong Task Conflict, 8) Mild Task Conflict, and 9) Interpersonal Conflict. We tested a short-form (30-item SF) and long-form (45-item LF) version. The CFAs determined that both the SF and the LF possess positive psychometric properties of reliability and construct validity. The EMT-TEAMWORK-SF has positive internal consistency properties, with a mean Cronbach's alpha coefficient ≥0.70 across all nine factors (mean = 0.84; minimum = 0.78, maximum = 0.94). The mean Cronbach's alpha coefficient for the EMT-TEAMWORK-LF was 0.87 (minimum = 0.79, maximum = 0.94). There was wide variation in weighted scores across all nine factors and the global score for the SF and LF. Mean scores were lowest for the Team Orientation factor (48.1, standard deviation [SD] 21.5, SF; 49.3, SD 19.8, LF) and highest (more positive) for the Interpersonal Conflict factor (87.7, SD 18.1, for both SF and LF). We developed a reliable and valid survey to evaluate teamwork between EMT partners.
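
    Several records in this list report Cronbach's alpha as a measure of internal consistency. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), is given below. The function name `cronbach_alpha` and the respondents-by-items layout are assumptions made for illustration; the example data are invented and are not taken from the EMT-TEAMWORK study.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Compute Cronbach's alpha for a respondents-by-items score table.

    scores[r][i] is the score of respondent r on item i. Sample
    variances (n - 1 denominator) are used throughout.
    """
    n_respondents = len(scores)
    n_items = len(scores[0])
    if n_respondents < 2 or n_items < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    # Variance of each item across respondents (columns of the table).
    item_variances: List[float] = [variance(col) for col in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Invented responses from four respondents to three items.
    example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]]
    print(round(cronbach_alpha(example), 3))
```

    Values near 1 indicate that the items vary together; the ≥0.70 threshold quoted in the abstract is a commonly used rule of thumb rather than a property of the formula.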

  12. A study of the psychometric properties of 12-item World Health Organization Disability Assessment Schedule 2.0 in a large population of people with chronic musculoskeletal pain.

    Science.gov (United States)

    Saltychev, Mikhail; Bärlund, Esa; Mattie, Ryan; McCormick, Zachary; Paltamaa, Jaana; Laimi, Katri

    2017-02-01

    To assess the validity of the Finnish translation of the 12-item World Health Organization Disability Assessment Schedule (WHODAS 2.0). Cross-sectional cohort survey study. Physical and Rehabilitation Medicine outpatient university clinic. The 501 consecutive patients with chronic musculoskeletal pain. Exploratory factor analysis and a graded response model using item response theory analysis were used to assess the constructs and discrimination ability of WHODAS 2.0. The exploratory factor analysis revealed two retained factors with eigenvalues 5.15 and 1.04. Discrimination ability of all items was high or perfect, varying from 1.2 to 2.5. The difficulty levels of seven out of 12 items were shifted towards the elevated disability level. As a result, the entire test characteristic curve showed a shift towards higher levels of disability, placing it at the point of disability level of +1 (where 0 indicates the average level of disability within the sample). The present data indicate that the Finnish translation of the 12-item WHODAS 2.0 is a valid instrument for measuring restrictions of activity and participation among patients with chronic musculoskeletal pain.

  13. Measurement Properties of the Psoriasis Symptom Inventory Electronic Daily Diary in Patients with Moderate to Severe Plaque Psoriasis.

    Science.gov (United States)

    Viswanathan, Hema N; Mutebi, Alex; Milmont, Cassandra E; Gordon, Kenneth; Wilson, Hilary; Zhang, Hao; Klekotka, Paul A; Revicki, Dennis A; Augustin, Matthias; Kricorian, Gregory; Nirula, Ajay; Strober, Bruce

    2017-09-01

    The Psoriasis Symptom Inventory (PSI) is a patient-reported outcome instrument that measures the severity of psoriasis signs and symptoms. This study evaluated measurement properties of the PSI in patients with moderate to severe plaque psoriasis. This secondary analysis used pooled data from a phase 3 brodalumab clinical trial (AMAGINE-1). Outcome measures included the PSI, Psoriasis Area and Severity Index (PASI), static Physician's Global Assessment (sPGA), psoriasis-affected body surface area, 36-item Short-Form Health Survey version 2, and the Dermatology Life Quality Index (DLQI). The PSI was evaluated for dimensionality, item performance, reliability (internal consistency and test-retest), construct validity, ability to detect change, and agreement between PSI response and response measures based on the PASI, sPGA, and DLQI. Results supported unidimensionality, good item fit, ordered responses, and PSI scoring. The PSI demonstrated reliability: baseline Cronbach's alpha ≥ 0.92 and intraclass correlation coefficients ≥ 0.95. Correlations between PSI total score and DLQI item 1 (r = 0.86), DLQI symptoms and feelings (r = 0.87), and 36-item Short-Form Health Survey version 2 bodily pain (r = -0.61) supported convergent validity. PSI scores differed significantly (P 10%), and DLQI (≤ 5/> 5) at weeks 8 and 12. At week 12, the PSI detected significant changes in severity based on PASI responses (psoriasis signs and symptoms. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  14. Using a Multivariate Multilevel Polytomous Item Response Theory Model to Study Parallel Processes of Change: The Dynamic Association between Adolescents' Social Isolation and Engagement with Delinquent Peers in the National Youth Survey

    Science.gov (United States)

    Hsieh, Chueh-An; von Eye, Alexander A.; Maier, Kimberly S.

    2010-01-01

    The application of multidimensional item response theory models to repeated observations has demonstrated great promise in developmental research. It allows researchers to take into consideration both the characteristics of item response and measurement error in longitudinal trajectory analysis, which improves the reliability and validity of the…

  15. Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

    Science.gov (United States)

    Yang, Ji Seung; Hansen, Mark; Cai, Li

    2012-01-01

    Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…

  16. Developing Item Response Theory-Based Short Forms to Measure the Social Impact of Burn Injuries.

    Science.gov (United States)

    Marino, Molly E; Dore, Emily C; Ni, Pengsheng; Ryan, Colleen M; Schneider, Jeffrey C; Acton, Amy; Jette, Alan M; Kazis, Lewis E

    2018-03-01

    To develop self-reported short forms for the Life Impact Burn Recovery Evaluation (LIBRE) Profile. Short forms based on the item parameters of discrimination and average difficulty. A support network for burn survivors, peer support networks, social media, and mailings. Burn survivors (N=601) older than 18 years. Not applicable. The LIBRE Profile. Ten-item short forms were developed to cover the 6 LIBRE Profile scales: Relationships with Family & Friends, Social Interactions, Social Activities, Work & Employment, Romantic Relationships, and Sexual Relationships. Ceiling effects were ≤15% for all scales; floor effects were item bank, computerized adaptive test, and short forms are all scored along the same metric, and therefore scores are comparable regardless of the mode of administration. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  17. Dimensions of insight in schizophrenia: Exploratory factor analysis of items from multiple self- and interviewer-rated measures of insight.

    Science.gov (United States)

    Konsztowicz, Susanna; Schmitz, Norbert; Lepage, Martin

    2018-03-10

    Insight in schizophrenia is regarded as a multidimensional construct that comprises aspects such as awareness of the disorder and recognition of the need for treatment. The proposed number of underlying dimensions of insight is variable in the literature. In an effort to identify a range of existing dimensions of insight, we conducted a factor analysis on combined items from multiple measures of insight. We recruited 165 participants with enduring schizophrenia (treated for >3 years). Exploratory factor analysis was conducted on itemized scores from two interviewer-rated measures of insight: the Schedule for the Assessment of Insight-Expanded and the abbreviated Scale to assess Unawareness of Mental Disorder; and two self-report measures: the Birchwood Insight Scale and the Beck Cognitive Insight Scale. A five-factor solution was selected as the best-fitting model, with the following dimensions of insight: 1) awareness of illness and the need for treatment; 2) awareness and attribution of symptoms and consequences; 3) self-certainty; 4) self-reflectiveness for objectivity and fallibility; and 5) self-reflectiveness for errors in reasoning and openness to feedback. Insight in schizophrenia is a multidimensional construct composed of distinct clinical and cognitive domains of awareness. Multiple measures of insight, both clinician- and self-rated, are needed to capture all of the existing dimensions of insight. Future exploration of associations between the various dimensions and their potential determinants will facilitate the development of clinically useful models of insight and effective interventions to improve outcome. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Development of a tool to measure person-centered maternity care in developing settings: validation in a rural and urban Kenyan population.

    Science.gov (United States)

    Afulani, Patience A; Diamond-Smith, Nadia; Golub, Ginger; Sudhinaraset, May

    2017-09-22

    Person-centered reproductive health care is recognized as critical to improving reproductive health outcomes. Yet, little research exists on how to operationalize it. We extend the literature in this area by developing and validating a tool to measure person-centered maternity care. We describe the process of developing the tool and present the results of psychometric analyses to assess its validity and reliability in a rural and urban setting in Kenya. We followed standard procedures for scale development. First, we reviewed the literature to define our construct and identify domains, and developed items to measure each domain. Next, we conducted expert reviews to assess content validity, and cognitive interviews with potential respondents to assess clarity, appropriateness, and relevance of the questions. The questions were then refined and administered in surveys, and the survey results were used to assess construct and criterion validity and reliability. The exploratory factor analysis yielded one dominant factor in both the rural and urban settings. Three factors with eigenvalues greater than one were identified for the rural sample and four factors for the urban sample. Thirty of the 38 items administered in the survey were retained based on the factor loadings and the correlations between the items. Twenty-five items load very well onto a single factor in both the rural and urban samples, with five items loading well in either the rural or the urban sample, but not in both. These 30 items also load on three sub-scales that we created to measure dignified and respectful care, communication and autonomy, and supportive care. The Cronbach's alpha for the main scale is greater than 0.8 in both samples, and those for the sub-scales are between 0.6 and 0.8. The main scale and sub-scales are correlated with global measures of satisfaction with maternity services, suggesting criterion validity. We present a 30-item scale with three sub-scales to measure person

  19. Development and reliability testing of a self-report instrument to measure the office layout as a correlate of occupational sitting

    Directory of Open Access Journals (Sweden)

    Duncan Mitch J

    2013-02-01

    Full Text Available Abstract Background Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans, which are not readily available in large population-based studies or are otherwise unavailable. Therefore a self-report instrument to assess spatial configurations of office environments using four scales was developed. Methods The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability, a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach’s α was used to evaluate internal consistency, and intraclass correlation coefficients (retest-ICC) to evaluate test-retest reliability. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration of and frequency of breaks in occupational sitting. Results The number of items on all scales was reduced; Cronbach’s α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were

  20. Comparison of Alternate and Original Items on the Montreal Cognitive Assessment.

    Science.gov (United States)

    Lebedeva, Elena; Huang, Mei; Koski, Lisa

    2016-03-01

    The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. None of the five items from the alternate versions matched the difficulty level of their corresponding original items. This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time.
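    The role of item difficulty in this record can be illustrated with a minimal sketch of the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty (both in logits). The function name and all numeric values below are hypothetical illustrations, not estimates from the study.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    P(correct) = exp(ability - difficulty) / (1 + exp(ability - difficulty))
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical difficulties (logits) for an original item and its alternate.
original_difficulty = 0.0
alternate_difficulty = 1.0
ability = 0.5  # hypothetical person ability

p_original = rasch_probability(ability, original_difficulty)
p_alternate = rasch_probability(ability, alternate_difficulty)
print(f"original: {p_original:.3f}, alternate: {p_alternate:.3f}")
```

    Under these assumed values the same person is less likely to pass the harder alternate item, so a score change between forms could reflect the items rather than the person.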

  1. Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

    Science.gov (United States)

    Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

    2017-06-15

    Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.

  2. Improving a measure of mobility-related fatigue (the mobility-tiredness scale) by establishing item intensity

    DEFF Research Database (Denmark)

    Fieo, Robert A; Mortensen, Erik L; Rantanen, Taina

    2013-01-01

    To improve the construct validity of self-reported fatigue by establishing a formal hierarchy of scale items and to determine whether such a hierarchy could be maintained across time (aged 75-80), sex, and nationality.

  3. Validation of a 10-item care-related regret intensity scale (RIS-10) for health care professionals.

    Science.gov (United States)

    Courvoisier, Delphine S; Cullati, Stéphane; Haller, Chiara S; Schmidt, Ralph E; Haller, Guy; Agoritsas, Thomas; Perneger, Thomas V

    2013-03-01

    Regret after one of the many decisions and interventions that health care professionals make every day can have an impact on their own health and quality of life, and on their patient care practices. To validate a new care-related regret intensity scale (RIS) for health care professionals. Retrospective cross-sectional cohort study with a 1-month follow-up (test-retest) in a French-speaking University Hospital. A total of 469 nurses and physicians responded to the survey, and 175 answered the retest. RIS, self-report questions on the context of the regret-inducing event, its consequences for the patient, involvement of the health care professionals, and changes in patient care practices after the event. We measured the impact of regret intensity on health care professionals with the satisfaction with life scale, the SF-36 first question (self-reported health), and a question on self-esteem. On the basis of factor analysis and item response analysis, the initial 19-item scale was shortened to 10 items. The resulting scale (RIS-10) was unidimensional and had high internal consistency (α=0.87) and acceptable test-retest reliability (0.70). Higher regret intensity was associated with (a) more consequences for the patient; (b) lower life satisfaction and poorer self-reported health in health care professionals; and (c) changes in patient care practices. Nurses reported analyzing the event and apologizing, whereas physicians reported talking preferentially to colleagues, rather than to their supervisor, about changing practices. The RIS is a valid and reliable measure of care-related regret intensity for hospital-based physicians and nurses.
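    The internal consistency statistic reported in this record (α) is Cronbach's alpha. A minimal sketch of the usual formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores), is given below using only the Python standard library. The function name and the response matrix are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same number of items")
    item_columns = list(zip(*scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 3 respondents x 2 items.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

    Values closer to 1 indicate that the items vary together; the sketch does not handle missing responses or a total-score variance of zero.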

  4. Item Difficulty in the Evaluation of Computer-Based Instruction: An Example from Neuroanatomy

    Science.gov (United States)

    Chariker, Julia H.; Naaz, Farah; Pani, John R.

    2012-01-01

    This article reports large item effects in a study of computer-based learning of neuroanatomy. Outcome measures of the efficiency of learning, transfer of learning, and generalization of knowledge diverged by a wide margin across test items, with certain sets of items emerging as particularly difficult to master. In addition, the outcomes of…

  5. Validation of a survey instrument to assess home environments for physical activity and healthy eating in overweight children

    Directory of Open Access Journals (Sweden)

    Crane Lori A

    2008-01-01

    Full Text Available Abstract Background Few measures exist to assess the overall home environment for its ability to support physical activity (PA) and healthy eating in overweight children. The purpose of this study was to develop and test the reliability and validity of such a measure. Methods The Home Environment Survey (HES) was developed to reflect availability, accessibility, parental role modelling, and parental policies related to PA resources, fruits and vegetables (F&V), and sugar sweetened drinks and snacks (SS). Parents of overweight children (n = 219) completed the HES and concurrent behavioural assessments. Children completed the Block Kids survey and wore an accelerometer for one week. A subset of parents (n = 156) completed the HES a second time to determine test-retest reliability. Finally, 41 parent dyads living in the same home (n = 41) completed the survey to determine inter-rater reliability. Initial psychometric analyses were completed to trim items from the measure based on lack of variability in responses, moderate or higher item to scale correlation, or contribution to strong internal consistency. Inter-rater and test-retest reliability were completed using intraclass correlation coefficients. Validity was assessed using Pearson correlations between the HES scores and child and parent nutrition and PA. Results Eight items were removed and acceptable internal consistency was documented for all scales (α = .66–.84) with the exception of F&V accessibility. The F&V accessibility scale was reduced to a single item because the other two items did not meet reliability standards. Test-retest reliability was high (r > .75) for all scales. Inter-rater reliability varied across scales (r = .22–.89). PA accessibility, parent role modelling, and parental policies were all related significantly to child (r = .14–.21) and parent (r = .15–.31) PA. Similarly, availability of F&V and SS, parental role modelling, and parental policies were related to child (r

  6. The Measurement Invariance of the Student Opinion Survey across English and non-English Language Learner Students within the Context of Low- and High-Stakes Assessments

    Directory of Open Access Journals (Sweden)

    Jason C. Immekus

    2016-09-01

    Full Text Available Student effort on large-scale assessments has important implications for the interpretation and use of scores to guide decisions. Within the United States, English Language Learners (ELLs) generally are outperformed on large-scale assessments by non-ELLs, prompting research to examine factors associated with test performance. There is a gap in the literature regarding the test-taking motivation of ELLs compared to non-ELLs and whether existing measures have similar psychometric properties across groups. The Student Opinion Survey (SOS; Sundre, 2007) was designed to be administered after completion of a large-scale assessment to operationalize students’ test-taking motivation. Based on data obtained on 5,257 (41.8% ELL) 10th grade students, the study purpose was to test the measurement invariance of the SOS across ELLs and non-ELLs based on completion of low- and high-stakes assessments. Preliminary item analyses supported the removal of two SOS items (Items 3 and 7), which resulted in improved internal consistency for each of the two SOS subscales: Importance and Effort. A subsequent multi-sample confirmatory factor analysis (MCFA) supported the measurement invariance of the scale’s two-factor model across language groups, indicating it met strict factorial invariance (Meredith, 1993). A follow-up latent means analysis found that ELLs had higher effort on both the low- and high-stakes assessment with a small effect size. Effect size estimates indicated negligible differences on the importance factor. Although the instrument can be expected to function similarly across diverse language groups, which may have direct utility for test users and for research into factors associated with large-scale test performance, continued research is recommended. Implications for SOS use in applied and research settings are discussed.

  7. 41 CFR 101-28.306-6 - Sensitive items.

    Science.gov (United States)

    2010-07-01

    ... Regulations System FEDERAL PROPERTY MANAGEMENT REGULATIONS SUPPLY AND PROCUREMENT 28-STORAGE AND DISTRIBUTION... accountable item of personal property. Each customer activity shall take all appropriate measures necessary to... Government use. ...

  8. For data's sake: dilemmas in the measurement of gender minorities.

    Science.gov (United States)

    Glick, Jennifer L; Theall, Katherine; Andrinopoulos, Katherine; Kendall, Carl

    2018-03-13

    Gender-minority health disparity research is limited by binary gender measurement practices. This study seeks to broaden current discourse on gender identity measurement in the USA, including measurement adoption challenges and mitigation strategies, thereby allowing for better data collection to understand and address health disparities for people of all genders. Three data sources were used to triangulate findings: expert interviews with gender and sexuality research leaders; key-informant interviews with gender minorities in New Orleans, LA; and document analysis of relevant surveys, guides and commentaries. Ten key dilemmas were identified: 1) moving beyond binary gender construction; 2) conflation of gender, sex and sexual orientation; 3) emerging nature of gender-related language; 4) concerns about item sensitivity; 5) research fatigue among gender minorities; 6) design and analytical limitations; 7) categorical and procedural consistency; 8) pre-populated vs. open-field survey items; 9) potential misclassification; and 10) competing data collection needs. Researchers must continue working toward consensus concerning better practices in gender measurement and be explicit about their methodological choices. The existence of these dilemmas must not impede research on important health issues affecting gender minorities.

  9. A survey of resilience, burnout, and tolerance of uncertainty in Australian general practice registrars

    Directory of Open Access Journals (Sweden)

    Cooke Georga PE

    2013-01-01

    Full Text Available Abstract Background Burnout and intolerance of uncertainty have been linked to low job satisfaction and lower quality patient care. While resilience is related to these concepts, no study has examined these three concepts in a cohort of doctors. The objective of this study was to measure resilience, burnout, compassion satisfaction, personal meaning in patient care and intolerance of uncertainty in Australian general practice (GP) registrars. Methods We conducted a paper-based cross-sectional survey of GP registrars in Australia from June to July 2010, recruited from a newsletter item or registrar education events. Survey measures included the Resilience Scale-14, a single-item scale for burnout, Professional Quality of Life (ProQOL) scale, Personal Meaning in Patient Care scale, Intolerance of Uncertainty-12 scale, and Physician Response to Uncertainty scale. Results 128 GP registrars responded (response rate 90%). Fourteen percent of registrars were found to be at risk of burnout using the single-item scale for burnout, but none met the criteria for burnout using the ProQOL scale. Secondary traumatic stress, general intolerance of uncertainty, anxiety due to clinical uncertainty and reluctance to disclose uncertainty to patients were associated with being at higher risk of burnout, but sex, age, practice location, training duration, years since graduation, and reluctance to disclose uncertainty to physicians were not. Only ten percent of registrars had high resilience scores. Resilience was positively associated with compassion satisfaction and personal meaning in patient care. Resilience was negatively associated with burnout, secondary traumatic stress, inhibitory anxiety, general intolerance to uncertainty, concern about bad outcomes and reluctance to disclose uncertainty to patients. Conclusions GP registrars in this survey showed a lower level of burnout than in other recent surveys of the broader junior doctor population in both Australia

  10. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  11. A one-item workability measure mediates work demands, individual resources and health in the prediction of sickness absence.

    Science.gov (United States)

    Thorsen, Sannie Vester; Burr, Hermann; Diderichsen, Finn; Bjorner, Jakob Bue

    2013-10-01

    The study tested the hypothesis that a one-item workability measure represented an assessment of the fit between resources (the individuals' physical and mental health and functioning) and workplace demands and that this resource/demand fit was a mediator in the prediction of sickness absence. We also estimated the relative importance of health and work environment for workability and sickness absence. Baseline data were collected within a Danish work and health survey (3,214 men and 3,529 women) and followed up in a register of sickness absence. Probit regression analysis with workability as mediator was performed for a binary outcome of sickness absence. The predictors in the analysis were as follows: age, social class, physical health, mental health, number of diagnoses, ergonomic exposures, occupational noise, exposure to risks, social support from supervisor, job control and quantitative demands. High age, poor health and ergonomic exposures were associated with low workability and mediated by workability to sickness absence for both genders. Low social class and low quantitative demands were associated with low workability and mediated to sickness absence among men. The mediated part was from 11 to 63 % of the total effect for the significant predictors. Workability mediated health, age, social class and ergonomic exposures in the prediction of sickness absence. The health predictors had the highest association with both workability and sickness absence; physical work environment was more strongly associated with the outcomes than psychosocial work environment. However, the explanatory value of the predictors for the variance in the model was low.

  12. Fiscal 1998 survey report. Survey on method of environmental-impact assessment in wind power development; 1998 nendo furyoku kaihatsu ni okeru kankyo eikyo hyoka shuho chosa hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-01

    With the purpose of preparing the main points of 'environmental-impact assessment guidelines for wind power developments (draft)', examples of environmental-impact assessments and related laws and regulations in Japan and abroad were collected and organized with respect to assessment requirements, concrete procedures, survey/projection/assessment methods, summarisation of results, etc. The survey clarified, for example, that a large-scale wind power development can be handled by choosing items and contents on the assumption that a land area is developed; that in a small-scale development there is basically no need to consider the possible effect of the construction work; and that, as far as noise, vibration and the ecosystem (plants/animals) are concerned, the characteristics of the site should nevertheless be taken into consideration. Objects for general assessment are noise, low-frequency air vibration, radio wave interference, the ecosystem (plants and animals) and the landscape. The guideline draft consists of (1) basic items, (2) overview of the area, (3) determination of items for environmental-impact assessment and (4) research, prediction, assessment, conservation measures and follow-up research; in the basic items, the importance of preliminary consideration was emphasized, as were priority/simplification, implementation of environmental conservation measures, and implementation of follow-up research. (NEDO)

  14. Patient experience and satisfaction with inpatient service: development of short form survey instrument measuring the core aspect of inpatient experience.

    Directory of Open Access Journals (Sweden)

    Eliza L Y Wong

    Full Text Available Patient experience reflects quality of care from the patients' perspective; therefore, patients' experiences are important data in the evaluation of the quality of health services. The development of an abbreviated, reliable and valid instrument for measuring inpatients' experience would reflect the key aspect of inpatient care from patients' perspective as well as facilitate quality improvement by cultivating patient engagement and allow the trends in patient satisfaction and experience to be measured regularly. The study developed a short-form inpatient instrument and tested its ability to capture a core set of inpatients' experiences. The Hong Kong Inpatient Experience Questionnaire (HKIEQ) was established in 2010; it is an adaptation of the General Inpatient Questionnaire of the Care Quality Commission created by the Picker Institute in the United Kingdom. This study used a consensus conference and a cross-sectional validation survey to create and validate a short form of the Hong Kong Inpatient Experience Questionnaire (SF-HKIEQ). The short form, the SF-HKIEQ, consisted of 18 items derived from the HKIEQ. The 18 items mainly covered relational aspects of care under four dimensions of the patient's journey: hospital staff, patient care and treatment, information on leaving the hospital, and overall impression. The SF-HKIEQ had a high degree of face validity, construct validity and internal reliability. The validated SF-HKIEQ reflects the relevant core aspects of inpatients' experience in a hospital setting. It provides a quick reference tool for quality improvement purposes and a platform that allows both healthcare staff and patients to monitor the quality of hospital care over time.

  15. Discussion on monitoring items of radionuclides in effluents from nuclear power plants

    International Nuclear Information System (INIS)

    Zhang Yanxia; Li Jin; Liu Jiacheng; Han Shanbiao; Yu Zhengwei

    2014-01-01

    This paper compares and analyzes the radionuclide monitoring items for effluents from nuclear power plants from three aspects: the international atomic energy general requirements, the routine radionuclide measurement items of China's nuclear power plants, and experimental research results on low-level radionuclides in effluents. Finally, it summarizes the necessary items and recommended items for radionuclide monitoring of nuclear power plant effluents, which can serve as a reference for effluent radioactivity monitoring at nuclear power plants and for supervision by regulatory departments. (authors)

  16. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  17. Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

    Science.gov (United States)

    Cher Wong, Cheow

    2015-01-01

    Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

  18. Quality of Life Assessment for Physical Activity and Health Promotion: Further Psychometrics and Comparison of Measures

    Science.gov (United States)

    Gill, Diane L.; Reifsteck, Erin J.; Adams, Melanie M.; Shang, Ya-Ting

    2015-01-01

    Despite the clear relationship between physical activity and quality of life, few sound, relevant quality of life measures exist. Gill and colleagues developed a 32-item quality of life survey, and provided initial psychometric evidence. This study further examined that quality of life survey in comparison with the widely used short form (SF-36)…

  19. Reliability of a computer and Internet survey (Computer User Profile) used by adults with and without traumatic brain injury (TBI).

    Science.gov (United States)

    Kilov, Andrea M; Togher, Leanne; Power, Emma

    2015-01-01

    To determine test-retest reliability of the 'Computer User Profile' (CUP) in people with and without TBI. The CUP was administered on two occasions to people with and without TBI. The CUP investigated the nature and frequency of participants' computer and Internet use. Intra-class correlation coefficients and kappa coefficients were calculated to measure reliability of individual CUP items. Descriptive statistics were used to summarize content of responses. Sixteen adults with TBI and 40 adults without TBI were included in the study. All participants were reliable in reporting demographic information, frequency of social communication and leisure activities and computer/Internet habits and usage. Adults with TBI were reliable in 77% of their responses to survey items. Adults without TBI were reliable in 88% of their responses to survey items. The CUP was practical and valuable in capturing information about social, leisure, communication and computer/Internet habits of people with and without TBI. Adults without TBI scored more items with satisfactory reliability overall in their surveys. Future studies may include larger samples and could also include an exploration of how people with/without TBI use other digital communication technologies. This may provide further information on determining technology readiness for people with TBI in therapy programmes.

  20. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  1. A Comprehensive List of Items to be Included on a Pediatric Drug Monograph.

    Science.gov (United States)

    Kelly, Lauren E; Ito, Shinya; Woods, David; Nunn, Anthony J; Taketomo, Carol; de Hoog, Matthijs; Offringa, Martin

    2017-01-01

    Children require special considerations for drug prescribing. Drug information summarized in a formulary containing drug monographs is essential for safe and effective prescribing. Currently, little is known about the information needs of those who prescribe and administer medicines to children. Our primary objective was to identify a list of important and relevant items to be included in a pediatric drug monograph. Following the establishment of an expert steering committee and an environmental scan of adult and pediatric formulary monograph items, 46 participants from 25 countries were invited to complete a 2-round Delphi survey. Questions regarding source of prescribing information and importance of items were recorded. An international consensus meeting to vote on and finalize the items list with the steering committee followed. Pediatric formularies are most commonly the first resource consulted for information on medication used in children by 31 Delphi participants. After the Delphi rounds, 116 items were identified to be included in a comprehensive pediatric drug monograph, including general information, adverse drug reactions, dosages, precautions, drug-drug interactions, formulation, and drug properties. Health care providers identified 116 monograph items as important for prescribing medicines for children by an international consensus-based process. This information will assist in setting standards for the creation of new pediatric drug monographs for international application and for those involved in pediatric formulary development.

  2. Development of the Chicago Food Allergy Research Surveys: assessing knowledge, attitudes, and beliefs of parents, physicians, and the general public

    Directory of Open Access Journals (Sweden)

    Pongracic Jacqueline A

    2009-08-01

    Full Text Available Abstract Background Parents of children with food allergy, primary care physicians, and members of the general public play a critical role in the health and well-being of food-allergic children, though little is known about their knowledge and perceptions of food allergy. The purpose of this paper is to detail the development of the Chicago Food Allergy Research Surveys to assess food allergy knowledge, attitudes, and beliefs among these three populations. Methods From 2006–2008, parents of food-allergic children, pediatricians, family physicians, and adult members of the general public were recruited to assist in survey development. Preliminary analysis included literature review, creation of initial content domains, expert panel review, and focus groups. Survey validation included creation of initial survey items, expert panel ratings, cognitive interviews, reliability testing, item reduction, and final validation. National administration of the surveys is ongoing. Results Nine experts were assembled to oversee survey development. Six focus groups were held: 2/survey population, 4–9 participants/group; transcripts were reviewed via constant comparative methods to identify emerging themes and inform item creation. At least 220 participants per population were recruited to assess the relevance, reliability, and utility of each survey item as follows: cognitive interviews, 10 participants; reliability testing ≥ 10; item reduction ≥ 50; and final validation, 150 respondents. Conclusion The Chicago Food Allergy Research surveys offer validated tools to assess food allergy knowledge and perceptions among three distinct populations: a 42 item parent tool, a 50 item physician tool, and a 35 item general public tool. No such tools were previously available.

  3. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    Science.gov (United States)

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  4. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  5. SU-F-T-244: Radiotherapy Risk Estimation Based On Expert Group Survey

    International Nuclear Information System (INIS)

    Koo, J; Yoon, M; Chung, W; Chung, M; Kim, D

    2016-01-01

    Purpose: To evaluate the reliability of the RPN (Risk Priority Number) assigned by an expert group and to provide preliminary data for adapting FMEA in Korea. Methods: 1163 incidents reported in ROSIS over 11 years were used as real data for comparison and were categorized into 146 items. The questionnaire was composed of the 146 items, and respondents had to rate the ‘occurrence (O)’, ‘severity (S)’, and ‘detectability (D)’ of each item on a scale from 1 to 10 according to the proposed AAPM TG-100 rating scales. 19 medical physicists from 19 different organizations in Korea participated in the survey. Because the ROSIS items were not spread evenly enough to be classified into 10 grades, a 1–5 scale was chosen instead of 1–10, and the survey results were also fit to 5 grades for comparison. Results: The average O, S, and D were 1.77, 3.50, and 2.13, respectively, and the item with the highest RPN (32) in the survey was ‘patient movement during treatment’. When comparing items ranked in the top 10 of the survey (O) and the ROSIS database, two items were duplicated, and ‘Simulation’ and ‘Treatment’ were the most frequently ranked RT processes in the top 10 of the survey and ROSIS, respectively. The Cronbach α of each RT process ranged from 0.74 to 0.99, and the p-value was <0.001. When comparing O*D, the average difference was 1.4. Conclusion: This work indicates a deviation between actual risk and expectation. Considering that the respondents were Korean, that ROSIS is mainly composed of incidents that happened in European countries, and that some of the top 10 ROSIS items cannot be applied to radiotherapy procedures in Korea, the deviation could have come from procedural differences. Moreover, if the expert group had consisted of experts from various fields, the expectation might have been more accurate. Therefore, further research on radiotherapy risk estimation is needed.
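
    The RPN used in this record is conventionally the product of the three ratings, RPN = O × S × D. The following is a minimal sketch of that calculation on the 1–5 scale the abstract describes. All ratings below are hypothetical illustrations (including the O, S, D split chosen for 'patient movement during treatment'); the abstract reports only the resulting RPN of 32, not the individual ratings.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One questionnaire item with its three FMEA ratings (1-5 scale assumed)."""
    name: str
    occurrence: int      # O: how often the failure is expected
    severity: int        # S: how harmful the failure would be
    detectability: int   # D: how hard the failure is to detect

    def __post_init__(self) -> None:
        for value in (self.occurrence, self.severity, self.detectability):
            if not 1 <= value <= 5:
                raise ValueError("ratings must lie on the 1-5 scale")

    def rpn(self) -> int:
        """Risk Priority Number as the product O * S * D."""
        return self.occurrence * self.severity * self.detectability


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn(), reverse=True)


# Hypothetical ratings, chosen only to illustrate the arithmetic.
modes = [
    FailureMode("hypothetical: setup marker misread", 3, 2, 1),
    FailureMode("patient movement during treatment", 2, 4, 4),
    FailureMode("hypothetical: wrong plan transferred", 1, 5, 2),
]
ranked = rank_by_rpn(modes)
for mode in ranked:
    print(f"{mode.rpn():3d}  {mode.name}")
```

    Because the three ratings are multiplied, a 1–5 scale bounds the RPN at 125, whereas the 1–10 TG-100 scale bounds it at 1000; scores from the two scales are therefore not directly comparable.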

  6. SU-F-T-244: Radiotherapy Risk Estimation Based On Expert Group Survey

    Energy Technology Data Exchange (ETDEWEB)

    Koo, J; Yoon, M [Korea University, Seoul (Korea, Republic of); Chung, W; Chung, M; Kim, D [Kyung Hee University Hospital at Gangdong, Gangdong-gu, Seoul (Korea, Republic of)

    2016-06-15

    Purpose: To evaluate the reliability of the RPN (Risk Priority Number) assigned by an expert group and to provide preliminary data for adapting FMEA in Korea. Methods: 1163 incidents reported in ROSIS over 11 years were used as real data for comparison and were categorized into 146 items. The questionnaire was composed of the 146 items, and respondents had to rate the ‘occurrence (O)’, ‘severity (S)’, and ‘detectability (D)’ of each item on a scale from 1 to 10 according to the proposed AAPM TG-100 rating scales. 19 medical physicists from 19 different organizations in Korea participated in the survey. Because the ROSIS items were not spread evenly enough to be classified into 10 grades, a 1–5 scale was chosen instead of 1–10, and the survey results were also fit to 5 grades for comparison. Results: The average O, S, and D were 1.77, 3.50, and 2.13, respectively, and the item with the highest RPN (32) in the survey was ‘patient movement during treatment’. When comparing items ranked in the top 10 of the survey (O) and the ROSIS database, two items were duplicated, and ‘Simulation’ and ‘Treatment’ were the most frequently ranked RT processes in the top 10 of the survey and ROSIS, respectively. The Cronbach α of each RT process ranged from 0.74 to 0.99, and the p-value was <0.001. When comparing O*D, the average difference was 1.4. Conclusion: This work indicates a deviation between actual risk and expectation. Considering that the respondents were Korean, that ROSIS is mainly composed of incidents that happened in European countries, and that some of the top 10 ROSIS items cannot be applied to radiotherapy procedures in Korea, the deviation could have come from procedural differences. Moreover, if the expert group had consisted of experts from various fields, the expectation might have been more accurate. Therefore, further research on radiotherapy risk estimation is needed.

  7. Combining item and bulk material loss-detection uncertainties

    International Nuclear Information System (INIS)

    Eggers, R.F.

    1982-01-01

    Loss detection requirements, such as five formula kilograms with 99% probability of detection, which apply to the sum of losses from material in both item and bulk form, constitute a special problem for the nuclear material statistician. Requirements of this type are included in the Material Control and Accounting Reform Amendments described in the Advance Notice of Proposed Rule Making (Federal Register, 46(175):45144-46151). Attribute test sampling of items is the method used to detect gross defects in the inventory of items in a given control unit. Attribute sampling plans are designed to detect a loss of a specified goal quantity of material with a given probability. In contrast to the methods and statistical models used for item loss detection, bulk material loss detection requires all the material entering and leaving a control unit to be measured and the calculation of a loss estimator that will be tested against an appropriate alarm threshold. The alarm threshold is determined from an estimate of the error inherent in the components of the loss estimator. In this paper a simple graphical method of evaluating the combined capabilities of bulk material loss detection methods and item attribute testing procedures will be described. Quantitative results will be given for several cases, indicating how a decrease in the precision of the item loss detection method tends to force an increase in the precision of the bulk loss detection procedure in order to meet the overall detection requirement. 4 figures
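
    As an illustration of how an attribute sampling plan relates sample size to detection probability, the sketch below uses the standard hypergeometric argument: if a control unit holds N items of which D are gross defects, a random sample of n items drawn without replacement misses all defects with probability C(N−D, n)/C(N, n). The inventory size, defect count, and function names are assumptions made for this example; the paper's own graphical method and its treatment of bulk material are not reproduced here.

```python
from math import comb


def detection_probability(population: int, defects: int, sample: int) -> float:
    """Probability that a random sample (without replacement) contains
    at least one defective item."""
    if not 0 <= defects <= population:
        raise ValueError("defects must be between 0 and population")
    if not 0 <= sample <= population:
        raise ValueError("sample must be between 0 and population")
    # comb() returns 0 when the sample is larger than the number of good items,
    # so the probability correctly becomes 1 in that case.
    miss_all = comb(population - defects, sample) / comb(population, sample)
    return 1.0 - miss_all


def min_sample_size(population: int, defects: int, target: float) -> int:
    """Smallest sample size whose detection probability reaches the target."""
    for sample in range(population + 1):
        if detection_probability(population, defects, sample) >= target:
            return sample
    raise ValueError("target cannot be reached (no defects to detect?)")


# Hypothetical control unit: 100 items, 10 of which must be removed to reach
# the goal quantity, with a 99% detection requirement.
n_required = min_sample_size(population=100, defects=10, target=0.99)
print(n_required, detection_probability(100, 10, n_required))
```

    The fewer items a diverter must tamper with to reach the goal quantity, the larger the sample needed for the same detection probability, which is one reason item and bulk detection capabilities trade off against each other.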

  8. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    Science.gov (United States)

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good, as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
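
    For readers unfamiliar with the graded response model mentioned above, the following sketch shows how category probabilities arise in its standard (Samejima) form: the probability of responding in category k or higher is a logistic function of a(θ − b_k), and each category's probability is the difference between adjacent cumulative probabilities. The discrimination and threshold values are hypothetical and are not taken from the study's calibration.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Category probabilities for one item under the graded response model.

    theta          -- latent trait level of the respondent
    discrimination -- item slope parameter (a)
    thresholds     -- ordered threshold parameters (b_1 < b_2 < ...)
    Returns a list with len(thresholds) + 1 probabilities.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # Cumulative probabilities P(X >= k); P(X >= 0) = 1 and P(X >= K) = 0.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-category item evaluated at an average trait level.
probabilities = grm_category_probabilities(
    theta=0.0, discrimination=2.0, thresholds=[-2.0, -1.0, 0.5, 1.5]
)
print([round(p, 3) for p in probabilities])
```

    Thresholds spread over a wide range, as reported for this item bank, mean that some category boundary is informative at almost any trait level.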

  9. Effects of Reducing the Cognitive Load of Mathematics Test Items on Student Performance

    Directory of Open Access Journals (Sweden)

    Susan C. Gillmor

    2015-01-01

    This study explores a new item-writing framework for improving the validity of math assessment items. The authors transfer insights from Cognitive Load Theory (CLT), traditionally used in instructional design, to educational measurement. Fifteen multiple-choice math assessment items were modified using research-based strategies for reducing extraneous cognitive load. An experimental design with 222 middle-school students tested the effects of the reduced cognitive load items on student performance and anxiety. Significant findings confirm the main research hypothesis that reducing the cognitive load of math assessment items improves student performance. Three load-reducing item modifications are identified as particularly effective for reducing item difficulty: signalling important information, aesthetic item organization, and removing extraneous content. Load reduction was not shown to impact student anxiety. Implications for classroom assessment and future research are discussed.

  10. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  11. Standardization of depression measurement

    DEFF Research Database (Denmark)

    Wahl, Inka; Löwe, Bernd; Bjørner, Jakob

    2014-01-01

    OBJECTIVES: To provide a standardized metric for the assessment of depression severity to enable comparability among results of established depression measures. STUDY DESIGN AND SETTING: A common metric for 11 depression questionnaires was developed applying item response theory (IRT) methods. Data of 33,844 adults were used for secondary analysis including routine assessments of 23,817 in- and outpatients with mental and/or medical conditions (46% with depressive disorders) and a general population sample of 10,027 randomly selected participants from three representative German household surveys. RESULTS: A standardized metric for depression severity was defined by 143 items, and scores were normed to a general population mean of 50 (standard deviation = 10) for easy interpretability. It covers the entire range of depression severity assessed by established instruments. The metric allows...

  12. Engaging Community Leaders in the Development of a Cardiovascular Health Behavior Survey Using Focus Group–Based Cognitive Interviewing

    Directory of Open Access Journals (Sweden)

    Gwenyth R Wallen

    2017-04-01

    Establishing the validity of health behavior surveys used in community-based participatory research (CBPR) in diverse populations is often overlooked. A novel, group-based cognitive interviewing method was used to obtain qualitative data for tailoring a survey instrument designed to identify barriers to improved cardiovascular health in at-risk populations in Washington, DC. A focus group–based cognitive interview was conducted to assess item comprehension, recall, and interpretation and to establish the initial content validity of the survey. Thematic analysis of verbatim transcripts yielded 5 main themes for which participants (n = 8) suggested survey modifications, including survey item improvements, suggestions for additional items, community-specific issues, changes in the skip logic of the survey items, and the identification of typographical errors. Population-specific modifications were made, including the development of more culturally appropriate questions relevant to the community. Group-based cognitive interviewing provided an efficient and effective method for piloting a cardiovascular health survey instrument using CBPR.

  13. Measuring single constructs by single items: Constructing an even shorter version of the “Short Five” personality inventory

    Science.gov (United States)

    Konstabel, Kenn; Lönnqvist, Jan-Erik; Leikas, Sointu; García Velázquez, Regina; Qin, Hiaying; Verkasalo, Markku; Walkowitz, Gari

    2017-01-01

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item “Short Five” (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the full version

  14. Quality of life and discriminating power of two questionnaires in fibromyalgia patients: Fibromyalgia Impact Questionnaire and Medical Outcomes Study 36-Item Short-Form Health Survey.

    Science.gov (United States)

    Assumpção, Ana; Pagano, Tatiana; Matsutani, Luciana A; Ferreira, Elizabeth A G; Pereira, Carlos A B; Marques, Amélia P

    2010-01-01

    Fibromyalgia is a painful syndrome characterized by widespread chronic pain and associated symptoms with a negative impact on quality of life. Considering the subjectivity of quality of life measurements, the aim of this study was to verify the discriminating power of two quality of life questionnaires in patients with fibromyalgia: the generic Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) and the specific Fibromyalgia Impact Questionnaire (FIQ). A cross-sectional study was conducted on 150 participants divided into Fibromyalgia Group (FG) and Control Group (CG) (n=75 in each group). The participants were evaluated using the SF-36 and the FIQ. The data were analyzed by the Student t-test (α=0.05) and inferential analysis using the Receiver Operating Characteristics (ROC) Curve--sensitivity, specificity and area under the curve (AUC). The significance level was 0.05. The sample was similar for age (CG: 47.8 ± 8.1; FG: 47.0 ± 7.7 years). A significant difference was observed in quality of life assessment in all aspects of both questionnaires (pquality of life in fibromyalgia patients, and we suggest that both should be used in parallel because they evaluate relevant and complementary aspects of quality of life.

  15. TT detector description and implementation of the survey measurements

    CERN Document Server

    Salzmann, C

    2008-01-01

    The TT geometry in the software has been updated to comply with the latest technical drawings. The main difference is in the description of the beam pipe insulation, where the amount of material has increased from $7.5\%$ to $15.4\%$ of $X_0$. Mother volumes are added to decrease the CPU consumption and finally several scans are made to compare the material budget between the DC06 geometry and the new 2008 geometry. In addition, the survey measurements of the TT detector have been analysed. These measurements can be subdivided into surveys of the detector box, photogrammetry of the balconies and metrology of the half-modules. The offsets with the nominal geometry are implemented in the alignment condition database.

  16. Validation of the Instructional Materials Motivation Survey (IMMS) in a self-directed instructional setting aimed at working with technology

    NARCIS (Netherlands)

    Loorbach, N.R.; Peters, O.; Karreman, Joyce; Steehouder, M.F.

    2015-01-01

    The ARCS Model of Motivational Design has been used myriad times to design motivational instructions that focus on attention, relevance, confidence and satisfaction in order to motivate students. The Instructional Materials Motivation Survey (IMMS) is a 36-item situational measure of people's

  17. Development of abbreviated eight-item form of the Penn Verbal Reasoning Test.

    Science.gov (United States)

    Bilker, Warren B; Wierzbicki, Michael R; Brensinger, Colleen M; Gur, Raquel E; Gur, Ruben C

    2014-12-01

    The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance. © The Author(s) 2014.
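
    The split-sample idea described above can be sketched as follows. This is not the authors' procedure, which the abstract does not detail: as a stand-in, items are ranked by item-total correlation in the fitting half, the top eight are kept, and the unit-weighted short-form score is correlated with the full score in both halves. The response data are simulated, and every function name and parameter is an assumption made for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_responses(n_people, n_items, rng):
    """Simulated right/wrong answers from a simple logistic ability model."""
    rows = []
    for _ in range(n_people):
        ability = rng.gauss(0.0, 1.0)
        row = []
        for j in range(n_items):
            difficulty = -1.5 + 3.0 * j / (n_items - 1)
            p_correct = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
            row.append(1 if rng.random() < p_correct else 0)
        rows.append(row)
    return rows


def select_items(rows, n_keep):
    """Keep the items with the highest item-total correlation."""
    totals = [sum(row) for row in rows]
    scored = [(pearson([row[j] for row in rows], totals), j)
              for j in range(len(rows[0]))]
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:n_keep])


def short_form_correlation(rows, items):
    """Correlation between the short-form sum and the full-test sum."""
    short = [sum(row[j] for j in items) for row in rows]
    full = [sum(row) for row in rows]
    return pearson(short, full)


rng = random.Random(0)
data = simulate_responses(n_people=400, n_items=29, rng=rng)
fit_half, validation_half = data[:200], data[200:]

chosen = select_items(fit_half, n_keep=8)          # chosen on the fitting half only
r_fit = short_form_correlation(fit_half, chosen)
r_validation = short_form_correlation(validation_half, chosen)
print(chosen, round(r_fit, 3), round(r_validation, 3))
```

    Evaluating the chosen items on a half that played no part in selecting them guards against an overly optimistic correlation, which is the purpose of the validation sample in the abstract.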

  18. Development of Abbreviated Eight-Item Form of the Penn Verbal Reasoning Test

    Science.gov (United States)

    Bilker, Warren B.; Wierzbicki, Michael R.; Brensinger, Colleen M.; Gur, Raquel E.; Gur, Ruben C.

    2014-01-01

    The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance. PMID:24577310

  19. Surveying Turkish high school and university students’ attitudes and approaches to physics problem solving

    Directory of Open Access Journals (Sweden)

    Nuri Balta

    2016-04-01

    Students’ attitudes and approaches to physics problem solving can impact how well they learn physics and how successful they are in solving physics problems. Prior research in the U.S. using a validated Attitude and Approaches to Problem Solving (AAPS) survey suggests that there are major differences between students in introductory physics and astronomy courses and physics experts in terms of their attitudes and approaches to physics problem solving. Here we discuss the validation, administration, and analysis of data for the Turkish version of the AAPS survey for high school and university students in Turkey. After the validation and administration of the Turkish version of the survey, the analysis of the data was conducted by grouping the data by grade level, school type, and gender. While there are no statistically significant differences between the averages of various groups on the survey, overall, the university students in Turkey were more expertlike than vocational high school students. On an item-by-item basis, there are statistically significant differences between the averages of the groups on many items. For example, on average, the university students demonstrated less expertlike attitudes about the role of equations and formulas in problem solving, in solving difficult problems, and in knowing when the solution is not correct, whereas they displayed more expertlike attitudes and approaches on items related to metacognition in physics problem solving. A principal component analysis on the data yields item clusters into which the student responses on various survey items can be grouped. A comparison of the responses of the Turkish and American university students enrolled in algebra-based introductory physics courses shows that on more than half of the items, the responses of these two groups were statistically significantly different, with the U.S. students on average responding to the items in a more expertlike manner.

  20. Establishing key components of yoga interventions for musculoskeletal conditions: a Delphi survey

    Science.gov (United States)

    2014-01-01

    Background: Evidence suggests yoga is a safe and effective intervention for the management of physical and psychosocial symptoms associated with musculoskeletal conditions. However, heterogeneity in the components and reporting of clinical yoga trials impedes both the generalization of study results and the replication of study protocols. The aim of this Delphi survey was to address these issues of heterogeneity, by developing a list of recommendations of key components for the design and reporting of yoga interventions for musculoskeletal conditions. Methods: Recognised experts involved in the design, conduct, and teaching of yoga for musculoskeletal conditions were identified from a systematic review, and invited to contribute to the Delphi survey. Forty-one of the 58 experts contacted, representing six countries, agreed to participate. A three-round Delphi was conducted via electronic surveys. Round 1 presented an open-ended question, allowing panellists to individually identify components they considered key to the design and reporting of yoga interventions for musculoskeletal conditions. Thematic analysis of Round 1 identified items for quantitative rating in Round 2; items not reaching consensus were forwarded to Round 3 for re-rating. Results: Thirty-six panellists (36/41; 88%) completed the three rounds of the Delphi survey. Panellists provided 348 comments to the Round 1 question. These comments were reduced to 49 items, grouped under five themes, for rating in subsequent rounds. A priori group consensus of ≥80% was reached on 28 items related to five themes concerning defining the yoga intervention, types of yoga practices to include in an intervention, delivery of the yoga protocol, domains of outcome measures, and reporting of yoga interventions for musculoskeletal conditions. Additionally, a priori consensus of ≥50% was reached on five items relating to minimum values for intervention parameters. Conclusions: Expert consensus has provided a non

  1. Assessing cross-cultural item bias in questionnaires : Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

    NARCIS (Netherlands)

    Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

    2001-01-01

    A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one,

  2. Quality Measurements in Radiology: A Systematic Review of the Literature and Survey of Radiology Benefit Management Groups.

    Science.gov (United States)

    Narayan, Anand; Cinelli, Christina; Carrino, John A; Nagy, Paul; Coresh, Josef; Riese, Victoria G; Durand, Daniel J

    2015-11-01

    As the US health care system transitions toward value-based reimbursement, there is an increasing need for metrics to quantify health care quality. Within radiology, many quality metrics are in use, and still more have been proposed, but there have been limited attempts to systematically inventory these measures and classify them using a standard framework. The purpose of this study was to develop an exhaustive inventory of public and private sector imaging quality metrics classified according to the classic Donabedian framework (structure, process, and outcome). A systematic review was performed in which eligibility criteria included published articles (from 2000 onward) from multiple databases. Studies were double-read, with discrepancies resolved by consensus. For the radiology benefit management group (RBM) survey, the six known companies nationally were surveyed. Outcome measures were organized on the basis of standard categories (structure, process, and outcome) and reported using Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search strategy yielded 1,816 citations; review yielded 110 reports (29 included for final analysis). Three of six RBMs (50%) responded to the survey; the websites of the other RBMs were searched for additional metrics. Seventy-five unique metrics were reported: 35 structure (46%), 20 outcome (27%), and 20 process (27%) metrics. For RBMs, 35 metrics were reported: 27 structure (77%), 4 process (11%), and 4 outcome (11%) metrics. The most commonly cited structure, process, and outcome metrics included ACR accreditation (37%), ACR Appropriateness Criteria (85%), and peer review (95%), respectively. Imaging quality metrics are more likely to be structural (46%) than process (27%) or outcome (27%) based (P < .05). As national value-based reimbursement programs increasingly emphasize outcome-based metrics, radiologists must keep pace by developing the data infrastructure required to collect outcome

  3. Patient Safety Culture Survey in Pediatric Complex Care Settings: A Factor Analysis.

    Science.gov (United States)

    Hessels, Amanda J; Murray, Meghan; Cohen, Bevin; Larson, Elaine L

    2017-04-19

    Children with complex medical needs are increasing in number and demanding the services of pediatric long-term care facilities (pLTC), which require a focus on patient safety culture (PSC). However, no tool to measure PSC has been tested in this unique hybrid acute care-residential setting. The objective of this study was to evaluate the psychometric properties of the Nursing Home Survey on Patient Safety Culture tool slightly modified for use in the pLTC setting. Factor analyses were performed on data collected from 239 staff at 3 pLTC in 2012. Items were screened by principal axis factoring, and the original structure was tested using confirmatory factor analysis. Exploratory factor analysis was conducted to identify the best model fit for the pLTC data, and factor reliability was assessed by Cronbach alpha. The extracted, rotated factor solution suggested items in 4 (staffing, nonpunitive response to mistakes, communication openness, and organizational learning) of the original 12 dimensions may not be a good fit for this population. Nevertheless, in the pLTC setting, both the original and the modified factor solutions demonstrated similar reliabilities to the published consistencies of the survey when tested in adult nursing homes and the items factored nearly identically as theorized. This study demonstrates that the Nursing Home Survey on Patient Safety Culture with minimal modification may be an appropriate instrument to measure PSC in pLTC settings. Additional psychometric testing is recommended to further validate the use of this instrument in this setting, including examining the relationship to safety outcomes. Increased use will yield data for benchmarking purposes across these specialized settings to inform frontline workers and organizational leaders of areas of strength and opportunity for improvement.
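
    Factor reliability in this record is summarized with Cronbach's alpha. As a minimal sketch of the usual formula, α = k/(k−1) · (1 − Σ item variances / variance of the total score), the code below computes alpha for a small respondent-by-item matrix. The response data are invented for illustration and have no connection to the pLTC survey.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (columns) and of the summed score (rows).
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five staff members to three items of one dimension.
answers = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
print(round(cronbach_alpha(answers), 3))
```

    Alpha rises when items move together across respondents, so a dimension whose items do not fit the population well will tend to show a lower value.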

  4. Development and validation of the Bullying and Cyberbullying Scale for Adolescents: A multi-dimensional measurement model.

    Science.gov (United States)

    Thomas, Hannah J; Scott, James G; Coates, Jason M; Connor, Jason P

    2018-05-03

    Intervention on adolescent bullying is reliant on valid and reliable measurement of victimization and perpetration experiences across different behavioural expressions. This study developed and validated a survey tool that integrates measurement of both traditional and cyber bullying to test a theoretically driven multi-dimensional model. Adolescents from 10 mainstream secondary schools completed a baseline and follow-up survey (N = 1,217; M age = 14 years; 66.2% male). The Bullying and Cyberbullying Scale for Adolescents (BCS-A) developed for this study comprised parallel victimization and perpetration subscales, each with 20 items. Additional measures of bullying (Olweus Global Bullying and the Forms of Bullying Scale [FBS]), as well as measures of internalizing and externalizing problems, school connectedness, social support, and personality, were used to further assess validity. Factor structure was determined, and then the suitability of items was assessed according to the following criteria: (1) factor interpretability, (2) item correlations, (3) model parsimony, and (4) measurement equivalence across victimization and perpetration experiences. The final models comprised four factors: physical, verbal, relational, and cyber. The final scale was revised to two 13-item subscales. The BCS-A demonstrated acceptable concurrent and convergent validity (internalizing and externalizing problems, school connectedness, social support, and personality), as well as predictive validity over 6 months. The BCS-A has sound psychometric properties. This tool establishes measurement equivalence across types of involvement and behavioural forms common among adolescents. An improved measurement method could add greater rigour to the evaluation of intervention programmes and also enable interventions to be tailored to subscale profiles. © 2018 The British Psychological Society.

  5. Designing an Instrument to Measure the QoS of a Spanish Virtual Store

    Science.gov (United States)

    de Abajo, Beatriz Sainz; de La Torre Díez, Isabel; Salcines, Enrique García; Fernández, Javier Burón; Pernas, Francisco Díaz; Coronado, Miguel López; de Castro Lozano, Carlos

    This article describes the development of a survey instrument distributed to users of a B2C website selling electronic books in order to ascertain their satisfaction. The opinions compiled from a pilot sample, together with an exploratory factor analysis, point to the factors that best summarise the quality of the application analysed here. Analysis of the initial survey, with a total of 40 items, shaped the final instrument, which comprises 18 items divided into 6 dimensions. These measure the perceptions of users of the application in order to improve the contents of the website. A confirmatory factor analysis is then performed; it supports the reliability of the study and confirms that the structure of the instrument measures service quality in accordance with the requirements of the website, namely offering a space that fulfils consumer expectations in the Information Society.

  6. Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

    Science.gov (United States)

    Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

    2018-04-01

    The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory-factor analysis was completed to examine dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items on the ESS and provide differential item functioning (DIF) analyses to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and to a lesser extent Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses between German and English were very similar indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.

  7. Initial validation of the Nine Item Avoidant/Restrictive Food Intake disorder screen (NIAS): A measure of three restrictive eating patterns.

    Science.gov (United States)

    Zickgraf, Hana F; Ellis, Jordan M

    2018-04-01

    Avoidant/Restrictive Food Intake Disorder (ARFID) is an eating or feeding disorder characterized by inadequate nutritional or caloric intake leading to weight loss, nutritional deficiency, supplement dependence, and/or significant psychosocial impairment. DSM-5 lists three different eating patterns that can lead to symptoms of ARFID: avoidance of foods due to their sensory properties (e.g., picky eating), poor appetite or limited interest in eating, or fear of negative consequences from eating. Research on the prevalence and psychopathology of ARFID is limited by the lack of validated instruments to measure these eating behaviors. The present study describes the development and validation of the nine-item ARFID screen (NIAS), a brief multidimensional instrument to measure ARFID-associated eating behaviors. Participants were 455 adults recruited on Amazon's Mechanical Turk, 505 adults recruited from a nationally-representative subject pool, and 311 undergraduates participating in research for course credit. Exploratory and confirmatory factor analyses provided evidence for three factors. The NIAS subscales demonstrated high internal consistency, test-retest reliability, invariant item loadings between two samples, and convergent/discriminant validity with other measures of picky eating, appetite, fear of negative consequences, and psychopathology. The scales were also correlated with measures of ARFID-like symptoms (e.g., low BMI, low fruit/vegetable variety and intake, and eating-related psychosocial interference/distress), although the picky eating, appetite, and fear scales had distinct independent relationships with these constructs. The NIAS is a brief, reliable instrument that may be used to further investigate ARFID-related eating behaviors. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Report on achievements in fiscal 1998. Surveys on development of an at-home welfare device system to rationalize energy use. (Ube City); 1998 nendo energy shiyo gorika zaitaku fukushi kiki system kaihatsu chosa (Ube) saiitaku kenkyu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-01

    The present study uses a Welfare Techno-House to analyze the structural characteristics of residential houses designed with consideration for elderly people and the operating characteristics of at-home welfare devices. It also aims to identify the status of energy consumption and to research and develop energy-saving devices. The research and development items for the current fiscal year are as follows: (1) a survey of power consumption in at-home welfare devices, and (2) development of at-home welfare device systems that use energy more effectively, with sub-items a.: studies on leveling of energy use, b.: studies on identifying the load applied to riders of power-driven wheelchairs during operation, c.: studies on next-generation housing for elderly and physically handicapped people, and d.: surveys of discharged VOC concentration in houses built in warm districts. In item (1), the power consumption of air conditioners used for room heating in winter was measured to derive time-series data on daily changes in energy consumption. In sub-item a, the system efficiency of ice thermal storage devices and floor cooling devices and the characteristics of the indoor thermal environment were evaluated and discussed. In sub-item b, the load applied to riders of power-driven wheelchairs during operation was verified experimentally. In sub-item c, the hot thermal environment in a greenhouse attached to a residential house designed with consideration for elderly people was surveyed. In sub-item d, formaldehyde and VOC concentrations were measured in houses newly built in warm and cold districts in order to discuss preventive measures against indoor air pollution. (NEDO)

  9. Measuring fertility through mobile‒phone based household surveys: Methods, data quality, and lessons learned from PMA2020 surveys

    OpenAIRE

    Yoonjoung Choi; Qingfeng Li; Blake Zachary

    2018-01-01

    Background: PMA2020 is a survey platform with resident enumerators using mobile phones. Instead of collecting full birth history, total fertility rates (TFR) have been measured with a limited number of questions on recent births. Employing new approaches provides opportunities to test and advance survey methods. Objective: This study aims to assess the quality of fertility data in PMA2020 surveys, focusing on bias introduced from the questionnaire and completeness and distribution of birth...

  10. Analyzing Repeated Measures Marginal Models on Sample Surveys with Resampling Methods

    Directory of Open Access Journals (Sweden)

    James D. Knoke

    2005-12-01

    Packaged statistical software for analyzing categorical, repeated measures marginal models on sample survey data with binary covariates does not appear to be available. Consequently, this report describes a customized SAS program which accomplishes such an analysis on survey data with jackknifed replicate weights for which the primary sampling unit information has been suppressed for respondent confidentiality. First, the program employs the Macro Language and the Output Delivery System (ODS) to estimate the means and covariances of indicator variables for the response variables, taking the design into account. Then, it uses PROC CATMOD and ODS, ignoring the survey design, to obtain the design matrix and hypothesis test specifications. Finally, it enters these results into another run of CATMOD, which performs automated direct input of the survey design specifications and accomplishes the appropriate analysis. This customized SAS program can be employed, with minor editing, to analyze general categorical, repeated measures marginal models on sample surveys with replicate weights. Finally, the results of our analysis accounting for the survey design are compared to the results of two alternate analyses of the same data. This comparison confirms that such alternate analyses, which do not properly account for the design, do not produce useful results.
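
    The first step described above, estimating a statistic from jackknifed replicate weights, can be sketched in a few lines. This is not the SAS program from the report. It is a minimal Python illustration, and the function names, the toy data, and the default multiplier (R - 1)/R (the delete-one "JK1" convention) are assumptions; real survey files document their own multiplier.

```python
import numpy as np

def weighted_mean(y, w):
    """Weighted mean of y under weights w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y) / np.sum(w))

def jackknife_variance(y, full_w, rep_w, multiplier=None):
    """Point estimate and replicate-weight variance of a weighted mean.

    rep_w has shape (n_replicates, n_obs). The multiplier defaults to
    (R - 1) / R, an assumption that matches the delete-one jackknife.
    """
    rep_w = np.asarray(rep_w, dtype=float)
    n_rep = rep_w.shape[0]
    if multiplier is None:
        multiplier = (n_rep - 1) / n_rep
    theta = weighted_mean(y, full_w)
    # Re-estimate the statistic once per set of replicate weights.
    reps = np.array([weighted_mean(y, w) for w in rep_w])
    variance = float(multiplier * np.sum((reps - theta) ** 2))
    return theta, variance

# Toy example: 4 units, each replicate drops one unit and reweights the rest.
y = [1.0, 2.0, 3.0, 4.0]
full_w = np.ones(4)
rep_w = (np.ones((4, 4)) - np.eye(4)) * 4.0 / 3.0
theta, var = jackknife_variance(y, full_w, rep_w)
```

    With equal weights and one unit dropped per replicate, the result coincides with the textbook variance of a sample mean, which gives a quick sanity check on the multiplier.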

  11. Measuring Nonresponse Bias in a Cross-Country Enterprise Survey

    Directory of Open Access Journals (Sweden)

    Katarzyna Bańkowska

    2015-04-01

    Nonresponse is a common issue affecting the vast majority of surveys. Efforts to convince those unwilling to participate in a survey might not necessarily result in a better picture of the target population and can lead to higher, not lower, nonresponse bias. We investigate the impact of nonresponse in the European Commission & European Central Bank Survey on the Access to Finance of Enterprises (SAFE), which collects evidence on the financing conditions faced by European SMEs compared with those of large firms. This survey, conducted by telephone bi-annually since 2009 by the ECB and the European Commission, provides a valuable means to search for this kind of bias, given the high heterogeneity of response propensities across countries. The study relies on so-called “Representativity Indicators” developed within the Representativity Indicators of Survey Quality (RISQ) project, which measure the distance to a fully representative response. On this basis, we examine the quality of the SAFE Survey at different stages of the fieldwork as well as across different survey waves and countries. The RISQ methodology relies on rich sampling frame information, which is, however, partly limited in the case of the SAFE. We also assess the representativeness of the particular SAFE subsample created by linking the survey responses with the companies’ financial information from a business register; this sub-sampling is another potential source of bias which we also attempt to quantify. Finally, we suggest possible ways to improve monitoring of nonresponse bias in future rounds of the survey.
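
    As a rough illustration of what a representativity indicator measures, the commonly cited R-indicator is one minus twice the standard deviation of the response propensities, so that equal propensities give 1 and maximal variation gives 0. The sketch below is an assumption-laden simplification and not the SAFE analysis: the function name is hypothetical, propensities are taken as given (in practice they are estimated, for example by logistic regression on frame variables), and a population-form standard deviation is used where published definitions may divide by N - 1.

```python
import numpy as np

def r_indicator(propensities, weights=None):
    """R = 1 - 2 * SD(propensities), optionally design-weighted.

    Uses the population-form (divide by total weight) standard deviation,
    which is an illustrative simplification.
    """
    rho = np.asarray(propensities, dtype=float)
    w = np.ones_like(rho) if weights is None else np.asarray(weights, dtype=float)
    mean = np.sum(w * rho) / np.sum(w)
    sd = np.sqrt(np.sum(w * (rho - mean) ** 2) / np.sum(w))
    return float(1.0 - 2.0 * sd)

# Identical propensities: fully representative response.
uniform = r_indicator([0.4, 0.4, 0.4, 0.4, 0.4])
# Propensities split between 0 and 1: the least representative case.
extreme = r_indicator([0.0, 1.0, 0.0, 1.0])
```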

  12. Measurements with the new PHE neutron survey instrument

    International Nuclear Information System (INIS)

    Eakins, J.S.; Tanner, R.J.; Hager, L.G.

    2014-01-01

    A novel design of survey instrument has been developed to accurately estimate ambient dose equivalent from neutrons with energies in the range from thermal to 20 MeV. The device features moderating and attenuating layers to ease measurement of fast and intermediate energy neutrons, combined with guides that channel low-energy neutrons to the single, central detector. A prototype of this device has been constructed and exposed to a set of calibration fields: the resulting measured responses are presented and discussed here, and compared against Monte Carlo data. A simple simulated workplace neutron field has also been developed to test the device. Measured response data have been determined for a prototype design of neutron survey instrument, using facilities at PHE and NPL. In general, the results demonstrated good directional invariance and agreed well with data obtained by Monte Carlo modelling, raising confidence in the accuracy of the response characteristics expected for the device. A simple simulated workplace field has also been developed and characterised, and the performance of the device assessed in it: agreement between measured and modelled results suggests that the device would behave as anticipated in real workplace fields. These performances will be investigated further in the future, as the design makes the transition from a research prototype to a commercially available instrument. (authors)

  13. Interpreting Mini-Mental State Examination Performance in Highly Proficient Bilingual Spanish-English and Asian Indian-English Speakers: Demographic Adjustments, Item Analyses, and Supplemental Measures.

    Science.gov (United States)

    Milman, Lisa H; Faroqi-Shah, Yasmeen; Corcoran, Chris D; Damele, Deanna M

    2018-04-17

    Performance on the Mini-Mental State Examination (MMSE), among the most widely used global screens of adult cognitive status, is affected by demographic variables including age, education, and ethnicity. This study extends prior research by examining the specific effects of bilingualism on MMSE performance. Sixty independent community-dwelling monolingual and bilingual adults were recruited from eastern and western regions of the United States in this cross-sectional group study. Independent sample t tests were used to compare 2 bilingual groups (Spanish-English and Asian Indian-English) with matched monolingual speakers on the MMSE, demographically adjusted MMSE scores, MMSE item scores, and a nonverbal cognitive measure. Regression analyses were also performed to determine whether language proficiency predicted MMSE performance in both groups of bilingual speakers. Group differences were evident on the MMSE, on demographically adjusted MMSE scores, and on a small subset of individual MMSE items. Scores on a standardized screen of language proficiency predicted a significant proportion of the variance in the MMSE scores of both bilingual groups. Bilingual speakers demonstrated distinct performance profiles on the MMSE. Results suggest that supplementing the MMSE with a language screen, administering a nonverbal measure, and/or evaluating item-based patterns of performance may assist with test interpretation for this population.

  14. Bias in patient satisfaction surveys: a threat to measuring healthcare quality.

    Science.gov (United States)

    Dunsch, Felipe; Evans, David K; Macis, Mario; Wang, Qiao

    2018-01-01

    Patient satisfaction surveys are an increasingly common element of efforts to evaluate the quality of healthcare. Many patient satisfaction surveys in low/middle-income countries frame statements positively and invite patients to agree or disagree, so that positive responses may reflect either true satisfaction or bias induced by the positive framing. In an experiment with more than 2200 patients in Nigeria, we distinguish between actual satisfaction and survey biases. Patients randomly assigned to receive negatively framed statements expressed significantly lower levels of satisfaction (87%) than patients receiving the standard positively framed statements (95%-pquality of health services. Providers and policymakers wishing to gauge the quality of care will need to avoid framing that induces bias and to complement patient satisfaction measures with more objective measures of quality.

  15. Psychometric Evaluation of Chinese-Language 44-Item and 10-Item Big Five Personality Inventories, Including Correlations with Chronotype, Mindfulness and Mind Wandering.

    Science.gov (United States)

    Carciofo, Richard; Yang, Jiaoyan; Song, Nan; Du, Feng; Zhang, Kan

    2016-01-01

    The 44-item and 10-item Big Five Inventory (BFI) personality scales are widely used, but there is a lack of psychometric data for Chinese versions. Eight surveys (total N = 2,496, aged 18-82), assessed a Chinese-language BFI-44 and/or an independently translated Chinese-language BFI-10. Most BFI-44 items loaded strongly or predominantly on the expected dimension, and values of Cronbach's alpha ranged .698-.807. Test-retest coefficients ranged .694-.770 (BFI-44), and .515-.873 (BFI-10). The BFI-44 and BFI-10 showed good convergent and discriminant correlations, and expected associations with gender (females higher for agreeableness and neuroticism), and age (older age associated with more conscientiousness and agreeableness, and also less neuroticism and openness). Additionally, predicted correlations were found with chronotype (morningness positive with conscientiousness), mindfulness (negative with neuroticism, positive with conscientiousness), and mind wandering/daydreaming frequency (negative with conscientiousness, positive with neuroticism). Exploratory analysis found that the Self-discipline facet of conscientiousness positively correlated with morningness and mindfulness, and negatively correlated with mind wandering/daydreaming frequency. Furthermore, Self-discipline was found to be a mediator in the relationships between chronotype and mindfulness, and chronotype and mind wandering/daydreaming frequency. Overall, the results support the utility of the BFI-44 and BFI-10 for Chinese-language big five personality research.
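
    For readers unfamiliar with the Cronbach's alpha values quoted above, the coefficient can be computed from a respondents-by-items score matrix as k/(k - 1) times one minus the ratio of summed item variances to the variance of the total score. The following is a minimal sketch with made-up toy data, not the BFI data or the authors' code; the function name is an assumption.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]                          # number of items
    item_var = x.var(axis=0, ddof=1)        # sample variance of each item
    total_var = x.sum(axis=1).var(ddof=1)   # sample variance of the sum score
    return float(k / (k - 1) * (1.0 - item_var.sum() / total_var))

# Two identical items are perfectly consistent with each other.
parallel = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
# Two uncorrelated items carry no shared variance.
unrelated = cronbach_alpha([[1, 1], [1, 2], [2, 1], [2, 2]])
```

    Negatively keyed items would need to be reverse-scored before this calculation, a step the sketch leaves out.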

  16. Racial differences in hypertension knowledge: effects of differential item functioning.

    Science.gov (United States)

    Ayotte, Brian J; Trivedi, Ranak; Bosworth, Hayden B

    2009-01-01

    Health-related knowledge is an important component in the self-management of chronic illnesses. The objective of this study was to more accurately assess racial differences in hypertension knowledge by using a latent variable modeling approach that controlled for sociodemographic factors and accounted for measurement issues in the assessment of hypertension knowledge. Cross-sectional data from 1,177 participants (45% African American; 35% female) were analyzed using a multiple indicator multiple causes (MIMIC) modeling approach. Available sociodemographic data included race, education, sex, financial status, and age. All participants completed six items on a hypertension knowledge questionnaire. Overall, the final model suggested that females, Whites, and patients with at least a high school diploma had higher latent knowledge scores than males, African Americans, and patients with less than a high school diploma, respectively. The model also detected differential item functioning (DIF) based on race for two of the items. Specifically, the error rate for African Americans was lower than would be expected given the lower level of latent knowledge on the items, on the questions related to: (a) the association between high blood pressure and kidney disease, and (b) the increased risk African Americans have for developing hypertension. Not accounting for DIF resulted in the difference between Whites and African Americans to be underestimated. These results are discussed in the context of the need for careful measurement of health-related constructs, and how measurement-related issues can result in an inaccurate estimation of racial differences in hypertension knowledge.

  17. School nutritional capacity, resources and practices are associated with availability of food/beverage items in schools

    Science.gov (United States)

    2013-01-01

    Background: The school food environment is important to target as less healthful food and beverages are widely available at schools. This study examined whether the availability of specific food/beverage items was associated with a number of school environmental factors. Methods: Principals from elementary (n = 369) and middle/high schools (n = 118) in British Columbia (BC), Canada completed a survey measuring characteristics of the school environment. Our measurement framework integrated constructs from the Theories of Organizational Change and elements from Stillman’s Tobacco Policy Framework adapted for obesity prevention. Our measurement framework included assessment of policy institutionalization of nutritional guidelines at the district and school levels, climate, nutritional capacity and resources (nutritional resources and participation in nutritional programs), nutritional practices, and school community support for enacting stricter nutritional guidelines. We used hierarchical mixed-effects logistic regression analyses to examine associations with the availability of fruit, vegetables, pizza/hamburgers/hot dogs, chocolate candy, sugar-sweetened beverages, and french fried potatoes. Results: In elementary schools, fruit and vegetable availability was more likely among schools that have more nutritional resources (OR = 6.74 and 5.23, respectively). In addition, fruit availability in elementary schools was highest in schools that participated in the BC School Fruit and Vegetable Nutritional Program and the BC Milk program (OR = 4.54 and OR = 3.05, respectively). In middle/high schools, having more nutritional resources was associated with vegetable availability only (OR = 5.78). Finally, middle/high schools that have healthier nutritional practices (i.e., which align with upcoming provincial/state guidelines) were less likely to have the following food/beverage items available at school: chocolate candy (OR = .80) and sugar

  18. A note on monotonicity of item response functions for ordered polytomous item response theory models.

    Science.gov (United States)

    Kang, Hyeon-Ah; Su, Ya-Hui; Chang, Hua-Hua

    2018-03-08

    A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales. © 2018 The British Psychological Society.
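
    The contrast drawn above, between category curves that can rise and then fall and an expected item score that only rises, can be checked numerically for one of the listed models. The sketch below uses the graded response model with a logistic link; the discrimination and threshold values are arbitrary illustrative choices, the function names are hypothetical, and a grid check is of course not a proof.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities under a logistic graded response model.

    theta: latent trait values; a: discrimination (> 0);
    b: ordered thresholds for categories 1..m.
    Returns an array of shape (len(theta), m + 1).
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    # Cumulative curves P(X >= k | theta) for k = 1..m.
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    upper = np.hstack([np.ones((theta.size, 1)), cum])
    lower = np.hstack([cum, np.zeros((theta.size, 1))])
    return upper - lower

def expected_score(theta, a, b):
    """Expected item score E[X | theta], the item response function."""
    probs = grm_category_probs(theta, a, b)
    return probs @ np.arange(probs.shape[1])

grid = np.linspace(-4.0, 4.0, 81)
probs = grm_category_probs(grid, a=1.2, b=[-1.0, 0.0, 1.5])
scores = expected_score(grid, a=1.2, b=[-1.0, 0.0, 1.5])

# The expected score increases at every step of the grid...
strictly_increasing = bool(np.all(np.diff(scores) > 0))
# ...while a middle category curve first rises and then falls.
middle_steps = np.diff(probs[:, 1])
middle_non_monotone = bool(np.any(middle_steps > 0) and np.any(middle_steps < 0))
```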

  19. The impact of item order on ratings of cancer risk perception.

    Science.gov (United States)

    Taylor, Kathryn L; Shelby, Rebecca A; Schwartz, Marc D; Ackerman, Josh; LaSalle, V Holland; Gelmann, Edward P; McGuire, Colleen

    2002-07-01

    Although perceived risk is central to most theories of health behavior, there is little consensus on its measurement with regard to item wording, response set, or the number of items to include. In a methodological assessment of perceived risk, we assessed the impact of changing the order of three commonly used perceived risk items: quantitative personal risk, quantitative population risk, and comparative risk. Participants were 432 men and women enrolled in an ancillary study of the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. Three groups of consecutively enrolled participants responded to the three items in one of three question orders. Results indicated that item order was related to the perceived risk ratings of both ovarian (P Perceptions of risk were significantly lower when the comparative rating was made first. The findings suggest that compelling participants to consider their own risk relative to the risk of others results in lower ratings of perceived risk. Although the use of multiple items may provide more information than when only a single method is used, different conclusions may be reached depending on the context in which an item is assessed.

  20. FIM-Minimum Data Set Motor Item Bank: Short Forms Development and Precision Comparison in Veterans.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Simpson, Annie N; Bonilha, Heather S; Simpson, Kit N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    To improve the practical use of the short forms (SFs) developed from the item bank, we compared the measurement precision of the 4- and 8-item SFs generated from a motor item bank composed of the FIM and the Minimum Data Set (MDS). The FIM-MDS motor item bank allowed scores generated from different instruments to be co-calibrated. The 4- and 8-item SFs were developed based on Rasch analysis procedures. This article compared person strata, ceiling/floor effects, and test SE plots for each administration form and examined 95% confidence interval error bands of anchored person measures with the corresponding SFs. We used 0.3 SE as a criterion to reflect a reliability level of .90. Veterans' inpatient rehabilitation facilities and community living centers. Veterans (N=2500) who had both FIM and the MDS data within 6 days during 2008 through 2010. Not applicable. Four- and 8-item SFs of FIM, MDS, and FIM-MDS motor item bank. Six SFs were generated with 4 and 8 items across a range of difficulty levels from the FIM-MDS motor item bank. The three 8-item SFs all had higher correlations with the item bank (r=.82-.95), higher person strata, and less test error than the corresponding 4-item SFs (r=.80-.90). The three 4-item SFs did not meet the criteria of SE bank composed of existing instruments across the continuum of care in veterans. We also found that the number of items, not test specificity, determines the precision of the instrument. Copyright © 2017 American Congress of Rehabilitation Medicine. All rights reserved.
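
    The link between the 0.3 SE criterion and a reliability of about .90 is not derived in the abstract. One plausible reading, stated here as an assumption, treats reliability as one minus the ratio of error variance to the variance of person measures and takes that variance to be 1:

```latex
\rho \approx 1 - \frac{SE^{2}}{\sigma_{\theta}^{2}},
\qquad
\sigma_{\theta}^{2} = 1 \;\Rightarrow\; \rho \approx 1 - 0.3^{2} = 0.91
```

    Under a different variance of person measures the same SE would correspond to a different reliability, so the .90 figure should be read as approximate.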