WorldWideScience

Sample records for promis global items

  1. Using Linear Equating to Map PROMIS® Global Health Items and the PROMIS-29 V2.0 Profile Measure to the Health Utilities Index Mark 3.

    Science.gov (United States)

    Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David

    2016-10-01

    Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS® global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48% (global health items), 61% (PROMIS-29 V2.0 scales), and 64% (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS® global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS® health measures can be used for economic applications and as a measure of overall HR-QOL in research.
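
    The difference between regression-based prediction and simple linear equating mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names and the seven score pairs are hypothetical, and a single predictor stands in for the multi-item models. Linear equating matches the mean and standard deviation of the target scale, whereas a regression slope is multiplied by the correlation and so pulls predictions toward the mean.

```python
import statistics


def mean_sd(values):
    """Return the mean and sample standard deviation of a list of scores."""
    return statistics.fmean(values), statistics.stdev(values)


def pearson_r(x, y):
    """Sample Pearson correlation between two equally long score lists."""
    mx, sx = mean_sd(x)
    my, sy = mean_sd(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (sx * sy)


def linear_equating(x, y):
    """Map x onto the y scale so that the result has the mean and SD of y."""
    mx, sx = mean_sd(x)
    my, sy = mean_sd(y)
    slope = sy / sx
    return [my + slope * (a - mx) for a in x]


def regression_prediction(x, y):
    """Ordinary least-squares prediction of y from x (slope shrunk by r)."""
    mx, sx = mean_sd(x)
    my, sy = mean_sd(y)
    slope = pearson_r(x, y) * sy / sx
    return [my + slope * (a - mx) for a in x]


# Hypothetical data: a PROMIS-style T-score and an HUI-3-style utility score.
promis_t = [32.0, 41.5, 45.0, 50.0, 53.5, 58.0, 63.0]
hui3 = [0.35, 0.40, 0.72, 0.66, 0.90, 0.81, 0.97]

equated = linear_equating(promis_t, hui3)
predicted = regression_prediction(promis_t, hui3)

print("SD observed :", round(statistics.stdev(hui3), 4))
print("SD equated  :", round(statistics.stdev(equated), 4))
print("SD predicted:", round(statistics.stdev(predicted), 4))
```

    In this sketch the equated scores keep the spread of the observed scores, while the regression predictions are compressed whenever the correlation is below 1.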

  2. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as the criterion for R² change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items with items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  3. PROMIS GH (Patient-Reported Outcomes Measurement Information System Global Health) Scale in Stroke: A Validation Study.

    Science.gov (United States)

    Katzan, Irene L; Lapin, Brittany

    2018-01-01

    The International Consortium for Health Outcomes Measurement recently included the 10-item PROMIS GH (Patient-Reported Outcomes Measurement Information System Global Health) scale as part of their recommended Standard Set of Stroke Outcome Measures. Before collection of PROMIS GH is broadly implemented, it is necessary to assess its performance in the stroke population. The objective of this study was to evaluate the psychometric properties of PROMIS GH in patients with ischemic stroke and intracerebral hemorrhage. PROMIS GH and 6 PROMIS domain scales measuring same/similar constructs were electronically collected on 1102 patients with ischemic and hemorrhagic strokes at various stages of recovery from their stroke who were seen in a cerebrovascular clinic from October 12, 2015, through June 2, 2017. Confirmatory factor analysis was performed to evaluate the adequacy of 2-factor structure of component scores. Test-retest reliability and convergent validity of PROMIS GH items and component scores were assessed. Discriminant validity and responsiveness were compared between PROMIS GH and PROMIS domain scales measuring the same or related constructs. Analyses were repeated stratified by stroke subtype and modified Rankin Scale score validity was good with significant correlations between all PROMIS GH items and PROMIS domain scales ( P 0.5) was demonstrated for 8 of the 10 PROMIS GH items. Reliability and validity remained consistent across stroke subtype and disability level (modified Rankin Scale, <2 versus ≥2). PROMIS GH exhibits acceptable performance in patients with stroke. Our findings support International Consortium for Health Outcomes Measurement recommendation to use PROMIS GH as part of the standard set of outcome measures in stroke. © 2017 American Heart Association, Inc.

  4. Calibration of the Dutch-Flemish PROMIS Pain Behavior item bank in patients with chronic pain.

    Science.gov (United States)

    Crins, M H P; Roorda, L D; Smits, N; de Vet, H C W; Westhovens, R; Cella, D; Cook, K F; Revicki, D; van Leeuwen, J; Boers, M; Dekker, J; Terwee, C B

    2016-02-01

    The aims of the current study were to calibrate the item parameters of the Dutch-Flemish PROMIS Pain Behavior item bank using a sample of Dutch patients with chronic pain and to evaluate cross-cultural validity between the Dutch-Flemish and the US PROMIS Pain Behavior item banks. Furthermore, reliability and construct validity of the Dutch-Flemish PROMIS Pain Behavior item bank were evaluated. The 39 items in the bank were completed by 1042 Dutch patients with chronic pain. To evaluate unidimensionality, a one-factor confirmatory factor analysis (CFA) was performed. A graded response model (GRM) was used to calibrate the items. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was evaluated. Reliability of the item bank was also examined and construct validity was studied using several legacy instruments, e.g. the Roland Morris Disability Questionnaire. CFA supported the unidimensionality of the Dutch-Flemish PROMIS Pain Behavior item bank (CFI = 0.960, TLI = 0.958). The data also fit the GRM, and the items demonstrated good coverage across the pain behavior construct (threshold parameters range: -3.42 to 3.54). Analysis showed good cross-cultural validity (only six DIF items), reliability (Cronbach's α = 0.95) and construct validity (all correlations ≥0.53). The Dutch-Flemish PROMIS Pain Behavior item bank was found to have good cross-cultural validity, reliability and construct validity. The development of the Dutch-Flemish PROMIS Pain Behavior item bank will serve as the basis for Dutch-Flemish PROMIS short forms and computer adaptive testing (CAT). © 2015 European Pain Federation - EFIC®
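
    For readers unfamiliar with the graded response model named in this abstract, the sketch below shows how category probabilities follow from one discrimination parameter and a set of ordered threshold parameters. It uses the standard logistic form of the model; the function name and the parameter values are hypothetical and are not taken from the calibrated item bank.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under the graded response model.

    theta          : latent trait level of the respondent
    discrimination : item slope (often written a)
    thresholds     : increasing threshold parameters (often written b_1..b_m)
    Returns m + 1 probabilities, one per response category.
    """
    # Cumulative probabilities P(X >= k); P(X >= 0) = 1 and P(X >= m + 1) = 0.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-category item.
a = 2.0
b = [-1.5, -0.5, 0.5, 1.5]

for theta in (-3.0, 0.0, 3.0):
    probs = grm_category_probs(theta, a, b)
    print(theta, [round(p, 3) for p in probs])
```

    Thresholds spread over a wide range, such as the -3.42 to 3.54 reported above, mean that some response category is informative at almost every trait level.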

  5. Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient and the use in practice is considered feasible with little administration time, offering standardized and routine patient monitoring. Before an item bank can be used as CAT, the psychometric properties of the item bank have to be examined. Therefore, the objective was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy. Cross-sectional study. 805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items). Unidimensionality was examined by Confirmatory Factor Analysis and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed. The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, 1st factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameters range: -4.28 to 2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI. The psychometric properties of the DF-PROMIS-PF item bank are sufficient. The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of

  6. Development of six PROMIS pediatrics proxy-report item banks.

    Science.gov (United States)

    Irwin, Debra E; Gross, Heather E; Stucky, Brian D; Thissen, David; DeWitt, Esi Morgan; Lai, Jin Shei; Amtmann, Dagmar; Khastou, Leyla; Varni, James W; DeWalt, Darren A

    2012-02-22

    Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children aged 5 to 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily asthma, which was diagnosed or treated within 6

  7. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

    Science.gov (United States)

    Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

    2014-01-01

    Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
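
    The adaptive item selection described in this abstract can be sketched in a few lines. The example is a simplification under stated assumptions: it uses a dichotomous two-parameter logistic model instead of the polytomous models used for PROMIS banks, picks the unused item with the highest Fisher information at the current estimate, and re-estimates the trait with an expected a posteriori (EAP) mean over a grid with a standard normal prior. The five-item bank, the simulated respondent and all function names are hypothetical.

```python
import math


def p_endorse(theta, a, b):
    """Two-parameter logistic probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4.0 to 4.0


def eap_estimate(responses, bank):
    """Posterior mean of theta given (item_index, 0/1) responses."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for idx, u in responses:
            a, b = bank[idx]
            p = p_endorse(theta, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


def run_cat(bank, answer, n_items):
    """Administer n_items adaptively; return item order and final estimate."""
    administered, responses, theta = [], [], 0.0
    for _ in range(n_items):
        remaining = [i for i in range(len(bank)) if i not in administered]
        nxt = max(remaining, key=lambda i: item_information(theta, *bank[i]))
        administered.append(nxt)
        responses.append((nxt, answer(nxt)))
        theta = eap_estimate(responses, bank)
    return administered, theta


# Hypothetical bank of (discrimination, difficulty) pairs.
bank = [(1.2, -2.0), (1.8, -1.0), (2.0, 0.0), (1.7, 1.0), (1.3, 2.0)]
true_theta = 1.2


def answer(item_index):
    """Deterministic simulated respondent: endorses items below true_theta."""
    return 1 if true_theta > bank[item_index][1] else 0


order, theta_hat = run_cat(bank, answer, n_items=3)
print("items administered:", order)
print("final estimate    :", round(theta_hat, 2))
```

    A production CAT would add a stopping rule based on the standard error of the estimate rather than a fixed number of items.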

  8. Development of six PROMIS pediatrics proxy-report item banks

    Directory of Open Access Journals (Sweden)

    Irwin Debra E

    2012-02-01

    Background: Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. Methods: The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children aged 5 to 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Results: Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily

  9. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Science.gov (United States)

    Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity, correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.
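
    Cronbach's alpha, one of the reliability statistics reported in this abstract, is computed from the item variances and the variance of the total score. The sketch below uses the standard formula on a small hypothetical response matrix (five respondents, four items); the function name and data are illustrative only.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([row[i] for row in item_scores]) for i in range(k)
    ]
    # Sample variance of the summed score.
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses on a 1-5 scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]

alpha = cronbach_alpha(responses)
print("Cronbach's alpha:", round(alpha, 3))
```

    Alpha rises with the number of items, so a value as high as 0.98 for a 40-item bank partly reflects bank length as well as item consistency.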

  10. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity, correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.

  11. Dutch-Flemish translation of nine pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS)®.

    Science.gov (United States)

    Haverman, Lotte; Grootenhuis, Martha A; Raat, Hein; van Rossum, Marion A J; van Dulmen-den Broeder, Eline; Hoppenbrouwers, Karel; Correia, Helena; Cella, David; Roorda, Leo D; Terwee, Caroline B

    2016-03-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS®) is a new, state-of-the-art assessment system for measuring patient-reported health and well-being of adults and children. It has the potential to be more valid, reliable, and responsive than existing PROMs. The item banks are designed to be self-reported and completed by children aged 8-18 years. The PROMIS items can be administered in short forms or through computerized adaptive testing. This paper describes the translation and cultural adaptation of nine PROMIS item banks (151 items) for children in Dutch-Flemish. The translation was performed by FACITtrans using standardized PROMIS methodology and approved by the PROMIS Statistical Center. The translation included four forward translations, two back-translations, three independent reviews (at least two Dutch, one Flemish), and pretesting in 24 children from the Netherlands and Flanders. For some items, it was necessary to have separate translations for Dutch and Flemish: physical function-mobility (three items), anger (one item), pain interference (two items), and asthma impact (one item). Challenges faced in the translation process included scarcity or overabundance of possible translations, unclear item descriptions, constructs broader/smaller in the target language, difficulties in rank ordering items, differences in unit of measurement, irrelevant items, or differences in performance of activities. By addressing these challenges, acceptable translations were obtained for all items. The Dutch-Flemish PROMIS items are linguistically equivalent to the original USA version. Short forms are now available for use, and entire item banks are ready for cross-cultural validation in the Netherlands and Flanders.

  12. Qualitative Development and Content Validation of the PROMIS Pediatric Sleep Health Items.

    Science.gov (United States)

    Bevans, Katherine B; Meltzer, Lisa J; De La Motte, Anna; Kratchman, Amy; Viél, Dominique; Forrest, Christopher B

    2018-04-25

    To develop the Patient Reported Outcome Measurement Information System (PROMIS) Pediatric Sleep Health item pool and evaluate its content validity. Participants included 8 expert sleep clinician-researchers, 64 children ages 8-17 years, and 54 parents of children ages 5-17 years. We started with item concepts and expressions from the PROMIS Sleep Disturbance and Sleep Related Impairment adult measures. Additional pediatric sleep health concepts were generated by expert (n = 8), child (n = 28), and parent (n = 33) concept elicitation interviews and a systematic review of existing pediatric sleep health questionnaires. Content validity of the item pool was evaluated with item translatability review, readability analysis, and child (n = 36) and parent (n = 21) cognitive interviews. The final pediatric Sleep Health item pool includes 43 items that assess sleep disturbance (children's capacity to fall and stay asleep, sleep quality, dreams, and parasomnias) and sleep-related impairments (daytime sleepiness, low energy, difficulty waking up, and the impact of sleep and sleepiness on cognition, affect, behavior, and daily activities). Items are translatable and relevant and well understood by children ages 8-17 and parents of children ages 5-17. Rigorous qualitative procedures were used to develop and evaluate the content validity of the PROMIS Pediatric Sleep Health item pool. Once the item pool's psychometric properties are established, the scales will be useful for measuring children's subjective experiences of sleep.

  13. Validation of the alcohol use item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS).

    Science.gov (United States)

    Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Daley, Dennis C

    2016-04-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) includes five item banks for alcohol use. There are limited data, however, regarding their validity (e.g., convergent validity, responsiveness to change). To provide such data, we conducted a prospective study with 225 outpatients being treated for substance abuse. Assessments were completed shortly after intake and at 1-month and 3-month follow-ups. The alcohol item banks were administered as computerized adaptive tests (CATs). Fourteen CATs and one six-item short form were also administered from eight other PROMIS domains to generate a comprehensive health status profile. After modeling treatment outcome for the sample as a whole, correlates of outcome from the PROMIS health status profile were examined. For convergent validity, the largest correlation emerged between the PROMIS alcohol use score and the Alcohol Use Disorders Identification Test (r=.79 at intake). Regarding treatment outcome, there were modest changes across the target problem of alcohol use and other domains of the PROMIS health status profile. However, significant heterogeneity was found in initial severity of drinking and in rates of change for both abstinence and severity of drinking during follow-up. This heterogeneity was associated with demographic (e.g., gender) and health-profile (e.g., emotional support, social participation) variables. The results demonstrated the validity of PROMIS CATs, which require only 4-6 items in each domain. This efficiency makes it feasible to use a comprehensive health status profile within the substance use treatment setting, providing important prognostic information regarding abstinence and severity of drinking. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  14. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Martijn A H Oude Voshaar

    OBJECTIVE: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. METHODS: Data were collected from RA patients enrolled in the Dutch DREAM registry. An incomplete longitudinal anchored design was used where patients completed all 121 items of the item bank over the course of three waves of data collection. Item responses were fit to a generalized partial credit model adapted for longitudinal data and the item parameters were examined for differential item functioning (DIF) across country, age, and sex. RESULTS: In total, 690 patients participated in the study at time point 1 (T2, N = 489; T3, N = 311). The item bank could be successfully fitted to a generalized partial credit model, with the number of misfitting items falling within acceptable limits. Seven items demonstrated DIF for sex, while 5 items showed DIF for age in the Dutch RA sample. Twenty-five (20%) items were flagged for cross-cultural DIF compared to the US general population. However, the impact of observed DIF on total physical function estimates was negligible. DISCUSSION: The results of this study showed that the PROMIS PF item bank adequately fit a unidimensional IRT model which provides support for applications that require invariant estimates of physical function, such as computer adaptive testing and targeted short forms. More studies are needed to further investigate the cross-cultural applicability of the US-based PROMIS calibration and standardized metric.

  15. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    Science.gov (United States)

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good, as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Examination of the PROMIS upper extremity item bank.

    Science.gov (United States)

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Study design: clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. Level of evidence: 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  17. The PROMIS fatigue item bank has good measurement properties in patients with fibromyalgia and severe fatigue.

    Science.gov (United States)

    Yost, Kathleen J; Waller, Niels G; Lee, Minji K; Vincent, Ann

    2017-06-01

    Efficient management of fibromyalgia (FM) requires precise measurement of FM-specific symptoms. Our objective was to assess the measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) fatigue item bank (FIB) in people with FM. We applied classical psychometric and item response theory methods to cross-sectional PROMIS-FIB data from two samples. Data on the clinical FM sample were obtained at a tertiary medical center. Data for the U.S. general population sample were obtained from the PROMIS network. The full 95-item bank was administered to both samples. We investigated dimensionality of the item bank in both samples by separately fitting a bifactor model with two group factors: experience and impact. We assessed measurement invariance between samples, and we explored an alternate factor structure with the normative sample and subsequently confirmed that structure in the clinical sample. Finally, we assessed whether reporting FM subdomain scores added value over reporting a single total score. The item bank was dominated by a general fatigue factor. The fit of the initial bifactor model and evidence of measurement invariance indicated that the same constructs were measured across the samples. An alternative bifactor model with three group factors demonstrated slightly improved fit. Subdomain scores add value over a total score. We demonstrated that the PROMIS-FIB is appropriate for measuring fatigue in clinical samples of FM patients. The construct can be presented by a single score; however, subdomain scores for the three group factors identified in the alternative model may also be reported.

  18. Validation of the PROMIS Sleep Disturbance and Sleep-Related Impairment item banks in Dutch adolescents.

    Science.gov (United States)

    van Kooten, Jojanneke A M C; van Litsenburg, Raphaële R L; Yoder, Whitney R; Kaspers, Gertjan J L; Terwee, Caroline B

    2018-04-16

    Sleep problems are common in adolescents and have a negative impact on daytime functioning. However, there is a lack of well-validated adolescent sleep questionnaires. The Patient-Reported Outcomes Measurement Information System (PROMIS) Sleep Disturbance and Sleep-Related Impairment item banks are well-validated instruments developed for and tested in adults. The aim of this study was to evaluate their structural validity in adolescents. Test and retest data were collected for the Dutch-Flemish V1.0 PROMIS Sleep Disturbance (27 items) and Sleep-Related Impairment (16 items) item banks from 1046 adolescents (11-19 years). Cross-validation methods, confirmatory factor analysis (CFA), and exploratory factor analysis (EFA) were used. Fit indices and factor loadings were used to improve the models. The final models were assessed for model fit using retest data. The one-factor Sleep Disturbance (CFI = 0.795, TLI = 0.778, RMSEA = 0.117) and Sleep-Related Impairment (CFI = 0.897, TLI = 0.882, RMSEA = 0.156) models could not be replicated in adolescents. Cross-validation resulted in a final Sleep Disturbance model of 23 items and a Sleep-Related Impairment model of 11 items. Retest data CFA showed adequate fit for the Sleep-Related Impairment-11 (CFI = 0.981, TLI = 0.976, RMSEA = 0.116). The Sleep Disturbance-23 model fit indices stayed below the recommended values (CFI = 0.895, TLI = 0.885, RMSEA = 0.105). While the PROMIS Sleep Disturbance-23 for adolescents and PROMIS Sleep-Related Impairment-11 for adolescents provide a framework to assess adolescent sleep, additional research is needed to replicate these findings in a larger and more diverse sample.

  19. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

    Science.gov (United States)

    Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

    2014-05-01

    To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.
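
    The norming described above (mean of 50, SD of 10 in a US general population sample) corresponds to a linear T-score rescaling. The following is a minimal sketch, assuming the trait estimate (theta) is already standardized to mean 0 and SD 1 in the reference population; the function name is illustrative and does not come from the paper.

```python
def theta_to_t_score(theta: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Rescale a standardized trait estimate to a T-score metric.

    Assumption: theta has mean 0 and SD 1 in the reference population,
    so the result has the given mean and SD in that same population.
    """
    return mean + sd * theta


if __name__ == "__main__":
    # One SD below, at, and 1.5 SD above the reference mean.
    for theta in (-1.0, 0.0, 1.5):
        print(f"theta={theta:+.1f} -> T={theta_to_t_score(theta):.1f}")
```

    Under this assumption, a T-score of 65 would sit 1.5 SD above the reference-population mean.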

  20. Assessing nicotine dependence in adolescent E-cigarette users: The 4-item Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for electronic cigarettes.

    Science.gov (United States)

    Morean, Meghan E; Krishnan-Sarin, Suchitra; O'Malley, Stephanie S

    2018-04-26

    Adolescent e-cigarette use (i.e., "vaping") likely confers risk for developing nicotine dependence. However, there have been no studies assessing e-cigarette nicotine dependence in youth. We evaluated the psychometric properties of the 4-item Patient-Reported Outcomes Measurement Information System Nicotine Dependence Item Bank for E-cigarettes (PROMIS-E) for assessing youth e-cigarette nicotine dependence and examined risk factors for experiencing stronger dependence symptoms. In 2017, 520 adolescent past-month e-cigarette users completed the PROMIS-E during a school-based survey (50.5% female, 84.8% White, 16.22[1.19] years old). Adolescents also reported on sex, grade, race, age at e-cigarette use onset, vaping frequency, nicotine e-liquid use, and past-month cigarette smoking. Analyses included conducting confirmatory factor analysis and examining the internal consistency of the PROMIS-E. Bivariate correlations and independent-samples t-tests were used to examine unadjusted relationships between e-cigarette nicotine dependence and the proposed risk factors. Regression models were run in which all potential risk factors were entered as simultaneous predictors of PROMIS-E scores. The single-factor structure of the PROMIS-E was confirmed and evidenced good internal consistency. Across models, larger PROMIS-E scores were associated with being in a higher grade, initiating e-cigarette use at an earlier age, vaping more frequently, using nicotine e-liquid (and higher nicotine concentrations), and smoking cigarettes. Adolescent e-cigarette users reported experiencing nicotine dependence, which was assessed using the psychometrically sound PROMIS-E. Experiencing stronger nicotine dependence symptoms was associated with characteristics that previously have been shown to confer risk for frequent vaping and tobacco cigarette dependence. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency

    DEFF Research Database (Denmark)

    Rose, Matthias; Bjørner, Jakob; Gandek, Barbara

    2014-01-01

    OBJECTIVE: To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. STUDY DESIGN AND SETTING: The items were evaluated using qualitative and quantitative methods. A total...... response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. RESULTS: The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living...... to identify differences between age and disease groups. CONCLUSION: The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range....

  2. Qualitative Evaluation of Pediatric Pain Behavior, Quality, and Intensity Item Candidates and the PROMIS Pain Domain Framework in Children With Chronic Pain.

    Science.gov (United States)

    Jacobson, C Jeffrey; Kashikar-Zuck, Susmita; Farrell, Jennifer; Barnett, Kimberly; Goldschneider, Ken; Dampier, Carlton; Cunningham, Natoshia; Crosby, Lori; DeWitt, Esi Morgan

    2015-12-01

    As initial steps in a broader effort to develop and test pediatric pain behavior and pain quality item banks for the Patient-Reported Outcomes Measurement Information System (PROMIS), we used qualitative interview and item review methods to 1) evaluate the overall conceptual scope and content validity of the PROMIS pain domain framework among children with chronic/recurrent pain conditions, and 2) develop item candidates for further psychometric testing. To elicit the experiential and conceptual scope of pain outcomes across a variety of pediatric recurrent/chronic pain conditions, we conducted 32 semi-structured individual and 2 focus-group interviews with children and adolescents (8-17 years), and 32 individual and 2 focus-group interviews with parents of children with pain. Interviews with pain experts (10) explored the operational limits of pain measurement in children. For item bank development, we identified existing items from measures in the literature, grouped them by concept, removed redundancies, and modified the remaining items to match PROMIS formatting. New items were written as needed and cognitive debriefing was completed with the children and their parents, resulting in 98 pain behavior (47 self, 51 proxy), 54 quality, and 4 intensity items for further testing. Qualitative content analyses suggest that reportable pain outcomes that matter to children with pain are captured within and consistent with the pain domain framework in PROMIS. PROMIS pediatric pain behavior, quality, and intensity items were developed based on a theoretical framework of pain that was evaluated by multiple stakeholders in the measurement of pediatric pain, including researchers, clinicians, and children with pain and their parents, and the appropriateness of the framework was verified. Copyright © 2015 American Pain Society. Published by Elsevier Inc. All rights reserved.

  3. Calibration of the PROMIS Physical Function Item Bank in Dutch Patients with Rheumatoid Arthritis

    NARCIS (Netherlands)

    Oude Voshaar, M.A.H.; ten Klooster, P.M.; Glas, C.A.W.; Vonkeman, H.E.; Taal, E; Krishnan, E.; Moens, H.J.B.; Boers, M.; Terwee, C.B.; van Riel, P.L.C.M.; van de Laar, M.A.F.J.

    2014-01-01

    Objective: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. Methods: Data were collected from RA patients enrolled in the

  4. Psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for use with electronic cigarettes.

    Science.gov (United States)

    Morean, Meghan; Krishnan-Sarin, Suchitra; Sussman, Steve; Foulds, Jonathan; Fishbein, Howard; Grana, Rachel; O'Malley, Stephanie S

    2018-01-02

    Psychometrically sound measures of e-cigarette dependence are lacking. We modified the PROMIS Nicotine Dependence Item Banks for use with e-cigarettes and evaluated the psychometrics of the 22-, 8- and 4-item adapted versions. 1009 adults who reported using e-cigarettes at least weekly completed an anonymous survey in Summer 2016 (50.2% male, 77.1% White, mean age 35.81 [10.71], 66.4% daily e-cigarette users, 72.6% current cigarette smokers). Psychometric analyses included confirmatory factor analysis, internal consistency, measurement invariance, examination of mean-level differences, convergent validity, and test-criterion relationships with e-cigarette use outcomes. All PROMIS-E versions had confirmable, internally consistent latent structures that were scalar invariant by sex, race, e-cigarette use (non-daily/daily), e-liquid nicotine content (no/yes), and current cigarette smoking status (no/yes). Daily e-cigarette users, nicotine e-liquid users, and cigarette smokers reported being more dependent on e-cigarettes than their counterparts. All PROMIS-E versions correlated strongly with one another, evidenced convergent validity with the Penn State E-cigarette Dependence Index and time to first e-cigarette use in the morning, and evidenced test-criterion relationships with vaping frequency, e-liquid nicotine concentration, and e-cigarette quit attempts. Similar results were observed when analyses were conducted within subsamples of exclusive e-cigarette users and dual-users of cigarettes and e-cigarettes. Each PROMIS-E version evidenced strong psychometric properties for assessing e-cigarette dependence in adults who either use e-cigarettes exclusively or who are dual-users of cigarettes and e-cigarettes. However, results indicated little benefit of the longer versions over the 4-item PROMIS-E, which provides an efficient assessment of e-cigarette dependence. The availability of the novel, psychometrically sound PROMIS-E can further research on a wide range of

  5. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    Science.gov (United States)

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  6. Development of the NIH PROMIS® Sexual Function and Satisfaction measures in patients with cancer.

    Science.gov (United States)

    Flynn, Kathryn E; Lin, Li; Cyranowski, Jill M; Reeve, Bryce B; Reese, Jennifer Barsky; Jeffery, Diana D; Smith, Ashley Wilder; Porter, Laura S; Dombeck, Carrie B; Bruner, Deborah Watkins; Keefe, Francis J; Weinfurt, Kevin P

    2013-02-01

    We describe the development and validation of the Patient-Reported Outcomes Measurement Information System(®) Sexual Function and Satisfaction (PROMIS(®) SexFS; National Institutes of Health) measures, version 1.0, for cancer populations. To develop a customizable self-report measure of sexual function and satisfaction as part of the U.S. National Institutes of Health PROMIS Network. Our multidisciplinary working group followed a comprehensive protocol for developing psychometrically robust patient-reported outcome measures including qualitative (scale development) and quantitative (psychometric evaluation) development. We performed an extensive literature review, conducted 16 focus groups with cancer patients and multiple discussions with clinicians, and evaluated candidate items in cognitive testing with patients. We administered items to 819 cancer patients. Items were calibrated using item-response theory and evaluated for reliability and validity. The PROMIS SexFS measures, version 1.0, include 81 items in 11 domains: Interest in Sexual Activity, Lubrication, Vaginal Discomfort, Erectile Function, Global Satisfaction with Sex Life, Orgasm, Anal Discomfort, Therapeutic Aids, Sexual Activities, Interfering Factors, and Screener Questions. In addition to content validity (patients indicate that items cover important aspects of their experiences) and face validity (patients indicate that items measure sexual function and satisfaction), the measure shows evidence for discriminant validity (domains discriminate between groups expected to be different) and convergent validity (strong correlations between scores on PROMIS and scores on conceptually similar older measures of sexual function), as well as favorable test-retest reliability among people not expected to change (interclass correlations from two administrations of the instrument, 1 month apart). The PROMIS SexFS offers researchers a reliable and valid set of tools to measure self-reported sexual function

  7. Use of PROMIS for Patients Undergoing Primary Total Shoulder Arthroplasty.

    Science.gov (United States)

    Dowdle, S Blake; Glass, Natalie; Anthony, Chris A; Hettrich, Carolyn M

    2017-09-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) consists of question banks for health domains through computer adaptive testing (CAT). For patients with glenohumeral arthritis, (1) there would be high correlation between traditional patient-reported outcome (PRO) measures and the PROMIS upper extremity item bank (PROMIS UE) and PROMIS physical function CAT (PROMIS PF CAT), and (2) PROMIS PF CAT would not demonstrate ceiling effects. Cohort study (diagnosis); Level of evidence, 3. Sixty-one patients with glenohumeral osteoarthritis were included. Each patient completed the American Shoulder and Elbow Surgeons (ASES) assessment form, Marx Shoulder Activity Scale, Short Form-36 physical function scale (SF-36 PF), EuroQol 5 Dimensions (EQ-5D) questionnaire, Western Ontario Osteoarthritis Shoulder (WOOS) index, PROMIS PF CAT, and the PROMIS UE. Correlation was defined as high (>0.7), moderate (0.4-0.6), or weak (0.2-0.3). Significant floor and ceiling effects were present if more than 15% of individuals scored the lowest or highest possible total score on any PRO. The PROMIS PF demonstrated excellent correlation with the SF-36 PF ( r = 0.81, P ceiling or floor effects observed. The mean number of items administered by the PROMIS PRO was 4. These data suggest that for a patient population with operative shoulder osteoarthritis, PROMIS UE and PROMIS PF CAT may be valid alternative PROs. Additionally, PROMIS PF CAT offers a decreased question burden with no ceiling effects.
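
    The two decision rules stated in this abstract (correlation bands and the 15% floor/ceiling criterion) can be sketched as follows. This is only an illustration: the function names are hypothetical, and because the abstract leaves values between the stated bands (for example 0.65) undefined, the sketch labels them "unclassified" rather than guessing.

```python
from typing import Sequence, Tuple


def classify_correlation(r: float) -> str:
    """Label |r| using the bands given in the abstract.

    high: > 0.7, moderate: 0.4 to 0.6, weak: 0.2 to 0.3.
    Values outside these bands are left unclassified (the abstract
    does not say how to treat them).
    """
    a = abs(r)
    if a > 0.7:
        return "high"
    if 0.4 <= a <= 0.6:
        return "moderate"
    if 0.2 <= a <= 0.3:
        return "weak"
    return "unclassified"


def floor_ceiling_shares(scores: Sequence[float],
                         lowest: float,
                         highest: float) -> Tuple[float, float]:
    """Return the share of respondents at the lowest and highest possible score."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor = sum(1 for s in scores if s == lowest) / n
    ceiling = sum(1 for s in scores if s == highest) / n
    return floor, ceiling


def is_significant_effect(share: float, threshold: float = 0.15) -> bool:
    """Apply the abstract's rule: more than 15% of individuals at an extreme."""
    return share > threshold


if __name__ == "__main__":
    # Made-up scores on a hypothetical 0-100 instrument, for illustration only.
    demo = [0, 0, 50, 50, 50, 50, 50, 50, 50, 100]
    floor, ceiling = floor_ceiling_shares(demo, lowest=0, highest=100)
    print(classify_correlation(0.81), is_significant_effect(floor),
          is_significant_effect(ceiling))
```

    The demonstration scores are invented and are not data from the study.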

  8. Use of PROMIS for Patients Undergoing Primary Total Shoulder Arthroplasty

    Science.gov (United States)

    Dowdle, S. Blake; Glass, Natalie; Anthony, Chris A.; Hettrich, Carolyn M.

    2017-01-01

    Background: The Patient-Reported Outcomes Measurement Information System (PROMIS) consists of question banks for health domains through computer adaptive testing (CAT). Hypothesis: For patients with glenohumeral arthritis, (1) there would be high correlation between traditional patient-reported outcome (PRO) measures and the PROMIS upper extremity item bank (PROMIS UE) and PROMIS physical function CAT (PROMIS PF CAT), and (2) PROMIS PF CAT would not demonstrate ceiling effects. Study Design: Cohort study (diagnosis); Level of evidence, 3. Methods: Sixty-one patients with glenohumeral osteoarthritis were included. Each patient completed the American Shoulder and Elbow Surgeons (ASES) assessment form, Marx Shoulder Activity Scale, Short Form–36 physical function scale (SF-36 PF), EuroQol 5 Dimensions (EQ-5D) questionnaire, Western Ontario Osteoarthritis Shoulder (WOOS) index, PROMIS PF CAT, and the PROMIS UE. Correlation was defined as high (>0.7), moderate (0.4-0.6), or weak (0.2-0.3). Significant floor and ceiling effects were present if more than 15% of individuals scored the lowest or highest possible total score on any PRO. Results: The PROMIS PF demonstrated excellent correlation with the SF-36 PF (r = 0.81, P ceiling or floor effects observed. The mean number of items administered by the PROMIS PRO was 4. Conclusion: These data suggest that for a patient population with operative shoulder osteoarthritis, PROMIS UE and PROMIS PF CAT may be valid alternative PROs. Additionally, PROMIS PF CAT offers a decreased question burden with no ceiling effects. PMID:28944248

  9. Danish translation of a physical function item bank from the Patient-Reported Outcome Measurement Information System (PROMIS)

    DEFF Research Database (Denmark)

    Schnohr, Christina W.; Rasmussen, Charlotte L.; Langberg, Henning

    2017-01-01

    of the Physical Function item bank into Danish. METHODS: We followed the PROMIS standard procedure, including: 1) two independent translations, 2) back translation, 3) independent reviews of translation quality, and 4) cognitive interviews with a representative sample of the adult population from the municipality...

  10. Cognitive interviewing methodology in the development of a pediatric item bank: a patient reported outcomes measurement information system (PROMIS) study

    Directory of Open Access Journals (Sweden)

    DeWalt Darren A

    2009-01-01

    Background: The evaluation of patient-reported outcomes (PROs) in health care has seen greater use in recent years, and methods to improve the reliability and validity of PRO instruments are advancing. This paper discusses the cognitive interviewing procedures employed by the Patient Reported Outcomes Measurement Information System (PROMIS) pediatrics group for the purpose of developing a dynamic, electronic item bank for field testing with children and adolescents using novel computer technology. The primary objective of this study was to conduct cognitive interviews with children and adolescents to gain feedback on items measuring physical functioning, emotional health, social health, fatigue, pain, and asthma-specific symptoms. Methods: A total of 88 cognitive interviews were conducted with 77 children and adolescents across two sites on 318 items. From this initial item bank, 25 items were deleted and 35 were revised and underwent a second round of cognitive interviews. A total of 293 items were retained for field testing. Results: Children as young as 8 years of age were able to comprehend the majority of items, response options, directions, and recall period, and to identify problems with language that was difficult for them to understand. Cognitive interviews indicated issues with item comprehension on several items, which led to alternative wording for these items. Conclusion: Children ages 8–17 years were able to comprehend most item stems and response options in the present study. Field testing with the resulting items and response options is presently being conducted as part of the PROMIS Pediatric Item Bank development process.

  11. Comparing and transforming PROMIS utility values to the EQ-5D.

    Science.gov (United States)

    Hartman, John D; Craig, Benjamin M

    2018-03-01

    Summarizing patient-reported outcomes (PROs) on a quality-adjusted life year (QALY) scale is an essential component to any economic evaluation comparing alternative medical treatments. While multiple studies have compared PRO items and instruments based on their psychometric properties, no study has compared the preference-based summary of the EQ-5D-3L and Patient Reported Outcomes Measurement Information System (PROMIS-29) instruments. As part of this comparison, a major aim of this manuscript is to transform PROMIS-29 utility values to an EQ-5D-3L scale. A nationally representative survey of 2623 US adults completed the 29-item PROMIS health profile instrument (PROMIS-29) and the 3-level version of the EQ-5D instrument (EQ-5D-3L). Their responses were summarized on a health utility scale using published estimates. Using regression analysis, PROMIS-29 and EQ-5D-3L utility weights were compared with each other as well as with self-reported general health. PROMIS-29 utility weights were much lower than the EQ-5D-3L weights. However, a correlation coefficient of 0.769 between the utility values of the two instruments suggests that the main discordance is simply a difference in scale between the measures. It is also possible to map PROMIS-29 utility weights onto an EQ-5D-3L scale: EQ-5D-3L losses equal 0.1784 × (PROMIS-29 losses)^0.7286. The published estimates of the PROMIS-29 produce lower utility values than many other health instruments. Mapping the PROMIS-29 estimates to an EQ-5D-3L scale alleviates this issue and allows for a more straightforward comparison between the PROMIS-29 and other common health instruments.
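
    The sketch below treats the reported mapping as a power function, with 0.1784 as the multiplier and 0.7286 as the exponent. This reading, the function name, and the meaning of "loss" (taken here as a non-negative shortfall from full health, in whatever units the published PROMIS-29 estimates use) are assumptions rather than details confirmed by the abstract.

```python
def eq5d_loss_from_promis29_loss(promis29_loss: float,
                                 multiplier: float = 0.1784,
                                 exponent: float = 0.7286) -> float:
    """Map a PROMIS-29 utility loss onto the EQ-5D-3L loss scale.

    Assumes the mapping is multiplier * loss ** exponent and that the
    loss is non-negative; the unit of the loss is not specified here.
    """
    if promis29_loss < 0:
        raise ValueError("loss must be non-negative")
    return multiplier * promis29_loss ** exponent


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the study.
    for loss in (0.0, 0.5, 1.0, 2.0):
        print(f"PROMIS-29 loss {loss:.1f} -> "
              f"EQ-5D-3L loss {eq5d_loss_from_promis29_loss(loss):.4f}")
```

    Because the exponent is below 1, the mapping under this reading is concave: equal increments in PROMIS-29 loss translate into progressively smaller increments in EQ-5D-3L loss.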

  12. Evaluating PROMIS Physical Function Measures in Older Adults at Risk for Alzheimer’s Disease

    Directory of Open Access Journals (Sweden)

    Curtis Tatsuoka PhD

    2016-09-01

    Activities of daily living can be affected by cognitive decline. Self-report measurement of functioning is attractive due to ease of data collection, low cost, and accessibility via technology-assisted means, and for understanding patient perspective. A concern is with reliability of such measurement as cognitive decline occurs. We compared a widely used, self-report “legacy” measure of functioning, Lawton and Brody’s Instrumental Activities of Daily Living Scale (IADLS), with a subset of physical functioning items from the Patient-Reported Outcomes Measurement Information System (PROMIS). The study sample consisted of 304 individuals of varying cognitive status: normal, mild cognitive impairment (MCI), or early dementia. An expert consensus method was used to select PROMIS functional items most relevant to neurocognitive disorder and to identify major functional sub-domains. Selected PROMIS functional subscales and the IADLS were then evaluated with respect to cognitive status. Few PROMIS functional items were useful in identifying MCI, while we reaffirmed the utility of the IADLS. Also, even mild depression levels were found to have negative effects on functioning according to both PROMIS and IADLS.

  13. Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

    Science.gov (United States)

    Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

    2017-06-01

    About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluate dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluate differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.

  14. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    JOSEPH P. EIMICKE

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  15. Migrating from a legacy fixed-format measure to CAT administration: calibrating the PHQ-9 to the PROMIS depression measures.

    Science.gov (United States)

    Gibbons, Laura E; Feldman, Betsy J; Crane, Heidi M; Mugavero, Michael; Willig, James H; Patrick, Donald; Schumacher, Joseph; Saag, Michael; Kitahata, Mari M; Crane, Paul K

    2011-11-01

    We provide detailed instructions for analyzing patient-reported outcome (PRO) data collected with an existing (legacy) instrument so that scores can be calibrated to the PRO Measurement Information System (PROMIS) metric. This calibration facilitates migration to computerized adaptive test (CAT) PROMIS data collection, while facilitating research using historical legacy data alongside new PROMIS data. A cross-sectional convenience sample (n = 2,178) from the Universities of Washington and Alabama at Birmingham HIV clinics completed the PROMIS short form and Patient Health Questionnaire (PHQ-9) depression symptom measures between August 2008 and December 2009. We calibrated the tests using item response theory. We compared measurement precision of the PHQ-9, the PROMIS short form, and simulated PROMIS CAT. Dimensionality analyses confirmed the PHQ-9 could be calibrated to the PROMIS metric. We provide code used to score the PHQ-9 on the PROMIS metric. The mean standard errors of measurement were 0.49 for the PHQ-9, 0.35 for the PROMIS short form, and 0.37, 0.28, and 0.27 for 3-, 8-, and 9-item-simulated CATs. The strategy described here facilitated migration from a fixed-format legacy scale to PROMIS CAT administration and may be useful in other settings.
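
    One way to read the standard errors reported above is to convert them to an approximate reliability. The sketch below uses the common relation reliability = 1 - SEM^2 / variance; it assumes the SEMs are expressed on a standardized latent metric with unit variance, which the abstract does not state, and the function name is illustrative.

```python
def reliability_from_sem(sem: float, trait_sd: float = 1.0) -> float:
    """Approximate reliability implied by a standard error of measurement.

    Assumption: sem and trait_sd are on the same metric; with the default
    trait_sd of 1.0 this treats the SEM as being on a standardized scale.
    """
    if trait_sd <= 0:
        raise ValueError("trait_sd must be positive")
    return 1.0 - (sem / trait_sd) ** 2


if __name__ == "__main__":
    # Mean SEMs quoted in the abstract, keyed by instrument.
    reported = {
        "PHQ-9": 0.49,
        "PROMIS short form": 0.35,
        "3-item CAT": 0.37,
        "8-item CAT": 0.28,
        "9-item CAT": 0.27,
    }
    for name, sem in reported.items():
        print(f"{name}: SEM {sem:.2f} -> reliability {reliability_from_sem(sem):.3f}")
```

    If the SEMs were instead reported on the T-score metric, trait_sd would need to be set to 10 rather than left at the default.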

  16. Understanding health-related quality of life in caregivers of civilians and service members/veterans with traumatic brain injury: Establishing the reliability and validity of PROMIS Fatigue and Sleep Disturbance item banks.

    Science.gov (United States)

    Carlozzi, Noelle E; Ianni, Phillip A; Tulsky, David S; Brickell, Tracey A; Lange, Rael T; French, Louis M; Cella, David; Kallen, Michael A; Miner, Jennifer A; Kratz, Anna L

    2018-06-19

    Objective: To examine the reliability and validity of Patient Reported Outcomes Measurement Information System (PROMIS) measures of sleep disturbance and fatigue in TBI caregivers and to determine the severity of fatigue and sleep disturbance in these caregivers. Design: Cross-sectional survey data collected through an online data capture platform. Setting: Four rehabilitation hospitals and Walter Reed National Military Medical Center. Participants: Caregivers (N=560) of civilians (n=344) and service member/veterans (n=216) with TBI. Interventions: Not applicable. Main outcome measures: PROMIS sleep and fatigue measures administered as both computerized adaptive tests (CATs) and 4-item short forms (SFs). Results: For both samples, floor and ceiling effects for the PROMIS measures were low (internal consistency was very good (all alphas ≥0.80), and test-retest reliability was acceptable (all r≥0.70 except for the fatigue CAT in the service member/veteran sample r=0.63). Convergent validity was supported by moderate correlations between the PROMIS and related measures. Discriminant validity was supported by low correlations between PROMIS measures and measures of dissimilar constructs. PROMIS scores indicated significantly worse sleep and fatigue for those caring for someone with high levels versus low levels of impairment. Conclusions: Findings support the reliability and validity of the PROMIS CAT and SF measures of sleep disturbance and fatigue in caregivers of civilians and service members/veterans with TBI. Copyright © 2018. Published by Elsevier Inc.

  17. Comparative Responsiveness of the PROMIS Pain Interference Short Forms, Brief Pain Inventory, PEG, and SF-36 Bodily Pain Subscale.

    Science.gov (United States)

    Kean, Jacob; Monahan, Patrick O; Kroenke, Kurt; Wu, Jingwei; Yu, Zhangsheng; Stump, Tim E; Krebs, Erin E

    2016-04-01

    To compare the sensitivity to change and the responsiveness to intervention of the PROMIS Pain Interference short forms, Brief Pain Inventory (BPI), 3-item PEG scale, and SF-36 Bodily Pain subscale in a sample of patients with persistent musculoskeletal pain of moderate severity. Standardized response means, standardized effect sizes, and receiver operating curve analyses were used to assess change between baseline and 3-month assessments in 250 participants who participated in a randomized clinical effectiveness trial of collaborative telecare management for moderate to severe and persistent musculoskeletal pain. The BPI, PEG, and SF-36 Bodily Pain measures were more sensitive to patient-reported global change than the PROMIS Pain Interference short forms, especially for the clinically improved group, for which the change detected by the PROMIS short forms was not statistically significant. The BPI was more responsive to the clinical intervention than the SF-36 Bodily Pain and PROMIS Pain Interference measures. Post hoc analyses exploring these findings did not suggest that differences in content or rating scale structure (number of response options or anchoring language) adequately explained the observed differences in the detection of change. In this clinical trial, the BPI and PEG measures were better able to detect change than the SF-36 Bodily Pain and PROMIS Pain Interference measures.
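
    The two change statistics named in this abstract have standard textbook definitions, sketched below: the standardized response mean divides the mean change by the standard deviation of the change scores, and the standardized effect size divides it by the standard deviation at baseline. The scores are invented for illustration and are not data from the trial.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Per-participant change (follow-up minus baseline)."""
    return [f - b for b, f in zip(baseline, follow_up)]


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def standardized_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(baseline)


# Hypothetical pain scores (higher = worse) at baseline and 3 months.
baseline = [6, 7, 5, 8]
follow_up = [4, 6, 4, 5]

print(f"SRM = {standardized_response_mean(baseline, follow_up):.2f}")
print(f"SES = {standardized_effect_size(baseline, follow_up):.2f}")
```

    Both values are negative here because the hypothetical scores fall over time; comparing their magnitudes across instruments is what the study describes as comparing sensitivity to change.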

  18. Difference in method of administration did not significantly impact item response

    DEFF Research Database (Denmark)

    Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

    2014-01-01

    PURPOSE: To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS). METHODS: Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function... assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods. RESULTS: Multigroup... levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.

  19. Evaluating measurement invariance across assessment modes of phone interview and computer self-administered survey for the PROMIS measures in a population-based cohort of localized prostate cancer survivors.

    Science.gov (United States)

    Wang, Mian; Chen, Ronald C; Usinger, Deborah S; Reeve, Bryce B

    2017-11-01

    To evaluate measurement invariance (phone interview vs computer self-administered survey) of 15 PROMIS measures completed by a population-based cohort of localized prostate cancer survivors. Participants were part of the North Carolina Prostate Cancer Comparative Effectiveness and Survivorship Study. Of the 952 men who took the phone interview at 24 months post-treatment, 401 also completed the same survey online using a home computer. Unidimensionality of the PROMIS measures was examined using single-factor confirmatory factor analysis (CFA) models. Measurement invariance testing was conducted using longitudinal CFA via a model comparison approach. For strongly or partially strongly invariant measures, changes in the latent factors and factor autocorrelations were also estimated and tested. Six measures (sleep disturbance, sleep-related impairment, diarrhea, illness impact-negative, illness impact-positive, and global satisfaction with sex life) had locally dependent items, and therefore model modifications had to be made on these domains prior to measurement invariance testing. Overall, seven measures achieved strong invariance (all items had equal loadings and thresholds), and four measures achieved partial strong invariance (each measure had one item with unequal loadings and thresholds). Three measures (pain interference, interest in sexual activity, and global satisfaction with sex life) failed to establish configural invariance due to between-mode differences in factor patterns. This study supports the use of phone-based live interviewers in lieu of PC-based assessment (when needed) for many of the PROMIS measures.

  20. The PROMIS physical function correlates with the QuickDASH in patients with upper extremity illness.

    Science.gov (United States)

    Overbeek, Celeste L; Nota, Sjoerd P F T; Jayakumar, Prakash; Hageman, Michiel G; Ring, David

    2015-01-01

    To assess disability more efficiently with less burden on the patient, the National Institutes of Health has developed the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function, an instrument based on item response theory and using computer adaptive testing (CAT). Initially, upper and lower extremity disabilities were not separated and we were curious if the PROMIS Physical Function CAT could measure upper extremity disability and the Quick Disability of Arm, Shoulder and Hand (QuickDASH). We aimed to find correlation between the PROMIS Physical Function and the QuickDASH questionnaires in patients with upper extremity illness. Secondarily, we addressed whether the PROMIS Physical Function and QuickDASH correlate with the PROMIS Depression CAT and PROMIS Pain Interference CAT instruments. Finally, we assessed factors associated with QuickDASH and PROMIS Physical Function in multivariable analysis. A cohort of 93 outpatients with upper extremity illnesses completed the QuickDASH and three PROMIS CAT questionnaires: Physical Function, Pain Interference, and Depression. Pain intensity was measured with an 11-point ordinal measure (0-10 numeric rating scale). Correlation between PROMIS Physical Function and the QuickDASH was assessed. Factors that correlated with the PROMIS Physical Function and QuickDASH were assessed in multivariable regression analysis after initial bivariate analysis. There was a moderate correlation between the PROMIS Physical Function and the QuickDASH questionnaire (r=-0.55, p<0.001). Greater disability as measured with the PROMIS and QuickDASH correlated most strongly with PROMIS Depression (r=-0.35, p<0.001 and r=0.34, p<0.001 respectively) and Pain Interference (r=-0.51, p<0.001 and r=0.74, p<0.001 respectively). The factors accounting for the variability in PROMIS scores are comparable to those for the QuickDASH except that the PROMIS Physical Function is influenced by other pain conditions while the QuickDASH is

  1. Understanding Health-related Quality of Life in Caregivers of Civilians and Service Members/Veterans with Traumatic Brain Injury: Establishing the Reliability and Validity of PROMIS Mental Health Measures.

    Science.gov (United States)

    Carlozzi, Noelle E; Hanks, Robin; Lange, Rael T; Brickell, Tracey A; Ianni, Phillip A; Miner, Jennifer A; French, Louis M; Kallen, Michael A; Sander, Angelle M

    2018-06-19

    To provide important reliability and validity data to support the use of the PROMIS Mental Health measures in caregivers of civilians or service members/veterans with traumatic brain injury (TBI). Patient-reported outcomes surveys administered through an electronic data collection platform. Three TBI Model Systems rehabilitation hospitals, an academic medical center, and a military medical treatment facility. 560 caregivers of individuals with a documented TBI (344 civilians and 216 military). Intervention: not applicable. Main outcome measures: PROMIS Anxiety, Depression, and Anger item banks. Results: Internal consistency for all of the PROMIS Mental Health item banks was very good (all α > .86) and three-week test-retest reliability was good to adequate (ranged from .65 to .85). Convergent validity and discriminant validity of the PROMIS measures were also supported. Caregivers of individuals who were low functioning had worse emotional HRQOL (as measured by the three PROMIS measures) than caregivers of high-functioning individuals, supporting known-groups validity. Finally, levels of distress, as measured by the PROMIS measures, were elevated for those caring for low-functioning individuals in both samples (rates ranged from 26.2% to 43.6% for caregivers of low-functioning individuals). Results support the reliability and validity of the PROMIS Anxiety, Depression, and Anger item banks in caregivers of civilians and service members/veterans with TBI. Ultimately, these measures can be used to provide a standardized assessment of HRQOL as it relates to mental health in these caregivers. Copyright © 2018. Published by Elsevier Inc.
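
    Internal consistency of the kind reported here (α) is usually Cronbach's alpha. The sketch below is a minimal implementation of the standard formula, not the authors' analysis code; the response matrix is invented for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

    Alpha approaches 1 as items move together across respondents, which is why values above .86 are described as very good.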

  2. Measuring anxiety after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Anxiety item bank and linkage with GAD-7.

    Science.gov (United States)

    Kisala, Pamela A; Tulsky, David S; Kalpakjian, Claire Z; Heinemann, Allen W; Pohlig, Ryan T; Carle, Adam; Choi, Seung W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test to assess anxiety symptoms in individuals with spinal cord injury (SCI), transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a statistical linkage with the Generalized Anxiety Disorder (GAD)-7, a widely used anxiety measure. Grounded-theory based qualitative item development methods; large-scale item calibration field testing; confirmatory factor analysis; graded response model item response theory analyses; statistical linking techniques to transform scores to a PROMIS metric; and linkage with the GAD-7. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Spinal Cord Injury-Quality of Life (SCI-QOL) Anxiety Item Bank. Seven hundred sixteen individuals with traumatic SCI completed 38 items assessing anxiety, 17 of which were PROMIS items. After 13 items (including 2 PROMIS items) were removed, factor analyses confirmed unidimensionality. Item response theory analyses were used to estimate slopes and thresholds for the final 25 items (15 from PROMIS). The observed Pearson correlation between the SCI-QOL Anxiety and GAD-7 scores was 0.67. The SCI-QOL Anxiety item bank demonstrates excellent psychometric properties and is available as a computer adaptive test or short form for research and clinical applications. SCI-QOL Anxiety scores have been transformed to the PROMIS metric and we provide a method to link SCI-QOL Anxiety scores with those of the GAD-7.
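
    To make "slopes and thresholds" concrete, the sketch below evaluates the category probabilities of a single item under Samejima's graded response model. The slope and threshold values are hypothetical, chosen only for illustration; they are not parameters from the SCI-QOL calibration.

```python
import math


def grm_category_probabilities(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    thresholds must be in ascending order; an item with m thresholds
    has m + 1 ordered response categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical item: slope 2.0, thresholds at -1, 0 and 1 on the latent scale.
probabilities = grm_category_probabilities(theta=0.0, slope=2.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probabilities])
```

    A steeper slope concentrates probability in fewer categories at a given trait level, which is what makes high-slope items more informative in an adaptive test.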

  3. Readability and Comprehension of the Geriatric Depression Scale and PROMIS® Physical Function Items in Older African Americans and Latinos.

    Science.gov (United States)

    Paz, Sylvia H; Jones, Loretta; Calderón, José L; Hays, Ron D

    2017-02-01

    Depression and physical function are particularly important health domains for the elderly. The Geriatric Depression Scale (GDS) and the Patient-Reported Outcomes Measurement Information System (PROMIS®) physical function item bank are two surveys commonly used to measure these domains. It is unclear if these two instruments adequately measure these aspects of health in minority elderly. The aim of this study was to estimate the readability of the GDS and PROMIS® physical function items and to assess their comprehensibility using a sample of African American and Latino elderly. Readability was estimated using the Flesch-Kincaid and Flesch Reading Ease (FRE) formulae for English versions, and a Spanish adaptation of the FRE formula for the Spanish versions. Comprehension of the GDS and PROMIS® items by minority elderly was evaluated with 30 cognitive interviews. Readability estimates of a number of items in English and Spanish of the GDS and PROMIS® physical functioning items exceeded the U.S. recommended 5th-grade threshold for vulnerable populations, or were rated as 'fairly difficult', 'difficult', or 'very difficult' to read. Cognitive interviews revealed that many participants felt that more than the two (yes/no) GDS response options were needed to answer the questions. Wording of several PROMIS® items was considered confusing, and interpreting responses was problematic because they were based on using physical aids. Problems with item wording and response options of the GDS and PROMIS® physical function items may reduce reliability and validity of measurement when used with minority elderly.
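
    For reference, the two English-language readability formulas named above can be written as follows. This sketch takes word, sentence, and syllable counts as inputs rather than counting them from text, since syllable counting is tool-dependent; the Spanish adaptation is not shown because the abstract does not say which one was used. The example counts are invented.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short questionnaire item.
words, sentences, syllables = 100, 10, 150

grade = flesch_kincaid_grade(words, sentences, syllables)
print(f"FRE = {flesch_reading_ease(words, sentences, syllables):.1f}")
print(f"Grade level = {grade:.1f}, exceeds 5th grade: {grade > 5}")
```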

  4. Better assessment of physical function: item improvement is neglected but essential.

    Science.gov (United States)

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models

  5. Establishing a common metric for depressive symptoms: linking the BDI-II, CES-D, and PHQ-9 to PROMIS depression.

    Science.gov (United States)

    Choi, Seung W; Schalet, Benjamin; Cook, Karon F; Cella, David

    2014-06-01

    Interest in measuring patient-reported outcomes has increased dramatically in recent decades. This has simultaneously produced numerous assessment options and confusion. In the case of depressive symptoms, there are many commonly used options for measuring the same or a very similar concept. Public and professional reporting of scores can be confused by multiple scale ranges, normative levels, and clinical thresholds. A common reporting metric would have great value and can be achieved when similar instruments are administered to a single sample and then linked to each other to produce cross-walk score tables (e.g., Dorans, 2007; Kolen & Brennan, 2004). Using multiple procedures based on item response theory and equipercentile methods, we produced cross-walk tables linking 3 popular "legacy" depression instruments-the Center for Epidemiologic Studies Depression Scale (Radloff, 1977; N = 747), the Beck Depression Inventory-II (Beck, Steer, & Brown, 1996; N = 748), and the 9-item Patient Health Questionnaire (Kroenke, Spitzer, & Williams, 2001; N = 1,120)-to the depression metric of the National Institutes of Health (NIH) Patient-Reported Outcomes Measurement Information System (PROMIS; Cella et al., 2010). The PROMIS Depression metric is centered on the U.S. general population, matching the marginal distributions of gender, age, race, and education in the 2000 U.S. census (Liu et al., 2010). The linking relationships were evaluated by resampling small subsets and estimating confidence intervals for the differences between the observed and linked PROMIS scores; in addition, PROMIS cutoff scores for depression severity were estimated to correspond with those commonly used with the legacy measures. Our results allow clinicians and researchers to retrofit existing data of 3 popular depression measures to the PROMIS Depression metric and vice versa.
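
    As a rough illustration of how an equipercentile cross-walk table can be built when two instruments are given to the same sample, the sketch below maps each legacy raw score to the PROMIS score at the same percentile rank. It is a simplified, unsmoothed version under that single-group assumption, not the procedure used in the study, and the scores are invented.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, x):
    """Mid-percentile rank of x in [0, 1] within an ascending list of scores."""
    below = bisect_left(sorted_scores, x)
    at_or_below = bisect_right(sorted_scores, x)
    return (below + at_or_below) / (2 * len(sorted_scores))


def quantile(sorted_scores, p):
    """Score at proportion p in [0, 1], using linear interpolation."""
    position = p * (len(sorted_scores) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_scores) - 1)
    fraction = position - lower
    return sorted_scores[lower] + fraction * (sorted_scores[upper] - sorted_scores[lower])


def equipercentile_crosswalk(legacy_scores, promis_scores):
    """Map each observed legacy score to the PROMIS score at the same percentile rank."""
    legacy_sorted = sorted(legacy_scores)
    promis_sorted = sorted(promis_scores)
    return {
        x: quantile(promis_sorted, percentile_rank(legacy_sorted, x))
        for x in sorted(set(legacy_scores))
    }


# Hypothetical paired sample: legacy raw scores and PROMIS T-scores.
legacy = [0, 1, 2, 3, 4]
promis = [40, 45, 50, 55, 60]
for raw, linked in equipercentile_crosswalk(legacy, promis).items():
    print(f"legacy {raw} -> PROMIS {linked:.1f}")
```

    In practice the score distributions are typically smoothed before linking, and the resulting table is checked against IRT-based links, as the abstract describes.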

  6. Using PROMIS for measuring recovery after abdominal surgery: a pilot study

    Directory of Open Access Journals (Sweden)

    Eva van der Meij

    2018-02-01

    Background: To assess the construct validity and responsiveness of the PROMIS Physical Function v1.2 short form 8b (PROMIS-PF) and the PROMIS Ability to Participate in Social Roles and Activities v2.0 short form 8a (PROMIS-APS) in postoperative recovery. Methods: An observational pilot study was conducted in which 30 patients participated, undergoing various forms of abdominal surgery. Patients completed the PROMIS-PF and PROMIS-APS, the Short Form 36 Health Survey (SF-36), and the World Health Organization Disability Assessment Schedule 2.0 (WHODAS) at several time points before and after surgery. The construct validity and responsiveness of the two PROMIS short forms were evaluated by testing pre-defined hypotheses and were considered adequate when at least 75% of the data was consistent with the hypotheses. Construct validity was evaluated by calculating Spearman correlations and the responsiveness by calculating effect sizes. Results: 6/7 (85.7%) of the results were consistent with the hypotheses supporting the construct validity of the PROMIS-PF. For the PROMIS-APS this was the case in 7/15 (46.7%) of the results. For the PROMIS-PF, 6/7 (85.7%) of the results were consistent with the hypotheses, supporting responsiveness. Regarding the responsiveness of the PROMIS-APS, only 7 out of 13 (53.8%) of these results were consistent with the hypotheses. Conclusions: This study supported the construct validity and the responsiveness of the PROMIS-PF v1.2 short form 8b for measuring recovery in abdominal surgery. Considering the major advantages of PROMIS, we recommend the use of the PROMIS-PF in abdominal surgery.

  7. Impact of National Institutes of Health Gastrointestinal PROMIS Measures in Clinical Practice: Results of a Multicenter Controlled Trial.

    Science.gov (United States)

    Almario, Christopher V; Chey, William D; Khanna, Dinesh; Mosadeghi, Sasan; Ahmed, Shahzad; Afghani, Elham; Whitman, Cynthia; Fuller, Garth; Reid, Mark; Bolus, Roger; Dennis, Buddy; Encarnacion, Rey; Martinez, Bibiana; Soares, Jennifer; Modi, Rushaba; Agarwal, Nikhil; Lee, Aaron; Kubomoto, Scott; Sharma, Gobind; Bolus, Sally; Spiegel, Brennan M R

    2016-11-01

    The National Institutes of Health (NIH) created the Patient Reported Outcomes Measurement Information System (PROMIS) to allow efficient, online measurement of patient-reported outcomes (PROs), but it remains untested whether PROMIS improves outcomes. Here, we aimed to compare the impact of gastrointestinal (GI) PROMIS measures vs. usual care on patient outcomes. We performed a pragmatic clinical trial with an off-on study design alternating weekly between intervention (GI PROMIS) and control arms at one Veterans Affairs and three university-affiliated specialty clinics. Adults with GI symptoms were eligible. Intervention patients completed GI PROMIS symptom questionnaires on an e-portal 1 week before their visit; PROs were available for review by patients and their providers before and during the clinic visit. Usual care patients were managed according to customary practices. Our primary outcome was patient satisfaction as determined by the Consumer Assessment of Healthcare Providers and Systems questionnaire. Secondary outcomes included provider interpersonal skills (Doctors' Interpersonal Skills Questionnaire (DISQ)) and shared decision-making (9-item Shared Decision Making Questionnaire (SDM-Q-9)). There were 217 and 154 patients in the GI PROMIS and control arms, respectively. Patient satisfaction was similar between groups (P>0.05). Intervention patients had similar assessments of their providers' interpersonal skills (DISQ 89.4±11.7 vs. 89.8±16.0, P=0.79) and shared decision-making (SDM-Q-9 79.3±12.4 vs. 79.0±22.0, P=0.85) vs. control patients. This is the first controlled trial examining the impact of NIH PROMIS in clinical practice. One-time use of GI PROMIS did not improve patient satisfaction or assessment of provider interpersonal skills and shared decision-making. Future studies examining how to optimize PROs in clinical practice are encouraged before widespread adoption.

  8. Floor Effect of PROMIS Depression CAT Associated With Hasty Completion in Orthopaedic Surgery Patients.

    Science.gov (United States)

    Guattery, Jason M; Dardas, Agnes Z; Kelly, Michael; Chamberlain, Aaron; McAndrew, Christopher; Calfee, Ryan P

    2018-04-01

    The Patient Reported Outcomes Measurement Information System (PROMIS) was developed to provide valid, reliable, and standardized measures to gather patient-reported outcomes for many health domains, including depression, independent of patient condition. Most studies confirming the performance of these measures were conducted with a consented, volunteer study population for testing. Using a study population that has undergone the process of informed consent may be differentiated from the validation group because they are educated specifically as to the purpose of the questions and they will not have answers recorded in their permanent health record. (1) When given as part of routine practice to an orthopaedic population, do PROMIS Physical Function and Depression item banks produce score distributions different than those produced by the populations used to calibrate and validate the item banks? (2) Does the presence of a nonnormal distribution in the PROMIS Depression scores in a clinical population reflect a deliberately hasty answering of questions by patients? (3) Are patients who are reporting minimal depressive symptoms by scoring the minimum score on the PROMIS Depression Computer Adaptive Testing (CAT) distinct from other patients according to demographic data or their scores on other PROMIS assessments? Univariate descriptive statistics and graphic histograms were used to describe the frequency distribution of scores for the Physical Function and Depression item banks for all orthopaedic patients 18 years or older who had an outpatient visit between June 2015 and December 2016. The study population was then broken into two groups based on whether they indicated a lack of depressive symptoms and scored the minimum score (34.2) on the Depression CAT assessment (Floor Group) or not (Standard Group). The distribution of Physical Function CAT scores was compared between the two groups. Finally, a time-per-question value was calculated for both the Physical

  9. Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

    2015-05-01

    To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. Spinal Cord Injury-Quality of Life (SCI-QOL) Depression Item Bank. Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.

  10. Assessing the Equivalence of Paper, Mobile Phone, and Tablet Survey Responses at a Community Mental Health Center Using Equivalent Halves of a 'Gold-Standard' Depression Item Bank.

    Science.gov (United States)

    Brodey, Benjamin B; Gonzalez, Nicole L; Elkin, Kathryn Ann; Sasiela, W Jordan; Brodey, Inger S

    2017-09-06

    The computerized administration of self-report psychiatric diagnostic and outcomes assessments has risen in popularity. If results are similar enough across different administration modalities, then new administration technologies can be used interchangeably and the choice of technology can be based on other factors, such as convenience in the study design. An assessment based on item response theory (IRT), such as the Patient-Reported Outcomes Measurement Information System (PROMIS) depression item bank, offers new possibilities for assessing the effect of technology choice upon results. To create equivalent halves of the PROMIS depression item bank and to use these halves to compare survey responses and user satisfaction among administration modalities (paper, mobile phone, or tablet) with a community mental health care population. The 28 PROMIS depression items were divided into 2 halves based on content and simulations with an established PROMIS response data set. A total of 129 participants were recruited from an outpatient public sector mental health clinic based in Memphis. All participants took both nonoverlapping halves of the PROMIS IRT-based depression items (Part A and Part B): once using paper and pencil, and once using either a mobile phone or tablet. An 8-cell randomization was done on technology used, order of technologies used, and order of PROMIS Parts A and B. Both Parts A and B were administered as fixed-length assessments and both were scored using published PROMIS IRT parameters and algorithms. All 129 participants received either Part A or B via paper assessment. Participants were also administered the opposite assessment, 63 using a mobile phone and 66 using a tablet. There was no significant difference in item response scores for Part A versus B. All 3 of the technologies yielded essentially identical assessment results and equivalent satisfaction levels. Our findings show that the PROMIS depression assessment can be divided into 2 equivalent

  11. Performance of PROMIS for Healthy Patients Undergoing Meniscal Surgery.

    Science.gov (United States)

    Hancock, Kyle J; Glass, Natalie; Anthony, Chris A; Hettrich, Carolyn M; Albright, John; Amendola, Annunziato; Wolf, Brian R; Bollier, Matthew

    2017-06-07

    The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed as an extensive question bank with multiple health domains that could be utilized for computerized adaptive testing (CAT). In the present study, we investigated the use of the PROMIS Physical Function CAT (PROMIS PF CAT) in an otherwise healthy population scheduled to undergo surgery for meniscal injury with the hypotheses that (1) the PROMIS PF CAT would correlate strongly with patient-reported outcome instruments that measure physical function and would not correlate strongly with those that measure other health domains, (2) there would be no ceiling effects, and (3) the test burden would be significantly less than that of the traditional measures. Patients scheduled to undergo meniscal surgery completed the PROMIS PF CAT, Knee injury and Osteoarthritis Outcome Score (KOOS), Marx Knee Activity Rating Scale, Short Form-36 (SF-36), and EuroQol-5 Dimension (EQ-5D) questionnaires. Correlations were defined as high (≥0.7), high-moderate (0.61 to 0.69), moderate (0.4 to 0.6), moderate-weak (0.31 to 0.39), or weak (≤0.3). If ≥15% respondents to a patient-reported outcome measure obtained the highest or lowest possible score, the instrument was determined to have a significant ceiling or floor effect. A total of 107 participants were analyzed. The PROMIS PF CAT had a high correlation with the SF-36 Physical Functioning (PF) (r = 0.82, p ceiling effects, with 0% of the participants achieving the lowest and highest score, respectively. The PROMIS PF CAT correlates strongly with currently used patient-reported outcome measures of physical function and demonstrates no ceiling effects for patients with meniscal injury requiring surgery. It may be a reasonable alternative to more burdensome patient-reported outcome measures.
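
    The correlation bands and the 15% floor/ceiling rule stated in this abstract can be written as a small helper, sketched below. Two details are assumptions rather than anything the abstract specifies: correlations are classified by absolute value, and values falling between the stated bands (for example, between 0.6 and 0.61) are assigned to the higher band. The scores are invented.

```python
def classify_correlation(r: float) -> str:
    """Label a correlation using the bands given in the abstract (by magnitude)."""
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "high"
    if magnitude > 0.6:  # stated band: 0.61 to 0.69
        return "high-moderate"
    if magnitude >= 0.4:  # stated band: 0.4 to 0.6
        return "moderate"
    if magnitude > 0.3:  # stated band: 0.31 to 0.39
        return "moderate-weak"
    return "weak"


def floor_ceiling_effects(scores, lowest_possible, highest_possible, threshold=0.15):
    """Flag a floor or ceiling effect when at least 15% of respondents hit an extreme."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest_possible) / n
    ceiling_share = sum(1 for s in scores if s == highest_possible) / n
    return {"floor": floor_share >= threshold, "ceiling": ceiling_share >= threshold}


print(classify_correlation(0.82))

# Hypothetical scores on a 1-5 scale: 20% at the floor, 10% at the ceiling.
scores = [1, 1, 5, 3, 4, 2, 3, 4, 2, 3]
print(floor_ceiling_effects(scores, lowest_possible=1, highest_possible=5))
```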

  12. Psychometric evaluation of the pediatric and parent-proxy Patient-Reported Outcomes Measurement Information System and the Neurology and Traumatic Brain Injury Quality of Life measurement item banks in pediatric traumatic brain injury.

    Science.gov (United States)

    Bertisch, Hilary; Rivara, Frederick P; Kisala, Pamela A; Wang, Jin; Yeates, Keith Owen; Durbin, Dennis; Zonfrillo, Mark R; Bell, Michael J; Temkin, Nancy; Tulsky, David S

    2017-07-01

    The primary objective is to provide evidence of convergent and discriminant validity for the pediatric and parent-proxy versions of the Patient-Reported Outcomes Measurement Information System (PROMIS) Anxiety, Depression, Anger, Peer Relations, Mobility, Pain Interference, and Fatigue item banks, the Neurology Quality of Life measurement system (Neuro-QOL) Cognition-General Concerns and Stigma item banks, and the Traumatic Brain Injury Quality of Life (TBI-QOL) Executive Function and Headache item banks in a pediatric traumatic brain injury (TBI) sample. Participants were 134 parent-child (ages 8-18 years) dyads. Children all sustained TBI and the dyads completed outcome ratings 6 months after injury at one of six medical centers across the United States. Ratings included PROMIS, Neuro-QOL, and TBI-QOL item banks, as well as the Pediatric Quality of Life inventory (PedsQL), the Health Behavior Inventory (HBI), and the Strengths and Difficulties Questionnaire (SDQ) as legacy criterion measures against which these item banks were validated. The PROMIS, Neuro-QOL, and TBI-QOL item banks demonstrated good convergent validity, as evidenced by moderate to strong correlations with comparable scales on the legacy measures. PROMIS, Neuro-QOL, and TBI-QOL item banks showed weaker correlations with ratings of unrelated constructs on legacy measures, providing evidence of discriminant validity. Our results indicate that the constructs measured by the PROMIS, Neuro-QOL, and TBI-QOL item banks are valid in our pediatric TBI sample and that it is appropriate to use these standardized scores for our primary study analyses.

  13. Cross-cultural adaptation and validation of the PROMIS Global Health scale for the Portuguese language

    Directory of Open Access Journals (Sweden)

    Camila Eugênia Zumpano

    Abstract: The aim of this study was to carry out the cross-cultural adaptation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Global Health scale into Portuguese. The ten Global Health items were cross-culturally adapted using the method proposed by the Functional Assessment of Chronic Illness Therapy (FACIT). The final Portuguese version of the instrument was self-administered to 1,010 participants in Brazil. The precision of the scale was assessed through analysis of floor and ceiling effects, internal consistency reliability, and test-retest reliability. Exploratory and confirmatory factor analyses were used to assess the construct validity and dimensionality of the instrument. Item calibration was performed using the Graded Response Model proposed by Samejima. Four global items required adjustments after the pretest. Analysis of the psychometric properties showed that the Global Health scale has good reliability, with a Cronbach's alpha coefficient of 0.83 and an intraclass correlation coefficient of 0.89. The exploratory and confirmatory factor analyses showed a good fit to the previously established two-dimensional model. The Global Physical Health and Global Mental Health scales showed good coverage of the latent trait according to the Graded Response Model. The PROMIS Global Health items in Portuguese showed equivalence to the original version and satisfactory psychometric properties for use with the Brazilian population in clinical practice and research.

  14. An Item Bank to Measure Systems, Services, and Policies: Environmental Factors Affecting People With Disabilities.

    Science.gov (United States)

    Lai, Jin-Shei; Hammel, Joy; Jerousek, Sara; Goldsmith, Arielle; Miskovic, Ana; Baum, Carolyn; Wong, Alex W; Dashner, Jessica; Heinemann, Allen W

    2016-12-01

    To develop a measure of perceived systems, services, and policies facilitators (see Chapter 5 of the International Classification of Functioning, Disability and Health) for people with neurologic disabilities and to evaluate the effect of perceived systems, services, and policies facilitators on health-related quality of life. Qualitative approaches to develop and refine items. Confirmatory factor analysis including 1-factor confirmatory factor analysis and bifactor analysis to evaluate unidimensionality of items. Rasch analysis to identify misfitting items. Correlational and analysis of variance methods to evaluate construct validity. Community-dwelling individuals participated in telephone interviews or traveled to the academic medical centers where this research took place. Participants (N=571) had a diagnosis of spinal cord injury, stroke, or traumatic brain injury. They were 18 years or older and English speaking. Not applicable. An item bank to evaluate environmental access and support levels of services, systems, and policies for people with disabilities. We identified a general factor defined as "access and support levels of the services, systems, and policies at the level of community living" and 3 local factors defined as "health services," "community living," and "community resources." The systems, services, and policies measure correlated moderately with participation measures: Community Participation Indicators (CPI) - Involvement, CPI - Control over Participation, Quality of Life in Neurological Disorders - Ability to Participate, Quality of Life in Neurological Disorders - Satisfaction with Role Participation, Patient-Reported Outcomes Measurement Information System (PROMIS) Ability to Participate, PROMIS Satisfaction with Role Participation, and PROMIS Isolation. The measure of systems, services, and policies facilitators contains items pertaining to health services, community living, and community resources. 
Investigators and clinicians can measure

  15. Dutch translation and cross-cultural adaptation of the PROMIS® physical function item bank and cognitive pre-test in Dutch arthritis patients.

    Science.gov (United States)

    Oude Voshaar, Martijn Ah; Ten Klooster, Peter M; Taal, Erik; Krishnan, Eswar; van de Laar, Mart Afj

    2012-03-05

    Patient-reported physical function is an established outcome domain in clinical studies in rheumatology. To overcome the limitations of the current generation of questionnaires, the Patient-Reported Outcomes Measurement Information System (PROMIS®) project in the USA has developed calibrated item banks for measuring several domains of health status in people with a wide range of chronic diseases. The aim of this study was to translate and cross-culturally adapt the PROMIS physical function item bank to the Dutch language and to pretest it in a sample of patients with arthritis. The items of the PROMIS physical function item bank were translated using rigorous forward-backward protocols and the translated version was subsequently cognitively pretested in a sample of Dutch patients with rheumatoid arthritis. Few issues were encountered in the forward-backward translation. Only 5 of the 124 items to be translated had to be rewritten because of culturally inappropriate content. Subsequent pretesting showed that overall, questions of the Dutch version were understood as they were intended, while only one item required rewriting. Results suggest that the translated version of the PROMIS physical function item bank is semantically and conceptually equivalent to the original. Future work will be directed at creating a Dutch-Flemish final version of the item bank to be used in research with Dutch speaking populations.

  16. PROMIS (Procurement Management Information System)

    Science.gov (United States)

    1987-01-01

    The PROcurement Management Information System (PROMIS) provides both detailed and summary level information on all procurement actions performed within NASA's procurement offices at Marshall Space Flight Center (MSFC). It provides not only on-line access, but also schedules procurement actions, monitors their progress, and updates Forecast Award Dates. Except for a few computational routines coded in FORTRAN, the majority of the system is coded in a high level language called NATURAL. A relational Data Base Management System called ADABAS is utilized. Certain fields, called descriptors, are set up on each file to allow the selection of records based on a specified value or range of values. The use of like descriptors on different files serves as the link between the files, thus producing a relational data base. Twenty related files are currently being maintained on PROMIS.

  17. PROMYS – Programming synthetic networks for bio-based production of value chemicals – FP7 project

    DEFF Research Database (Denmark)

    Sommer, Morten Otto Alexander

    2017-01-01

    The global chemical industry is transitioning from petrochemical production processes to bio-based production processes. This transition creates a clear market need for technologies that reduce the development time and cost of cell factories. PROMYS will develop, validate and implement a novel... will drastically accelerate the construction, optimization and performance of cell factories by enabling industrial users to impose non-natural objectives on the engineered cell factory. PROMYS will address three major challenges in metabolic engineering that limit the development of new cell factories: 1) Synthetic pathway construction 2) Cell factory optimization 3) Control of populations during fermentation. Ligand responsive regulation and selection systems will directly couple the presence of a desired chemical product or flux state within a cell to the survival of the cell. As such, they allow...

  18. PROMIS PF CAT Outperforms the ODI and SF-36 Physical Function Domain in Spine Patients.

    Science.gov (United States)

    Brodke, Darrel S; Goz, Vadim; Voss, Maren W; Lawrence, Brandon D; Spiker, William Ryan; Hung, Man

    2017-06-15

    The Oswestry Disability Index v2.0 (ODI), SF-36 Physical Function Domain (SF-36 PFD), and PROMIS Physical Function CAT v1.2 (PF CAT) questionnaires were prospectively collected from 1607 patients complaining of back or leg pain, visiting a university-based spine clinic. All questionnaires were collected electronically, using a tablet computer. The aim of this study was to compare the psychometric properties of the PROMIS PF CAT with the ODI and SF-36 Physical Function Domain in the same patient population. Evidence-based decision-making is improved by using high-quality patient-reported outcomes measures. Prior studies have revealed the shortcomings of the ODI and SF-36, commonly used in spine patients. The PROMIS Network has developed measures with excellent psychometric properties. The Physical Function domain, delivered by Computerized Adaptive Testing (PF CAT), performs well in the spine patient population, though to date direct comparisons with common measures have not been performed. Standard Rasch analysis was performed to directly compare the psychometrics of the PF CAT, ODI, and SF-36 PFD. Spearman correlations were computed to examine the correlations of the three instruments. Time required for administration was also recorded. One thousand six hundred seven patients were administered all assessments. The time required to answer all items in the PF CAT, ODI, and SF-36 PFD was 44, 169, and 99 seconds, respectively. The ceiling and floor effects were excellent for the PF CAT (0.81%, 3.86%), while the ceiling effects were marginal and floor effects quite poor for the ODI (6.91% and 44.24%) and SF-36 PFD (5.97% and 23.65%). All instruments significantly correlated with each other. The PROMIS PF CAT outperforms the ODI and SF-36 PFD in the spine patient population and is highly correlated. It has better coverage, while taking less time to administer with fewer questions to answer. 2.
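
    The ceiling and floor percentages reported above can be illustrated with a minimal sketch. It assumes the simplest common definition (the share of respondents at the best or worst possible score); the study used Rasch analysis, so its exact definition may differ. The function name and the example scores are hypothetical.

```python
def floor_ceiling_effects(scores, min_score, max_score):
    """Return (floor_pct, ceiling_pct): percent of respondents at each extreme.

    Assumption: floor = lowest possible score, ceiling = highest possible score.
    Which end counts as "good" depends on the instrument's scoring direction.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s <= min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s >= max_score) / n
    return floor_pct, ceiling_pct


# Made-up scores on a 0-100 scale, purely for illustration.
example_scores = [0, 0, 10, 25, 40, 55, 70, 85, 100, 100]
floor_pct, ceiling_pct = floor_ceiling_effects(example_scores, 0, 100)
print(f"floor: {floor_pct:.1f}%  ceiling: {ceiling_pct:.1f}%")
```

    A large share of respondents at either extreme means the instrument cannot distinguish among them, which is the sense in which the abstract calls the ODI floor effect poor.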

  19. Performance of the PROMIS in Patients After Anterior Cruciate Ligament Reconstruction.

    Science.gov (United States)

    Scott, Elizabeth J; Westermann, Robert; Glass, Nathalie A; Hettrich, Carolyn; Wolf, Brian R; Bollier, Matthew J

    2018-05-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is designed to advance patient-reported outcome (PRO) instruments by utilizing question banks for major health domains. To compare the responsiveness and construct validity of the PROMIS physical function computer adaptive test (PF CAT) with current PRO instruments for patients before and up to 2 years after anterior cruciate ligament (ACL) reconstruction. Cohort study (diagnosis); Level of evidence, 2. Initially, 157 patients completed the PROMIS PF CAT, Short Form-36 Health Survey (SF-36 physical function [PF] and general health [GH]), Marx Activity Rating Scale (MARS), Knee injury and Osteoarthritis Outcome Score (KOOS activities of daily living [ADL], sport, and quality of life [QOL]), and EuroQol-5 dimensions questionnaire (EQ-5D) at 6 weeks, 6 months, and 2 years after ACL reconstruction. Correlations between instruments, ceiling and floor effects, effect sizes (Cohen d ), and standardized response means to describe responsiveness were evaluated. Subgroup analyses compared participants with and without additional arthroscopic procedures using linear mixed models. At baseline, 6 weeks, and 6 months, the PROMIS PF CAT showed excellent or excellent-good correlations with the SF-36 PF ( r = 0.75-0.80, P ceiling or floor effects of all instruments tested, and patients answered, on average, 4 questions. There was no significant difference in baseline physical function scores between subgroups; at follow-up, all groups showed improvements in scores that were not statistically different. The PROMIS PF CAT is a valid tool to assess outcomes after ACL reconstruction up to 2 years after surgery, demonstrating the highest responsiveness to change with the fewest ceiling and floor effects and a low time burden among all instruments tested. The PROMIS PF CAT is a beneficial alternative for assessing physical function in adults before and after ACL reconstruction.
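
    The two responsiveness statistics named above can be sketched as follows. This assumes one common convention from the responsiveness literature: the effect size divides the mean change by the baseline standard deviation, and the standardized response mean (SRM) divides it by the standard deviation of the change scores. The abstract does not say which variant of Cohen d the authors used, and the data below are invented.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (effect_size, srm) for paired baseline and follow-up scores.

    effect_size = mean change / SD of baseline scores (assumed convention)
    srm         = mean change / SD of change scores
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    changes = [f - b for b, f in zip(baseline, follow_up)]
    sd_baseline = stdev(baseline)
    sd_change = stdev(changes)
    if sd_baseline == 0 or sd_change == 0:
        raise ValueError("standard deviation is zero; statistic undefined")
    mean_change = mean(changes)
    return mean_change / sd_baseline, mean_change / sd_change


# Hypothetical T-scores before surgery and at follow-up.
pre = [40, 45, 50, 55, 60]
post = [48, 50, 61, 60, 71]
effect_size, srm = responsiveness(pre, post)
print(f"effect size: {effect_size:.2f}  SRM: {srm:.2f}")
```

    Both statistics express change in standard deviation units, which is what lets instruments with different score ranges be compared on responsiveness.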

  20. PROMIS Physical Function Correlation With NDI and mJOA in the Surgical Cervical Myelopathy Patient Population.

    Science.gov (United States)

    Owen, Robert J; Zebala, Lukas P; Peters, Colleen; McAnany, Steven

    2018-04-15

    Retrospective review. To determine the correlation of Patient-Reported Outcomes Measurement Information System (PROMIS) physical function with Neck Disability Index (NDI) and Modified Japanese Orthopedic Association (mJOA) scores in the surgical cervical myelopathy patient population. Outcome measures such as NDI and mJOA are essential for analyzing treatments for cervical myelopathy. Administrative burdens impose limits on completion of these measures. The PROMIS group developed an outcome measure to improve reporting of patient symptoms and function and to reduce administrative burden. Despite early success, NDI and mJOA have not been compared with PROMIS in patients with cervical myelopathy. This study determines the correlation of NDI and mJOA with PROMIS in surgical patients with cervical myelopathy. A total of 60 patients with cervical myelopathy undergoing surgery were included. PROMIS, NDI, and mJOA were collected preoperatively, and in the first 6 months postoperatively. Correlations between NDI, mJOA, and PROMIS were quantified using Pearson correlation coefficients. Student's t tests were used to test significance. All 60 patients (100%) completed preoperative questionnaires. Fifty-five patients (92%) completed initial follow-up questionnaires within the first 6 months. PROMIS physical function and NDI demonstrated a strong negative correlation at baseline and in initial follow-up (R = -0.69, -0.76). PROMIS and mJOA demonstrated a strong positive correlation at baseline and in initial follow-up (R = 0.61, 0.72). PROMIS physical function has a strong negative correlation with NDI and a strong positive correlation with mJOA at baseline and in the early postoperative course in patients undergoing surgery for cervical myelopathy. Surgeons may factor these outcomes into the delivery and interpretation of patient-reported outcome measures in this population. Use of PROMIS may improve completion of outcome measures in the office and reduce
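
    The Pearson correlation coefficient used in this record can be computed as in the sketch below. The function name and the paired scores are illustrative assumptions, not study data. The sign convention follows the abstract: physical function and a disability index move in opposite directions, so their correlation is negative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Invented paired scores: physical function T-scores and disability percentages.
physical_function = [30, 35, 38, 42, 47, 51]
disability_index = [62, 55, 50, 41, 30, 22]
print(f"r = {pearson_r(physical_function, disability_index):.2f}")
```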

  1. Reliability and Validity of Selected PROMIS Measures in People with Rheumatoid Arthritis.

    Directory of Open Access Journals (Sweden)

    Susan J Bartlett

    To evaluate the reliability and validity of 11 PROMIS measures to assess symptoms and impacts identified as important by people with rheumatoid arthritis (RA). Consecutive patients (N = 177) in an observational study completed PROMIS computer adapted tests (CATs) and a short form (SF) assessing pain, fatigue, physical function, mood, sleep, and participation. We assessed test-retest reliability and internal consistency using correlation and Cronbach's alpha. We assessed convergent validity by examining Pearson correlations between PROMIS measures and existing measures of similar domains and known groups validity by comparing scores across disease activity levels using ANOVA. Participants were mostly female (82%) and white (83%) with mean (SD) age of 56 (13) years; 24% had ≤ high school education, 29% had RA ≤ 5 years with 13% ≤ 2 years, and 22% were disabled. PROMIS Physical Function, Pain Interference and Fatigue instruments correlated moderately to strongly (rho's ≥ 0.68) with corresponding PROs. Test-retest reliability ranged from .725-.883, and Cronbach's alpha from .906-.991. A dose-response relationship with disease activity was evident in Physical Function with similar trends in other scales except Anger. These data provide preliminary evidence of reliability and construct validity of PROMIS CATs to assess RA symptoms and impacts, and feasibility of use in clinical care. PROMIS instruments captured the experiences of RA patients across the broad continuum of RA symptoms and function, especially at low disease activity levels. Future research is needed to evaluate performance in relevant subgroups, assess responsiveness and identify clinically meaningful changes.
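
    Cronbach's alpha, the internal-consistency statistic reported above, can be sketched from its textbook formula: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response matrix are hypothetical; how the authors computed alpha for adaptive tests is not stated in the abstract.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix (list of lists)."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented 1-5 ratings: four respondents, three items.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [5, 5, 4],
]
print(f"alpha = {cronbach_alpha(ratings):.3f}")
```

    Alpha approaches 1 when items rise and fall together across respondents, which is why values above .90 are read as high internal consistency.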

  2. Individuals with knee impairments identify items in need of clarification in the Patient Reported Outcomes Measurement Information System (PROMIS®) pain interference and physical function item banks - a qualitative study.

    Science.gov (United States)

    Lynch, Andrew D; Dodds, Nathan E; Yu, Lan; Pilkonis, Paul A; Irrgang, James J

    2016-05-11

    The content and wording of the Patient Reported Outcome Measurement Information System (PROMIS) Physical Function and Pain Interference item banks have not been qualitatively assessed by individuals with knee joint impairments. The purpose of this investigation was to identify items in the PROMIS Physical Function and Pain Interference Item Banks that are irrelevant, unclear, or otherwise difficult to respond to for individuals with impairment of the knee and to suggest modifications based on cognitive interviews. Twenty-nine individuals with knee joint impairments qualitatively assessed items in the Pain Interference and Physical Function Item Banks in a mixed-methods cognitive interview. Field notes were analyzed to identify themes and frequency counts were calculated to identify items not relevant to individuals with knee joint impairments. Issues with clarity were identified in 23 items in the Physical Function Item Bank, resulting in the creation of 43 new or modified items, typically changing words within the item to be clearer. Interpretation issues included whether or not the knee joint played a significant role in overall health and age/gender differences in items. One quarter of the original items (31 of 124) in the Physical Function Item Bank were identified as irrelevant to the knee joint. All 41 items in the Pain Interference Item Bank were identified as clear, although individuals without significant pain substituted other symptoms which interfered with their life. The Physical Function Item Bank would benefit from additional items that are relevant to individuals with knee joint impairments and, by extension, to other lower extremity impairments. Several issues in clarity were identified that are likely to be present in other patient cohorts as well.

  3. Validation of Patient-Reported Outcomes Measurement Information System (PROMIS) computerized adaptive tests in cervical spine surgery.

    Science.gov (United States)

    Boody, Barrett S; Bhatt, Surabhi; Mazmudar, Aditya S; Hsu, Wellington K; Rothrock, Nan E; Patel, Alpesh A

    2018-03-01

    OBJECTIVE The Patient-Reported Outcomes Measurement Information System (PROMIS), which is funded by the National Institutes of Health, is a set of adaptive, responsive assessment tools that measures patient-reported health status. PROMIS measures have not been validated for surgical patients with cervical spine disorders. The objective of this project is to evaluate the validity (e.g., convergent validity, known-groups validity, responsiveness to change) of PROMIS computer adaptive tests (CATs) for pain behavior, pain interference, and physical function in patients undergoing cervical spine surgery. METHODS The legacy outcome measures Neck Disability Index (NDI) and SF-12 were used as comparisons with PROMIS measures. PROMIS CATs, NDI-10, and SF-12 measures were administered prospectively to 59 consecutive tertiary hospital patients who were treated surgically for degenerative cervical spine disorders. A subscore of NDI-5 was calculated from NDI-10 by eliminating the lifting, headaches, pain intensity, reading, and driving sections and multiplying the final score by 4. Assessments were administered preoperatively (baseline) and postoperatively at 6 weeks and 3 months. Patients presenting for revision surgery, tumor, infection, or trauma were excluded. Participants completed the measures in Assessment Center, an online data collection tool accessed by using a secure login and password on a tablet computer. Subgroup analysis was also performed based on a primary diagnosis of either cervical radiculopathy or cervical myelopathy. RESULTS Convergent validity for PROMIS CATs was supported with multiple statistically significant correlations with the existing legacy measures, NDI and SF-12, at baseline. Furthermore, PROMIS CATs demonstrated known-group validity and identified clinically significant improvements in all measures after surgical intervention. 
In the cervical radiculopathy and myelopathic cohorts, the PROMIS measures demonstrated similar responsiveness to the
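
    The NDI-5 subscore described in the methods (drop the lifting, headaches, pain intensity, reading, and driving sections, then multiply by 4) can be sketched as below. Two things are assumptions rather than statements from the abstract: that each NDI section is scored 0-5, and that the five retained sections are personal care, concentration, work, sleeping, and recreation. The dictionary keys are hypothetical names.

```python
# Sections assumed to remain after the five exclusions named in the abstract.
NDI5_SECTIONS = ("personal_care", "concentration", "work", "sleeping", "recreation")


def ndi5_score(section_scores):
    """Return the NDI-5 subscore on a 0-100 scale from per-section scores.

    section_scores maps section name -> score, each assumed to be 0-5.
    Sections outside NDI5_SECTIONS are ignored.
    """
    total = 0
    for name in NDI5_SECTIONS:
        score = section_scores[name]
        if not 0 <= score <= 5:
            raise ValueError(f"{name} score {score} is outside 0-5")
        total += score
    # Five sections give a 0-25 raw sum; multiplying by 4 rescales it to 0-100.
    return total * 4


# Hypothetical NDI-10 responses for one patient.
patient = {
    "pain_intensity": 3, "personal_care": 1, "lifting": 4, "reading": 2,
    "headaches": 3, "concentration": 2, "work": 3, "driving": 2,
    "sleeping": 2, "recreation": 4,
}
print(ndi5_score(patient))
```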

  4. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

    Science.gov (United States)

    Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

    2016-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity

  5. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    Science.gov (United States)

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  6. External Data and Attribute Hyperlink Programs for Promis*e(Registered Trademark)

    Science.gov (United States)

    Derengowski, Rich; Gruel, Andrew

    2001-01-01

    External Data and Attribute Hyperlink are computer programs that can be added to Promis*e(trademark) which is a commercial software system that automates routine tasks in the design (including drawing schematic diagrams) of electrical control systems. The programs were developed under the Stennis Space Center's (SSC) Dual Use Technology Development Program to provide capabilities for SSC's BMCS configuration management system which uses Promis*e(trademark). The External Data program enables the storage and management of information in an external database linked to a drawing. Changes can be made either in the database or on the drawing. Information that originates outside Promis*e(trademark) can be stored in custom fields that can be added to the database. Although this information is not available in Promis*e(trademark) printed drawings, it can be associated with symbols in the drawings, and can be retrieved through the drawings when the software is running. The Attribute Hyperlink program enables the addition of hyperlink information as attributes of symbols. This program enables the formation of a direct hyperlink between a schematic diagram and an Internet site or a file on a compact disk, on the user's hard drive, or on another computer on a network to which the user's computer is connected. The user can then obtain information directly related to the part (e.g., maintenance, or troubleshooting information) associated with the hyperlink.

  7. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    Science.gov (United States)

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N = 1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, fewer than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes.

  8. PROMIS Pain Interference and Physical Function Scores Correlate With the Foot and Ankle Ability Measure (FAAM) in Patients With Hallux Valgus.

    Science.gov (United States)

    Nixon, Devon C; McCormick, Jeremy J; Johnson, Jeffrey E; Klein, Sandra E

    2017-11-01

    Traditional patient-reported outcome instruments like the Foot and Ankle Ability Measure (FAAM) quantify patient disability but often are limited by responder burden and incomplete questionnaires. The Patient-Reported Outcome Measurement Information System (PROMIS) overcomes such obstacles through computer-adaptive technology and can capture outcome data from various domains including physical and psychosocial function. Prior work has compared the FAAM with PROMIS physical function; however, there is little evidence comparing the association between foot and ankle-specific tools like the FAAM with more general outcomes measures of PROMIS pain interference and depression in foot and ankle conditions. (1) We asked whether there was a relationship between FAAM Activities of Daily Living (ADL) scores with PROMIS physical function, pain interference, and depression in patients with hallux valgus. (2) Additionally, we asked if we could identify specific factors that are associated with variance in FAAM and PROMIS physical function scores in patients with hallux valgus. Eighty-five new patients with either a primary or secondary diagnosis of hallux valgus based on clinic billing codes from July 2015 to February 2016 were retrospectively identified. Patients completed FAAM ADL paper-based surveys and electronic PROMIS questionnaires for physical function, pain interference, and depression from new patient visits at a single time. Spearman rho correlations were performed between FAAM ADL and PROMIS scores. Analyses then were used to identify differences in FAAM ADL and PROMIS physical function measures based on demographic variables. Stepwise linear regressions then determined which demographic and/or outcome variable(s) accounted for the variance in FAAM ADL and PROMIS physical function scores. FAAM scores correlated strongly with PROMIS physical function (r = 0.70, p hallux valgus. 
PROMIS tools allow for more-efficient data collection across multiple domains and, moving

  9. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

    2017-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and

  10. Validation of Patient Reported Outcomes Measurement Information System (PROMIS) Computer Adaptive Tests (CATs) in the Surgical Treatment of Lumbar Spinal Stenosis.

    Science.gov (United States)

    Patel, Alpesh A; Dodwad, Shah-Nawaz M; Boody, Barrett S; Bhatt, Surabhi; Savage, Jason W; Hsu, Wellington K; Rothrock, Nan E

    2018-03-19

    Prospective, cohort study. Demonstrate validity of PROMIS physical function, pain interference, and pain behavior computer adaptive tests (CATs) in surgically treated lumbar stenosis patients. There has been increasing attention given to patient reported outcomes associated with spinal interventions. Historical patient outcome measures have inadequate validation, demonstrate floor/ceiling effects, and are infrequently used due to time constraints. PROMIS is an adaptive, responsive NIH assessment tool that measures patient-reported health status. 98 consecutive patients were surgically treated for lumbar spinal stenosis and were assessed using PROMIS CATs, ODI, ZCQ and SF-12. Prior lumbar surgery, history of scoliosis, cancer, trauma, or infection were excluded. Completion time, preoperative assessment, 6 week and 3 month postoperative scores were collected. At baseline, 49%, 79%, and 81% of patients had PROMIS PB, PI, and PF scores greater than 1 SD worse than the general population. 50.6% were categorized as severely disabled, crippled, or bed bound by ODI. PROMIS CATs demonstrated convergent validity through moderate to high correlations with legacy measures (r = 0.35-0.73). PROMIS CATs demonstrated known groups validity when stratified by ODI levels of disability. ODI improvements of at least 10 points on average had changes in PROMIS scores in the expected direction (PI = -12.98, PB = -9.74, PF = 7.53). PROMIS CATs demonstrated comparable responsiveness to change when evaluated against legacy measures. PROMIS PB and PI decreased 6.66 and 9.62 and PROMIS PF increased 6.8 points between baseline and 3-months post-op (p validity, known groups validity, and responsiveness for surgically treated patients with lumbar stenosis to detect change over time and are more efficient than legacy instruments. 2.

  11. PROMIS Sleep Disturbance and Sleep-Related Impairment in Adolescents: Examining Psychometrics Using Self-Report and Actigraphy.

    Science.gov (United States)

    Hanish, Alyson E; Lin-Dyken, Deborah C; Han, Joan C

    The National Institutes of Health Patient-Reported Outcomes Measurement Information System (PROMIS) has self-reported health measures available for both pediatric and adult populations, but no pediatric measures are available currently in the sleep domains. The purpose of this observational study was to perform preliminary validation studies on age-appropriate, self-reported sleep measures in healthy adolescents. This study examined 25 healthy adolescents' self-reported daytime sleepiness, sleep disturbance, sleep-related impairment, and sleep patterns. Healthy adolescents completed a physical exam at the National Institutes of Health Clinical Center (Bethesda, MD), had no chronic medical conditions, and were not taking any chronic medications. The Cleveland Adolescent Sleepiness Questionnaire (CASQ), PROMIS Sleep Disturbance (v. 1.0; 8a), and PROMIS Sleep-Related Impairment (v. 1.0; 8b) questionnaires were completed, and sleep patterns were assessed using actigraphy. Total scores on the three sleep questionnaires were correlated (all Spearman's r > .70, p psychometrically sound sleep questionnaires. Findings suggest the potential research and clinical utility of adult versions of PROMIS sleep measures in adolescents. Future studies should include larger, more diverse samples and explore additional psychometric properties of PROMIS sleep measures to provide age-appropriate, validated, and reliable measures of sleep in adolescents.

  12. Comparison of pediatric self reports and parent proxy reports utilizing PROMIS: Results from a chiropractic practice-based research network.

    Science.gov (United States)

    Alcantara, Joel; Ohm, Jeanne; Alcantara, Junjoe

    2017-11-01

    To measure the cross-informant variant of pediatric quality of life (QoL) based on self-reports and parent proxy measures. A secondary analysis of baseline data obtained from two independent studies measuring the QoL based on the pediatric PROMIS-25 self-report and the PROMIS parent-proxy item banks. A scoring manual associated raw scores to a T score metric (mean = 50; SD = 10). Reliability of QoL ratings utilized the ICC while comparison of mean T Scores utilized the unpaired t-test. A total of 289 parent-child dyads comprised our study responders. Average age for parents and children was 41.27 years and 12.52 years, respectively. The mean T scores (child self-report: parent proxy) for the QoL domains were: mobility (50.82:52.58), anxiety (46.73:44.21), depression (45.18:43.60), fatigue (45.59:43.92), peer-relationships (52.15:52.88) and pain interference (47.47:44.80). Parents tend to over-estimate their child's QoL based on measures of anxiety, depression, fatigue, peer-relationships and pain interference. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
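
    The T score metric mentioned in the abstract above (mean = 50; SD = 10) is a linear rescaling of a standardized score. The following is a minimal sketch of that rescaling only, assuming a standardized score z is already available. The function names are hypothetical, and the raw-score lookup tables in the actual scoring manual are not reproduced here.

```python
def to_t_score(z: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Rescale a standardized score z to the T score metric."""
    return mean + sd * z


def from_t_score(t: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Invert the rescaling: express a T score in SD units from the mean."""
    return (t - mean) / sd


# A T score of 40 lies one standard deviation below the reference mean.
one_sd_below = to_t_score(-1.0)
```

    Read this way, the reported parent-proxy mobility mean of 52.58 sits about a quarter of a standard deviation above the reference mean.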

  13. Validation of the PROMIS® measures of self-efficacy for managing chronic conditions.

    Science.gov (United States)

    Gruber-Baldini, Ann L; Velozo, Craig; Romero, Sergio; Shulman, Lisa M

    2017-07-01

    The Patient-Reported Outcomes Measurement Information System® (PROMIS®) was designed to develop, validate, and standardize item banks to measure key domains of physical, mental, and social health in chronic conditions. This paper reports the calibration and validation testing of the PROMIS Self-Efficacy for Managing Chronic Conditions measures. PROMIS Self-Efficacy for Managing Chronic Conditions item banks comprise five domains, Self-Efficacy for Managing: Daily Activities, Symptoms, Medications and Treatments, Emotions, and Social Interactions. Banks were calibrated in 1087 subjects from two data sources: 837 patients with chronic neurologic conditions (epilepsy, multiple sclerosis, neuropathy, Parkinson disease, and stroke) and 250 subjects from an online Internet sample of adults with general chronic conditions. Scores were compared with one legacy scale: Self-Efficacy for Managing Chronic Disease 6-Item scale (SEMCD6) and five PROMIS short forms: Global Health (Physical and Mental), Physical Function, Fatigue, Depression, and Anxiety. The sample was 57% female, mean age = 53.8 (SD = 14.7), 76% white, 21% African American, 6% Hispanic, and 76% with greater than high school education. Full-item banks were created for each domain. All measures had good internal consistency and correlated well with SEMCD6 (r = 0.56-0.75). Significant correlations were seen between the Self-Efficacy measures and other PROMIS short forms (r > 0.38). The newly developed PROMIS Self-Efficacy for Managing Chronic Conditions measures include five domains of self-efficacy that were calibrated across diverse chronic conditions and show good internal consistency and cross-sectional validity.

  14. An Item Bank for Abuse of Prescription Pain Medication from the Patient-Reported Outcomes Measurement Information System (PROMIS®).

    Science.gov (United States)

    Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Hilton, Thomas F; Daley, Dennis C; Patkar, Ashwin A; McCarty, Dennis

    2017-08-01

    There is a need to monitor patients receiving prescription opioids to detect possible signs of abuse. To address this need, we developed and calibrated an item bank for severity of abuse of prescription pain medication as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 5,310 items relevant to substance use and abuse, including abuse of prescription pain medication, from over 80 unique instruments. After qualitative item analysis (i.e., focus groups, cognitive interviewing, expert review, and item revision), 25 items for abuse of prescribed pain medication were included in field testing. Items were written in a first-person, past-tense format, with a three-month time frame and five response options reflecting frequency or severity. The calibration sample included 448 respondents, 367 from the general population (ascertained through an internet panel) and 81 from community treatment programs participating in the National Drug Abuse Treatment Clinical Trials Network. A final bank of 22 items was calibrated using the two-parameter graded response model from item response theory. A seven-item static short form was also developed. The test information curve showed that the PROMIS® item bank for abuse of prescription pain medication provided substantial information in a broad range of severity. The initial psychometric characteristics of the item bank support its use as a computerized adaptive test or short form, with either version providing a brief, precise, and efficient measure relevant to both clinical and community samples. © 2016 American Academy of Pain Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
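
    The two-parameter graded response model named in the abstract above can be illustrated with a short sketch. It assumes the usual logistic form, in which each item has one discrimination (slope) parameter and one threshold per boundary between adjacent response options. The function name and all parameter values below are hypothetical and are not taken from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Response-category probabilities under the graded response model.

    theta: latent severity of the respondent.
    a: item discrimination (slope).
    thresholds: increasing values b_1 < ... < b_{K-1} for K ordered options.
    """
    # Cumulative curves P(X >= k), bracketed by 1 (lowest) and 0 (past highest).
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the gap between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Five response options need four thresholds (illustrative values only).
probs = grm_category_probs(theta=0.5, a=2.0, thresholds=[-1.0, 0.0, 1.0, 2.0])
```

    Because the thresholds are ordered, the cumulative curves never cross, so every category probability is positive and the five values sum to one.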

  15. Development and psychometric characteristics of the SCI-QOL Ability to Participate and Satisfaction with Social Roles and Activities item banks and short forms.

    Science.gov (United States)

    Heinemann, Allen W; Kisala, Pamela A; Hahn, Elizabeth A; Tulsky, David S

    2015-05-01

    To develop a spinal cord injury (SCI)-focused version of PROMIS and Neuro-QOL social domain item banks; evaluate the psychometric properties of items developed for adults with SCI; and report information to facilitate clinical and research use. We used a mixed-methods design to develop and evaluate Ability to Participate in Social Roles and Activities and Satisfaction with Social Roles and Activities items. Focus groups helped define the constructs; cognitive interviews helped revise items; and confirmatory factor analysis and item response theory methods helped calibrate item banks and evaluate differential item functioning related to demographic and injury characteristics. Five SCI Model System sites and one Veterans Administration medical center. The calibration sample consisted of 641 individuals; a reliability sample consisted of 245 individuals residing in the community. A subset of 27 Ability to Participate and 35 Satisfaction items demonstrated good measurement properties and negligible differential item functioning related to demographic and injury characteristics. The SCI-specific measures correlate strongly with the PROMIS and Neuro-QOL versions. Ten item short forms correlate >0.96 with the full banks. Variable-length CATs with a minimum of 4 items, variable-length CATs with a minimum of 8 items, fixed-length CATs of 10 items, and the 10-item short forms demonstrate construct coverage and measurement error that is comparable to the full item bank. The Ability to Participate and Satisfaction with Social Roles and Activities CATs and short forms demonstrate excellent psychometric properties and are suitable for clinical and research applications.

  16. Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity

    DEFF Research Database (Denmark)

    Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

    2014-01-01

    OBJECTIVES: To test the impact of the method of administration (MOA) on score level, reliability, and validity of scales developed in the Patient Reported Outcomes Measurement Information System (PROMIS). STUDY DESIGN AND SETTING: Two nonoverlapping parallel forms each containing eight items from......, no significant mode differences were found and all confidence intervals were within the prespecified minimal important difference of 0.2 standard deviation. Parallel-forms reliabilities were very high (ICC = 0.85-0.93). Only one across-mode ICC was significantly lower than the same-mode ICC. Tests of validity...... questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) and a second form by PC, in the same administration. Method equivalence was evaluated through analyses of difference scores, intraclass correlations (ICCs), and convergent/discriminant validity. RESULTS: In difference score analyses...

  17. Should Global Items on Student Rating Scales Be Used for Summative Decisions?

    Science.gov (United States)

    Berk, Ronald A.

    2013-01-01

    One of the simplest indicators of teaching or course effectiveness is student ratings on one or more global items from the entire rating scale. That approach seems intuitively sound and easy to use. Global items have even been recommended by a few researchers to get a quick-read, at-a-glance summary for summative decisions about faculty. The…

  18. Impact of the Patient-Reported Outcomes Management Information System (PROMIS) upon the design and operation of multi-center clinical trials: a qualitative research study.

    Science.gov (United States)

    Eisenstein, Eric L; Diener, Lawrence W; Nahm, Meredith; Weinfurt, Kevin P

    2011-12-01

    New technologies may be required to integrate the National Institutes of Health's Patient Reported Outcome Management Information System (PROMIS) into multi-center clinical trials. To better understand this need, we identified likely PROMIS reporting formats, developed a multi-center clinical trial process model, and identified gaps between current capabilities and those necessary for PROMIS. These results were evaluated by key trial constituencies. Issues reported by principal investigators fell into two categories: acceptance by key regulators and the scientific community, and usability for researchers and clinicians. Issues reported by the coordinating center, participating sites, and study subjects were those faced when integrating new technologies into existing clinical trial systems. We then defined elements of a PROMIS Tool Kit required for integrating PROMIS into a multi-center clinical trial environment. The requirements identified in this study serve as a framework for future investigators in the design, development, implementation, and operation of PROMIS Tool Kit technologies.

  19. ProMIS Augmented Reality Training of Laparoscopic Procedures Face Validity

    NARCIS (Netherlands)

    Botden, Sanne M. B. I.; Buzink, Sonja N.; Schijven, Marlies P.; Jakimowicz, Jack J.

    2008-01-01

    Background: Conventional video trainers lack the ability to assess the trainee objectively, but offer modalities that are often missing in virtual reality simulation, such as realistic haptic feedback. The ProMIS augmented reality laparoscopic simulator retains the benefit of a traditional box

  20. ProMIS augmented reality training of laparoscopic procedures face validity

    NARCIS (Netherlands)

    Botden, Sanne M. B. I.; Buzink, Sonja N.; Schijven, Marlies P.; Jakimowicz, Jack J.

    2008-01-01

    BACKGROUND: Conventional video trainers lack the ability to assess the trainee objectively, but offer modalities that are often missing in virtual reality simulation, such as realistic haptic feedback. The ProMIS augmented reality laparoscopic simulator retains the benefit of a traditional box

  1. Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

    Science.gov (United States)

    Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

    2016-09-01

    The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35 item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo-R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
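
    Internal consistency as reported above is commonly summarized with Cronbach's alpha. The sketch below computes it from a respondents-by-items score table using the standard formula. The function name and the toy data are hypothetical and unrelated to the study's item pool.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent."""
    k = len(responses[0])                       # number of items
    item_columns = list(zip(*responses))        # transpose rows into item columns
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: five respondents answering three items on a 1-5 scale.
toy_scores = [[1, 2, 1], [2, 3, 2], [3, 3, 3], [4, 5, 4], [5, 5, 5]]
alpha = cronbach_alpha(toy_scores)
```

    Alpha approaches 1 when the items rise and fall together across respondents, which is why long, homogeneous item pools tend to produce very high values.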

  2. Impact of the Patient-Reported Outcomes Management Information System (PROMIS) upon the Design and Operation of Multi-center Clinical Trials: a Qualitative Research Study

    OpenAIRE

    Eisenstein, Eric L.; Diener, Lawrence W.; Nahm, Meredith; Weinfurt, Kevin P.

    2010-01-01

    New technologies may be required to integrate the National Institutes of Health’s Patient Reported Outcome Management Information System (PROMIS) into multi-center clinical trials. To better understand this need, we identified likely PROMIS reporting formats, developed a multi-center clinical trial process model, and identified gaps between current capabilities and those necessary for PROMIS. These results were evaluated by key trial constituencies. Issues reported by principal investigators ...

  3. Calibration of communication skills items in OSCE checklists according to the MAAS-Global.

    Science.gov (United States)

    Setyonugroho, Winny; Kropmans, Thomas; Kennedy, Kieran M; Stewart, Brian; van Dalen, Jan

    2016-01-01

    Communication skills (CS) are commonly assessed using 'communication items' in Objective Structured Clinical Examination (OSCE) station checklists. Our aim is to calibrate the communication component of OSCE station checklists according to the MAAS-Global which is a valid and reliable standard to assess CS in undergraduate medical education. Three raters independently compared 280 checklists from 4 disciplines contributing to the undergraduate year 4 OSCE against the 17 items of the MAAS-Global standard. G-theory was used to analyze the reliability of this calibration procedure. G-Kappa was 0.8. For two raters G-Kappa is 0.72 and it fell to 0.57 for one rater. 46% of the checklist items corresponded to section three of the MAAS-Global (i.e. medical content of the consultation), whilst 12% corresponded to section two (i.e. general CS), and 8.2% to section one (i.e. CS for each separate phase of the consultation). 34% of the items were not considered to be CS. A G-Kappa of 0.8 confirms a reliable and valid procedure for calibrating OSCE CS checklist items using the MAAS-Global. We strongly suggest that such a procedure is more widely employed to arrive at a stable (valid and reliable) judgment of the communication component in existing checklists for medical students' communication behaviours. It is possible to measure the 'true' caliber of CS in OSCE stations. Students' results are thereby comparable between and across stations, students and institutions. A reliable calibration procedure requires only two raters. Copyright © 2015. Published by Elsevier Ireland Ltd.
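
    The drop in G-Kappa as raters are removed follows a familiar pattern. As a rough illustration only, the classical Spearman-Brown formula (a classical test theory analogue, not the G-theory analysis the authors used) projects reliability for several raters from the single-rater value. The function name is hypothetical.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Projected reliability of the average of n_raters parallel ratings."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Start from the single-rater value reported in the abstract (0.57).
projected = {n: round(spearman_brown(0.57, n), 2) for n in (1, 2, 3)}
```

    Starting from 0.57, the projection gives about 0.73 for two raters and about 0.80 for three, close to the reported 0.72 and 0.8.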

  4. Performance of PROMIS Physical Function Compared with KOOS, SF-36, Eq5d And Marx Activity Scale in Patients Who Undergo ACL Reconstruction

    Science.gov (United States)

    Scott, Elizabeth; Glass, Natalie; Wolf, Brian R.; Hettrich, Carolyn M.; Bollier, Matthew

    2018-01-01

    Objectives: Anterior cruciate ligament reconstruction is a commonly performed orthopaedic procedure. PROMIS (Patient-Reported Outcome Measurement Information System) was developed by the National Institutes of Health in an effort to advance patient-reported outcome (PRO) instruments by developing question banks for major health domains. Our goal was to compare the responsiveness and construct validity of the PROMIS physical function (PF) computer adaptive test (CAT) with current PRO instruments utilized in patients who undergo anterior cruciate ligament reconstruction. Methods: A total of 174 patients ages 14-53 scheduled to undergo anterior cruciate ligament reconstruction were asked to complete PROMIS PF-CAT, Short Form-36 Health Survey (SF36-PF and -GH), Marx activity rating scale (Marx), Knee Injury and Osteoarthritis Score (KOOS-ADL, -Sport, -QOL), and the EuroQol five dimensions questionnaire (EQ5D) at their preoperative visit. These surveys were repeated at six weeks and six months after surgery. Correlations between PRO instruments were defined as excellent (>0.7), excellent-good (0.61-0.7), good (0.4-0.6), and poor (0.2-0.3) using Spearman Correlation Coefficients. The effect size (Cohen d) and standardized response mean (SRM) were used to describe the responsiveness of each PRO at the 6 week and 6 month follow-up visits and were defined as small (0.2), medium (0.5) and large (0.8). Ceiling and floor effects were defined as present if ≥15% of participants scored the highest or lowest score on a PRO, respectively. Subgroup analyses were performed comparing change in PRO scores at follow-up between participants with and without additional arthroscopic procedures (meniscal debridement and/or repair, microfracture, or OATS vs ACL reconstruction only) using linear mixed models. Results: There were excellent and excellent-good correlations between the PROMIS PF-CAT and physical function PROs including the SF36-PF (r=0.75-0.80, p0.05) to poor correlation with
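
    The responsiveness statistics and the ceiling/floor rule described in the Methods above can be sketched as follows. This assumes one common convention, in which the effect size divides the mean change by the baseline SD and the SRM divides it by the SD of the change scores. The abstract does not state which variant the authors used, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def cohen_d(baseline, follow_up):
    """Effect size: mean change divided by the SD of the baseline scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def floor_and_ceiling(scores, min_score, max_score, threshold=0.15):
    """Return (floor_present, ceiling_present) under the >= 15% rule."""
    n = len(scores)
    at_floor = sum(s == min_score for s in scores) / n
    at_ceiling = sum(s == max_score for s in scores) / n
    return at_floor >= threshold, at_ceiling >= threshold


# Illustrative paired scores for five hypothetical patients.
pre = [40, 45, 50, 55, 60]
post = [46, 50, 57, 60, 67]
d = cohen_d(pre, post)
srm = standardized_response_mean(pre, post)
```

    The two statistics can differ widely on the same data: a uniform improvement across patients yields a small SD of change and therefore a large SRM, even when the effect size relative to baseline spread is moderate.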

  5. Correlation of PROMIS Physical Function and Pain CAT Instruments With Oswestry Disability Index and Neck Disability Index in Spine Patients.

    Science.gov (United States)

    Papuga, Mark O; Mesfin, Addisu; Molinari, Robert; Rubery, Paul T

    2016-07-15

    A prospective and retrospective cross-sectional cohort analysis. The aim of this study was to show that Patient-Reported Outcomes Measurement Information System (PROMIS) computer adaptive testing (CAT) assessments for physical function and pain interference can be efficiently collected in a standard office visit and to evaluate these scores with scores from previously validated Oswestry Disability Index (ODI) and Neck Disability Index (NDI) providing evidence of convergent validity for use in patients with spine pathology. Spinal surgery outcomes are highly variable, and substantial debate continues regarding the role and value of spine surgery. The routine collection of patient-based outcomes instruments in spine surgery patients may inform this debate. Traditionally, the inefficiency associated with collecting standard validated instruments has been a barrier to routine use in outpatient clinics. We utilized several CAT instruments available through PROMIS and correlated these with the results obtained using "gold standard" legacy outcomes measurement instruments. All measurements were collected at a routine clinical visit. The ODI and the NDI assessments were used as "gold standard" comparisons for patient-reported outcomes. PROMIS CAT instruments required 4.5 ± 1.8 questions and took 35 ± 16 seconds to complete, compared with ODI/NDI requiring 10 questions and taking 188 ± 85 seconds when administered electronically. Linear regression analysis of retrospective scores involving a primary back complaint revealed moderate to strong correlations between ODI and PROMIS physical function with r values ranging from 0.5846 to 0.8907 depending on the specific assessment and patient subsets examined. Routine collection of physical function outcome measures in clinical practice offers the ability to inform and improve patient care. We have shown that several PROMIS CAT instruments can be efficiently administered during routine clinical visits. The

  6. The case for an international patient-reported outcomes measurement information system (PROMIS®) initiative.

    Science.gov (United States)

    Alonso, Jordi; Bartlett, Susan J; Rose, Matthias; Aaronson, Neil K; Chaplin, John E; Efficace, Fabio; Leplège, Alain; Lu, Aiping; Tulsky, David S; Raat, Hein; Ravens-Sieberer, Ulrike; Revicki, Dennis; Terwee, Caroline B; Valderas, Jose M; Cella, David; Forrest, Christopher B

    2013-12-20

    Patient-reported outcomes (PROs) play an increasingly important role in clinical practice and research. Modern psychometric methods such as item response theory (IRT) enable the creation of item banks that support fixed-length forms as well as computerized adaptive testing (CAT), often resulting in improved measurement precision and responsiveness. Here we describe and discuss the case for developing an international core set of PROs building from the US PROMIS® network. PROMIS is a U.S.-based cooperative group of research sites and centers of excellence convened to develop and standardize PRO measures across studies and settings. If extended to a global collaboration, PROMIS has the potential to transform PRO measurement by creating a shared, unifying terminology and metric for reporting of common symptoms and functional life domains. Extending a common set of standardized PRO measures to the international community offers great potential for improving patient-centered research, clinical trials reporting, population monitoring, and health care worldwide. Benefits of such standardization include the possibility of: international syntheses (such as meta-analyses) of research findings; international population monitoring and policy development; health services administrators and planners access to relevant information on the populations they serve; better assessment and monitoring of patients by providers; and improved shared decision making. The goal of the current PROMIS International initiative is to ensure that item banks are translated and culturally adapted for use in adults and children in as many countries as possible. The process includes 3 key steps: translation/cultural adaptation, calibration, and validation. A universal translation, an approach focusing on commonalities, rather than differences across versions developed in regions or countries speaking the same language, is proposed to ensure conceptual equivalence for all items. International item

  7. Measuring pain phenomena after spinal cord injury: Development and psychometric properties of the SCI-QOL Pain Interference and Pain Behavior assessment tools.

    Science.gov (United States)

    Cohen, Matthew L; Kisala, Pamela A; Dyson-Hudson, Trevor A; Tulsky, David S

    2018-05-01

    To develop modern patient-reported outcome measures that assess pain interference and pain behavior after spinal cord injury (SCI). Grounded-theory based qualitative item development; large-scale item calibration field-testing; confirmatory factor analyses; graded response model item response theory analyses; statistical linking techniques to transform scores to the Patient Reported Outcome Measurement Information System (PROMIS) metric. Five SCI Model Systems centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. N/A. Spinal Cord Injury - Quality of Life (SCI-QOL) Pain Interference item bank, SCI-QOL Pain Interference short form, and SCI-QOL Pain Behavior scale. Seven hundred fifty-seven individuals with traumatic SCI completed 58 items addressing various aspects of pain. Items were then separated by whether they assessed pain interference or pain behavior, and poorly functioning items were removed. Confirmatory factor analyses confirmed that each set of items was unidimensional, and item response theory analyses were used to estimate slopes and thresholds for the items. Ultimately, 7 items (4 from PROMIS) comprised the Pain Behavior scale and 25 items (18 from PROMIS) comprised the Pain Interference item bank. Ten of these 25 items were selected to form the Pain Interference short form. The SCI-QOL Pain Interference item bank and the SCI-QOL Pain Behavior scale demonstrated robust psychometric properties. The Pain Interference item bank is available as a computer adaptive test or short form for research and clinical applications, and scores are transformed to the PROMIS metric.

  8. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  9. Use of Adult Patient Focus Groups to Develop the Initial Item Bank for a Cochlear Implant Quality-of-Life Instrument.

    Science.gov (United States)

    McRackan, Theodore R; Velozo, Craig A; Holcomb, Meredith A; Camposeo, Elizabeth L; Hatch, Jonathan L; Meyer, Ted A; Lambert, Paul R; Melvin, Cathy L; Dubno, Judy R

    2017-10-01

    No instrument exists to assess quality of life (QOL) in adult cochlear implant (CI) users that has been developed and validated using accepted scientific standards. To develop a CI-specific QOL instrument for adults in accordance with the Patient Reported Outcomes Measurement Information System (PROMIS) guidelines. As required in the PROMIS guidelines, patient focus groups participated in creation of the initial item bank. Twenty-three adult CI users were divided into 1 of 3 focus groups stratified by word recognition ability. Three moderator-led focus groups were conducted based on grounded theory on December 3, 2016. Two reviewers independently analyzed focus group recordings and transcripts, with a third reviewer available to resolve discrepancies. All data were reviewed and reported according to the Consolidated Criteria for Reporting Qualitative Research. The setting was a tertiary referral center. Coded focus group data. The 23 focus group participants (10 [43%] female; mean [range] age, 68.1 [46.2-84.2] years) represented a wide range of income levels, education levels, listening modalities, CI device manufacturers, duration of CI use, and age at implantation. Data saturation was determined to be reached before the conclusion of each of the focus groups. After analysis of the transcripts, the central themes identified were communication, emotion, environmental sounds, independence and work function, listening effort, social isolation and ability to socialize, and sound clarity. Cognitive interviews were carried out on 20 adult CI patients who did not participate in the focus groups to ensure item clarity. Based on these results, the initial QOL item bank and prototype were developed. Patient focus groups drawn from the target population are the preferred method of identifying content areas and domains for developing the item bank for a CI-specific QOL instrument. Compared with previously used methods, the use of patient-centered item development for a CI

  10. Domains of health-related quality of life important and relevant to multiethnic English-speaking Asian systemic lupus erythematosus patients: a focus group study.

    Science.gov (United States)

    Ow, Yen Ling Mandy; Thumboo, Julian; Cella, David; Cheung, Yin Bun; Yong Fong, Kok; Wee, Hwee Lin

    2011-06-01

    To identify health-related quality of life (HRQOL) domains of importance to multiethnic Asian systemic lupus erythematosus (SLE) patients, to identify content gaps in existing SLE-specific HRQOL measures, and to determine whether the Patient-Reported Outcomes Measurement Information System (PROMIS) item banks could serve as a core set of questions for HRQOL assessment among SLE patients. English-speaking patients with physician-diagnosed SLE from a specialist clinic in a tertiary care hospital in Singapore and a patient support group were recruited. Thematic analysis was performed to distill themes from transcripts through open coding by 2 independent coders and axial coding for refinement of categories. Items from 3 existing SLE-specific measures and PROMIS Version 1.0 Item Banks were compared with identified subthemes. Twenty-seven female and 2 male participants (21 Chinese, 4 Malay, 3 Indian, 1 other) ages 23-62 years participated in 6 focus groups and 2 individual interviews, respectively. Twenty-one domains and 92 subthemes were identified. Domains of family, relationships, stigma and discrimination, and freedom were unaddressed by existing SLE-specific measures. Forty subthemes from 14 domains were addressed by the PROMIS Version 1.0 Item Banks (Physical Function, Pain, Fatigue, Sleep Disturbance, Sleep-Related Impairment, Anger, Anxiety, and Depression banks). Family and stigma and discrimination (identified as content gaps) may be accentuated in the Asian sociocultural context. PROMIS item banks have tremendous potential to serve as a core set of items for HRQOL assessment in SLE patients. Additional items may be written to fill the gaps in existing PROMIS item banks. Copyright © 2011 by the American College of Rheumatology.

  11. The use of focus groups in the development of the PROMIS pediatrics item bank.

    Science.gov (United States)

    Walsh, Tasanee R; Irwin, Debra E; Meier, Andrea; Varni, James W; DeWalt, Darren A

    2008-06-01

    To understand differences in perceptions of patient-reported outcome domains between children with asthma and children from the general population. We used this information in the development of patient-reported outcome items for the Patient-Reported Outcomes Measurement Information System Pediatrics project. We conducted focus groups composed of ethnically, racially, and geographically diverse youth (8-12, 13-17 years) from the general population and youth with asthma. We performed content analysis to identify important themes. We identified five unique and different challenges that may confront youth with asthma as compared to general population youth: (1) They experience more difficulties when participating in physical activities; (2) They may experience anxiety about having an asthma attack at anytime and anywhere; (3) They may experience sleep disturbances and fatigue secondary to their asthma symptoms; (4) Their health condition has a greater effect on their emotional well-being and interpersonal relationships; and (5) Youth with asthma report that asthma often leaves them with insufficient energy to complete their school activities, especially physical activities. The results confirm unique experiences for children with asthma across a broad range of health domains and enhance the breadth of all domains when creating an item bank.

  12. Developing core elements and checklist items for global hospital antimicrobial stewardship programmes: a consensus approach.

    Science.gov (United States)

    Pulcini, C; Binda, F; Lamkang, A S; Trett, A; Charani, E; Goff, D A; Harbarth, S; Hinrichsen, S L; Levy-Hara, G; Mendelson, M; Nathwani, D; Gunturu, R; Singh, S; Srinivasan, A; Thamlikitkul, V; Thursky, K; Vlieghe, E; Wertheim, H; Zeng, M; Gandra, S; Laxminarayan, R

    2018-04-03

    With increasing global interest in hospital antimicrobial stewardship (AMS) programmes, there is a strong demand for core elements of AMS to be clearly defined on the basis of principles of effectiveness and affordability. To date, efforts to identify such core elements have been limited to Europe, Australia, and North America. The aim of this study was to develop a set of core elements and their related checklist items for AMS programmes that should be present in all hospitals worldwide, regardless of resource availability. A literature review was performed by searching Medline and relevant websites to retrieve a list of core elements and items that could have global relevance. These core elements and items were evaluated by an international group of AMS experts using a structured modified Delphi consensus procedure, using two-phased online in-depth questionnaires. The literature review identified seven core elements and their related 29 checklist items from 48 references. Fifteen experts from 13 countries in six continents participated in the consensus procedure. Ultimately, all seven core elements were retained, as well as 28 of the initial checklist items plus one that was newly suggested, all with ≥80% agreement; 20 elements and items were rephrased. This consensus on core elements for hospital AMS programmes is relevant to both high- and low-to-middle-income countries and could facilitate the development of national AMS stewardship guidelines and adoption by healthcare settings worldwide. Copyright © 2018 European Society of Clinical Microbiology and Infectious Diseases. All rights reserved.

  13. The impact of integrated prevention and treatment on child malnutrition and health: the PROMIS project, a randomized control trial in Burkina Faso and Mali.

    Science.gov (United States)

    Huybregts, Lieven; Becquey, Elodie; Zongrone, Amanda; Le Port, Agnes; Khassanova, Regina; Coulibaly, Lazare; Leroy, Jef L; Rawat, Rahul; Ruel, Marie T

    2017-03-09

    specific program impact pathways (PIPs). Cost-effectiveness analysis will assess the economic feasibility of the intervention. The PROMIS study assesses the effectiveness of an innovative model to integrate prevention and treatment interventions for greater and more sustainable impacts on the incidence and prevalence of AM using a rigorous, theory-based randomized control trial approach. This type of programmatic research is urgently needed to help program implementers, policy makers, and investors prioritize, select and scale-up the best program models to prevent and treat AM and achieve the World Health Assembly goal of reducing childhood wasting to less than 5% globally by the year 2025. Clinicaltrials.gov NCT02323815 (registered on December 18, 2014) and NCT02245152 (registered on September 16, 2014).

  14. The impact of integrated prevention and treatment on child malnutrition and health: the PROMIS project, a randomized control trial in Burkina Faso and Mali

    Directory of Open Access Journals (Sweden)

    Lieven Huybregts

    2017-03-01

    implementation of the intervention guided by country-specific program impact pathways (PIPs). Cost-effectiveness analysis will assess the economic feasibility of the intervention. Discussion: The PROMIS study assesses the effectiveness of an innovative model to integrate prevention and treatment interventions for greater and more sustainable impacts on the incidence and prevalence of AM using a rigorous, theory-based randomized control trial approach. This type of programmatic research is urgently needed to help program implementers, policy makers, and investors prioritize, select and scale-up the best program models to prevent and treat AM and achieve the World Health Assembly goal of reducing childhood wasting to less than 5% globally by the year 2025. Trial registration: Clinicaltrials.gov NCT02323815 (registered on December 18, 2014) and NCT02245152 (registered on September 16, 2014).

  15. Psychometric properties of the Global Operative Assessment of Laparoscopic Skills (GOALS) using item response theory.

    Science.gov (United States)

    Watanabe, Yusuke; Madani, Amin; Ito, Yoichi M; Bilgic, Elif; McKendy, Katherine M; Feldman, Liane S; Fried, Gerald M; Vassiliou, Melina C

    2017-02-01

    The extent to which each item assessed using the Global Operative Assessment of Laparoscopic Skills (GOALS) contributes to the total score remains unknown. The purpose of this study was to evaluate the level of difficulty and discriminative ability of each of the 5 GOALS items using item response theory (IRT). A total of 396 GOALS assessments for a variety of laparoscopic procedures over a 12-year time period were included. Threshold parameters of item difficulty and discrimination power were estimated for each item using IRT. The higher slope parameters seen with "bimanual dexterity" and "efficiency" are indicative of greater discriminative ability than "depth perception", "tissue handling", and "autonomy". IRT psychometric analysis indicates that the 5 GOALS items do not demonstrate uniform difficulty and discriminative power, suggesting that they should not be scored equally. "Bimanual dexterity" and "efficiency" seem to have stronger discrimination. Weighted scores based on these findings could improve the accuracy of assessing individual laparoscopic skills. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Adaptação transcultural dos Bancos de Itens de Ansiedade e Depressão do Patient-Reported Outcomes Measurement Information System (PROMIS) para língua portuguesa

    Directory of Open Access Journals (Sweden)

    Natália Fontes Caputo de Castro

    2014-04-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS), structured into physical and psychosocial domains, has overcome gaps by proposing a new tool for assessing outcomes applicable to chronic diseases, based on advanced statistical techniques (item response theory, IRT) and computerized adaptive testing (CAT). The aim of the study was to culturally adapt the PROMIS Anxiety and Depression Item Banks to Portuguese. The process followed rigorous FACIT recommendations: forward translation, reconciliation, back-translation, FACIT review, independent reviewers, finalization of the steps by FACIT, pretesting, and incorporation of the pretest results. The translated version was pretested in ten patients, and items 3, 46, and 53 of Anxiety and item 46 of Depression required modification. The changes achieved equivalence of meaning, and the final version was compatible with the linguistic and cultural skills of the Brazilian population. It was concluded that the translated version is semantically and conceptually equivalent to the originals.

  17. The Quality of Life of Children Under Chiropractic Care Using PROMIS-25: Results from a Practice-Based Research Network.

    Science.gov (United States)

    Alcantara, Joel; Lamont, Andrea E; Ohm, Jeanne; Alcantara, Junjoe

    2018-04-01

    To characterize pediatric chiropractic and assess pediatric quality of life (QoL). A prospective cohort. Setting/Locations: Individual offices within a practice-based research network located throughout the United States. A convenience sample of children (8-17 years) under chiropractic care and their parents. Chiropractic spinal adjustments and adjunctive therapies. Survey instrument measuring sociodemographic information and correlates from the clinical encounter along with the Patient Reported Outcomes Measurement Information System (PROMIS)-25 to measure QoL (i.e., depression, anxiety, and pain interference). Sociodemographic and clinical correlates were analyzed using descriptive statistics (i.e., frequencies/percentages, means, and standard deviations). The PROMIS-25 data were analyzed using scoring manuals, converting raw scores to T score metric (mean = 50; SD = 10). A generalized linear mixed model was utilized to examine covariates (i.e., sex, number of visits, and motivation for care) that may have played an important role on the PROMIS outcome. The original data set consisted of 915 parent-child dyads. After data cleaning, a total of 881 parents (747 females, 134 males; mean age = 42.03 years) and 881 children (467 females and 414 males; mean age = 12.49 years) comprised this study population. The parents were highly educated and presented their child for mainly wellness care. The mean number of days and patient visits from baseline to comparative QoL measures was 38.12 days and 2.74 (SD = 2.61), respectively. After controlling for the effects of motivation for care, patient visits, duration of complaint, sex, and pain rating, significant differences were observed in the probability of experiencing problems (vs. no reported problems) across all QoL domains (Wald = 82.897, df = 4, p < 0.05). Post hoc comparisons demonstrated the children were less likely to report any symptoms of depression (Wald = 6.1474, df = 1

  18. Adults with an epilepsy history fare significantly worse on positive mental and physical health than adults with other common chronic conditions-Estimates from the 2010 National Health Interview Survey and Patient Reported Outcome Measurement System (PROMIS) Global Health Scale.

    Science.gov (United States)

    Kobau, Rosemarie; Cui, Wanjun; Zack, Matthew M

    2017-07-01

    Healthy People 2020, a national health promotion initiative, calls for increasing the proportion of U.S. adults who self-report good or better health. The Patient-Reported Outcomes Measurement Information System (PROMIS) Global Health Scale (GHS) was identified as a reliable and valid set of items of self-reported physical and mental health to monitor these two domains across the decade. The purpose of this study was to examine the percentage of adults with an epilepsy history who met the Healthy People 2020 target for self-reported good or better health and to compare these percentages to adults with history of other common chronic conditions. Using the 2010 National Health Interview Survey, we compared and estimated the age-standardized prevalence of reporting good or better physical and mental health among adults with five selected chronic conditions including epilepsy, diabetes, heart disease, cancer, and hypertension. We examined response patterns for physical and mental health scale among adults with these five conditions. The percentages of adults with epilepsy who reported good or better physical health (52%) or mental health (54%) were significantly below the Healthy People 2020 target estimate of 80% for both outcomes. Significantly smaller percentages of adults with an epilepsy history reported good or better physical health than adults with heart disease, cancer, or hypertension. Significantly smaller percentages of adults with an epilepsy history reported good or better mental health than adults with all other four conditions. Health and social service providers can implement and enhance existing evidence-based clinical interventions and public health programs and strategies shown to improve outcomes in epilepsy. These estimates can be used to assess improvements in the Healthy People 2020 Health-Related Quality of Life and Well-Being Objective throughout the decade. Published by Elsevier Inc.

  19. Translation and linguistic validation of the Pediatric Patient-Reported Outcomes Measurement Information System measures into simplified Chinese using cognitive interviewing methodology.

    Science.gov (United States)

    Liu, Yanyan; Hinds, Pamela S; Wang, Jichuan; Correia, Helena; Du, Shizheng; Ding, Jian; Gao, Wen Jun; Yuan, Changrong

    2013-01-01

    The Pediatric Patient-Reported Outcomes Measurement Information System (PROMIS) measures were developed using modern measurement theory and tested in a variety of settings to assess the quality of life, function, and symptoms of children and adolescents experiencing a chronic illness and its treatment. Developed in English, this set of measures had not been translated into Chinese. The objective of this study was to develop the Chinese version of the Pediatric PROMIS measures (C-Ped-PROMIS), specifically 8 short forms, and to pretest the translated measures in children and adolescents through cognitive interviewing methodology. The C-Ped-PROMIS was developed following the standard Functional Assessment of Chronic Illness Therapy Translation Methodology. Bilingual teams from the United States and China reviewed the translation to develop a provisional version, which was then pretested with cognitive interview by probing 10 native Chinese-speaking children aged 8 to 17 years in China. The translation was finalized by the bilingual teams. Most items, response options, and instructions were well understood by the children, and some revisions were made to address patient's comments during the cognitive interview. The results indicated that the C-Ped-PROMIS items were semantically and conceptually equivalent to the original. Children aged 8 to 17 years in China were able to comprehend these measures and express their experience and feelings about illness or their life. The C-Ped-PROMIS is available for psychometric validation. Future work will be directed at translating the rest of the item banks, calibrating them and creating a Chinese final version of the short forms.

  20. Item Banks for Substance Use from the Patient-Reported Outcomes Measurement Information System (PROMIS®): Severity of Use and Positive Appeal of Use

    Science.gov (United States)

    Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis

    2015-01-01

    Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
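
    As a reading aid for the record above, the following is a minimal sketch of how a two-parameter graded response model turns a latent trait level into probabilities for an item's ordered response options. It uses the standard Samejima formulation and is not the authors' calibration code. The function name `grm_category_probs` and the discrimination and threshold values are illustrative assumptions, not parameters from the PROMIS substance use banks.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level of the respondent
    a          -- item discrimination (slope); hypothetical value below
    thresholds -- ordered thresholds b_1 < ... < b_{K-1} for K response options
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1 (two-parameter logistic)
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then difference adjacent terms
    bounds = [1.0] + cum + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Illustrative item with 5 response options, hence 4 thresholds (made-up values)
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

    With ordered thresholds, the five probabilities are non-negative and sum to 1. A larger `a` makes the response categories separate more sharply around the thresholds.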

  1. Quality of life and urolithiasis: the patient-reported outcomes measurement information system (PROMIS)

    Directory of Open Access Journals (Sweden)

    Nishant Patel

    Background: With a high rate of recurrence, urolithiasis is a chronic disease that impacts quality of life. The Patient Reported Outcomes Measurement Information System is an NIH-validated questionnaire to assess patient quality of life. We evaluated the impact of urolithiasis on quality of life using the NIH-sponsored PROMIS-43 questionnaire. Materials and Methods: Patients reporting to the kidney stone clinic were interviewed to collect information on stone history and demographic information and were asked to complete the PROMIS-43 questionnaire. Quality of life scores were analyzed using gender and age matched groups for the general US population. Statistical comparisons were made based on demographic information and patient stone history. Statistical significance was P<0.05. Results: 103 patients completed the survey. 36% of respondents were male, the average age of the group was 52 years old, with 58% primary income earners, and 35% primary caregivers. 7% had never passed a stone or had a procedure while 17% passed 10 or more stones in their lifetime. Overall, pain and physical function were worse in patients with urolithiasis. Primary income earners had better quality of life while primary caregivers and those with other chronic medical conditions were worse. Patients on dietary and medical therapy had better quality of life scores. Conclusions: Urolithiasis patients subjectively have worse pain and physical function than the general population. The impact of pain on quality of life was greatest in those patients who had more stone episodes, underscoring the importance of preventive measures. Stone prevention measures improve quality of life.

  2. Incorporating PROMIS Symptom Measures into Primary Care Practice-a Randomized Clinical Trial.

    Science.gov (United States)

    Kroenke, Kurt; Talib, Tasneem L; Stump, Timothy E; Kean, Jacob; Haggstrom, David A; DeChant, Paige; Lake, Kittie R; Stout, Madison; Monahan, Patrick O

    2018-04-05

    Symptoms account for more than 400 million clinic visits annually in the USA. The SPADE symptoms (sleep, pain, anxiety, depression, and low energy/fatigue) are particularly prevalent and undertreated. To assess the effectiveness of providing PROMIS (Patient-Reported Outcome Measure Information System) symptom scores to clinicians on symptom outcomes. Randomized clinical trial conducted from March 2015 through May 2016 in general internal medicine and family practice clinics in an academic healthcare system. Primary care patients who screened positive for at least one SPADE symptom. After completing the PROMIS symptom measures electronically immediately prior to their visit, the 300 study participants were randomized to a feedback group in which their clinician received a visual display of symptom scores or a control group in which scores were not provided to clinicians. The primary outcome was the 3-month change in composite SPADE score. Secondary outcomes were individual symptom scores, symptom documentation in the clinic note, symptom-specific clinician actions, and patient satisfaction. Most patients (84%) had multiple clinically significant (T-score ≥ 55) SPADE symptoms. Both groups demonstrated moderate symptom improvement with a non-significant trend favoring the feedback compared to control group (between-group difference in composite T-score improvement, 1.1; P = 0.17). Symptoms present at baseline resolved at 3-month follow-up only one third of the time, and patients frequently still desired treatment. Except for pain, clinically significant symptoms were documented less than half the time. Neither symptom documentation, symptom-specific clinician actions, nor patient satisfaction differed between treatment arms. Predictors of greater symptom improvement included female sex, black race, fewer medical conditions, and receiving care in a family medicine clinic. Simple feedback of symptom scores to primary care clinicians in the absence of

  3. PROMIS®-29 v2.0 profile physical and mental health summary scores.

    Science.gov (United States)

    Hays, Ron D; Spritzer, Karen L; Schalet, Benjamin D; Cella, David

    2018-03-22

    The PROMIS-29 v2.0 profile assesses pain intensity using a single 0-10 numeric rating item and seven health domains (physical function, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, and sleep disturbance) using four items per domain. This paper describes the development of physical and mental health summary scores for the PROMIS-29 v2.0. We conducted factor analyses of PROMIS-29 scales on data collected from two internet panels (n = 3000 and 2000). Confirmatory factor analyses provided support for a physical health factor defined by physical function, pain (interference and intensity), and ability to participate in social roles and activities, and a mental health factor defined primarily by emotional distress (anxiety and depressive symptoms). Reliabilities for these two summary scores were 0.98 (physical health) and 0.97 (mental health). Correlations of the PROMIS-29 v2.0 physical and mental health summary scores with chronic conditions and other health-related quality of life measures were consistent with a priori hypotheses. This study develops and provides preliminary evidence supporting the reliability and validity of PROMIS-29 v2.0 physical and mental health summary scores that can be used in future studies to assess impacts of health care interventions and track changes in health over time. Further evaluation of these and alternative summary measures is recommended.

  4. Are Mindfulness and Self-Compassion Associated with Sleep and Resilience in Health Professionals?

    Science.gov (United States)

    Kemper, Kathi J; Mo, Xiaokui; Khayat, Rami

    2015-08-01

    To describe the relationship between trainable qualities (mindfulness and self-compassion), with factors conceptually related to burnout and quality of care (sleep and resilience) in young health professionals and trainees. Cross-sectional survey. Large Midwestern academic health center. 213 clinicians and trainees. Sleep and resilience were assessed by using the 8-item PROMIS Sleep scale and the 6-item Brief Resilience Scale. Mindfulness and self-compassion were assessed using the 10-item Cognitive and Affective Mindfulness Scale, Revised and the 12-item Self-Compassion Scale. Health was assessed with Patient-Reported Outcomes Measurement Information System (PROMIS) Global Health measures, and stress was assessed with the 10-item Perceived Stress Scale. After examination of descriptive statistics and Pearson correlations, multiple regression analyses were done to determine whether mindfulness and self-compassion were associated with better sleep and resilience. Respondents had an average age of 28 years; 73% were female. Professions included dieticians (11%), nurses (14%), physicians (38%), social workers (24%), and other (12%). Univariate analyses showed normative values for all variables. Sleep disturbances were significantly and most strongly correlated with perceived stress and poorer health, but also with less mindfulness and self-compassion. Resilience was strongly and significantly correlated with less stress and better mental health, more mindfulness, and more self-compassion. In these young health professionals and trainees, sleep and resilience are correlated with both mindfulness and self-compassion. Prospective studies are needed to determine whether training to increase mindfulness and self-compassion can improve clinicians' sleep and resilience or whether decreasing sleep disturbances and building resilience improves mindfulness and compassion.

  5. Initial report of the cancer Patient-Reported Outcomes Measurement Information System (PROMIS) sexual function committee: review of sexual function measures and domains used in oncology.

    Science.gov (United States)

    Jeffery, Diana D; Tzeng, Janice P; Keefe, Francis J; Porter, Laura S; Hahn, Elizabeth A; Flynn, Kathryn E; Reeve, Bryce B; Weinfurt, Kevin P

    2009-03-15

    For this report, the authors described the initial activities of the Cancer Patient-Reported Outcomes Measurement Information System (PROMIS)-Sexual Function domain group, which is part of the National Institutes of Health Roadmap Initiative to develop brief questionnaires or individually tailored assessments of quality-of-life domains. Presented are a literature review of sexual function measures used in cancer populations and descriptions of the domains found in those measures. By using a consensus-driven approach, an electronic bibliographic search was conducted for articles that were published from 1991 to 2007, and 486 articles were identified for in-depth review. In total, 257 articles reported the administration of a psychometrically evaluated sexual function measure to individuals who were diagnosed with cancer. Apart from the University of California-Los Angeles Prostate Cancer Index, the International Index of Erectile Function, and the Female Sexual Function Index, the 31 identified measures have not been tested widely in cancer populations. Most measures were multidimensional and included domains related to the sexual response cycle and to general sexual satisfaction. The current review supports the need for a flexible, psychometrically robust measure of sexual function for use in oncology settings and strongly justifies the development of the PROMIS-Sexual Function instrument. When the PROMIS-Sexual Function instrument is available publicly, cancer clinicians and researchers will have another measure with which to assess patient-reported sexual function outcomes in addition to the few legacy measures that were identified through this review. Copyright (c) 2009 American Cancer Society.

  6. Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function.

    Science.gov (United States)

    Liegl, Gregor; Gandek, Barbara; Fischer, H Felix; Bjorner, Jakob B; Ware, John E; Rose, Matthias; Fries, James F; Nolte, Sandra

    2017-03-21

    Physical function (PF) is a core patient-reported outcome domain in clinical trials in rheumatic diseases. Frequently used PF measures have ceiling effects, leading to large sample size requirements and low sensitivity to change. In most of these instruments, the response category that indicates the highest PF level is the statement that one is able to perform a given physical activity without any limitations or difficulty. This study investigates whether using an item format with an extended response scale, allowing respondents to state that the performance of an activity is easy or very easy, increases the range of precise measurement of self-reported PF. Three five-item PF short forms were constructed from the Patient-Reported Outcomes Measurement Information System (PROMIS®) wave 1 data. All forms included the same physical activities but varied in item stem and response scale: format A ("Are you able to …"; "without any difficulty"/"unable to do"); format B ("Does your health now limit you …"; "not at all"/"cannot do"); format C ("How difficult is it for you to …"; "very easy"/"impossible"). Each short-form item was answered by 2217-2835 subjects. We evaluated unidimensionality and estimated a graded response model for the 15 short-form items and remaining 119 items of the PROMIS PF bank to compare item and test information for the short forms along the PF continuum. We then used simulated data for five groups with different PF levels to illustrate differences in scoring precision between the short forms using different item formats. Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side of the PF continuum of the sample, provided more item information, and was more useful in distinguishing known groups with above-average functioning. Using an item format with an extended

  7. Careless responding in internet-based quality of life assessments.

    Science.gov (United States)

    Schneider, Stefan; May, Marcella; Stone, Arthur A

    2018-04-01

    Quality of life (QoL) measurement relies upon participants providing meaningful responses, but not all respondents may pay sufficient attention when completing self-reported QoL measures. This study examined the impact of careless responding on the reliability and validity of Internet-based QoL assessments. Internet panelists (n = 2000) completed Patient-Reported Outcomes Measurement Information System (PROMIS®) short-forms (depression, fatigue, pain impact, applied cognitive abilities) and single-item QoL measures (global health, pain intensity) as part of a larger survey that included multiple checks of whether participants paid attention to the items. Latent class analysis was used to identify groups of non-careless and careless responders from the attentiveness checks. Analyses compared psychometric properties of the QoL measures (reliability of PROMIS short-forms, correlations among QoL scores, "known-groups" validity) between non-careless and careless responder groups. Whether person-fit statistics derived from PROMIS measures accurately discriminated careless and non-careless responders was also examined. About 7.4% of participants were classified as careless responders. No substantial differences in the reliability of PROMIS measures between non-careless and careless responder groups were observed. However, careless responding meaningfully and significantly affected the correlations among QoL domains, as well as the magnitude of differences in QoL between medical and disability groups (presence or absence of disability, depression diagnosis, chronic pain diagnosis). Person-fit statistics significantly and moderately distinguished between non-careless and careless responders. The results support the importance of identifying and screening out careless responders to ensure high-quality self-report data in Internet-based QoL research.

  8. The emotion dysregulation inventory: Psychometric properties and item response theory calibration in an autism spectrum disorder sample.

    Science.gov (United States)

    Mazefsky, Carla A; Yu, Lan; White, Susan W; Siegel, Matthew; Pilkonis, Paul A

    2018-04-06

    Individuals with autism spectrum disorder (ASD) often present with prominent emotion dysregulation that requires treatment but can be difficult to measure. The Emotion Dysregulation Inventory (EDI) was created using methods developed by the Patient-Reported Outcomes Measurement Information System (PROMIS®) to capture observable indicators of poor emotion regulation. Caregivers of 1,755 youth with ASD completed 66 candidate EDI items, and the final 30 items were selected based on classical test theory and item response theory (IRT) analyses. The analyses identified two factors: (a) Reactivity, characterized by intense, rapidly escalating, sustained, and poorly regulated negative emotional reactions, and (b) Dysphoria, characterized by anhedonia, sadness, and nervousness. The final items did not show differential item functioning (DIF) based on gender, age, intellectual ability, or verbal ability. Because the final items were calibrated using IRT, even a small number of items offers high precision, minimizing respondent burden. IRT co-calibration of the EDI with related measures demonstrated its superiority in assessing the severity of emotion dysregulation with as few as seven items. Validity of the EDI was supported by expert review, its association with related constructs (e.g., anxiety and depression symptoms, aggression), higher scores in psychiatric inpatients with ASD compared to a community ASD sample, and demonstration of test-retest stability and sensitivity to change. In sum, the EDI provides an efficient and sensitive method to measure emotion dysregulation for clinical assessment, monitoring, and research in youth with ASD of any level of cognitive or verbal ability. Autism Res 2018. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. This paper describes a new measure of poor emotional control called the Emotion Dysregulation Inventory (EDI). Caregivers of 1,755 youth with ASD completed candidate items, and advanced statistical

  9. Development of a composite measure of physical functioning for older persons

    Science.gov (United States)

    Gross, Alden L.; Jones, Richard N.; Inouye, Sharon K.

    2015-01-01

    We scaled a measure of physical functioning to a population-based normative sample by extending self-reported basic and instrumental activities of daily living with items from the MOS SF-12. We used item response theory to place items administered to a sample of older elective surgery patients on a common metric linked to the PROMIS normative sample using published data. The summary measure for physical functioning was internally consistent (Cronbach’s alpha=0.83), reliable across a broad range of functioning, and was moderately correlated with walking speed (r=0.52) and energy expenditure (r=0.40). Demonstrating predictive criterion validity, less impaired scores were associated with lower risk of discharge to a rehabilitation facility (OR=0.38, 95% CI: 0.22, 0.66) and shorter hospital stays (IRR=0.87, 95% CI: 0.79, 0.97). Our approach may facilitate direct comparison of physical functioning measures across existing and future studies using a common, population-based metric, when overlapping items with the NIH PROMIS item bank are present. PMID:25651587

  10. [Christological emphases in the discourse on God's mercy in the reflections of selected Church Fathers]

    OpenAIRE

    Filić, Andrea

    2017-01-01

    The starting question of this article, inspired by the claims of Walter Kasper, is: if mercy by definition necessarily includes suffering, can we speak of it at all, as Kasper urges, as a fundamental attribute of God and a determinant of God's essence, while holding as certain that impassibility is one of the essential properties of the divine nature? This question and the answer to it are treated from a Christological perspective, on the basis of the reflections of selected Church Fathers in whose works we find discourse on God's mercy...

  11. Comparability of the Patient-Reported Outcomes Measurement Information System Pediatric short form symptom measures across culture: examination between Chinese and American children with cancer.

    Science.gov (United States)

    Liu, Yanyan; Yuan, Changrong; Wang, Jichuan; Brown, Jeanne Geiger; Zhou, Fen; Zhao, Xiufang; Shen, Min; Hinds, Pamela S

    2016-10-01

    Patient-Reported Outcomes Measurement Information System (PROMIS) Pediatric forms measure symptoms and function of pediatric patients experiencing chronic disease by using the same measures. Comparability is one of the most important purposes of the PROMIS initiative. This study aimed to test the factorial structures of four symptom measures (i.e., Anxiety, Depression, Fatigue, and Pain Interference) in the original English and the Chinese versions and examine the measurement invariance of the measures across two cultures. Four PROMIS Pediatric measures were used to assess symptoms, respectively, in Chinese (n = 232) and American (n = 200) children and adolescents (8-17 years old) in treatment for cancer or in survivorship. The categorical confirmatory factor analysis (CCFA) model was used to examine factorial structures, and multigroup CCFA was applied to test measurement invariance of these measures between the Chinese and American samples. The CCFA models of the four PROMIS Pediatric symptom measures fit the data well for both the Chinese and American children and adolescents. Minor partial measurement invariance was identified. Factor means and factor variances of the four PROMIS measures were not significantly different between the two populations. Our results provide evidence that the four PROMIS Pediatric symptom measures have valid factorial structures and a statistical property of measurement invariance across American and Chinese children and adolescents with cancer. This means that the items of these measures were interpreted in a conceptually similar manner by two groups. They could be readily used for meaningful cross-cultural comparisons involving pediatric oncology patients in these two countries.

  12. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.

  13. Development of a proxy-reported pulmonary outcome scale for preterm infants with bronchopulmonary dysplasia

    Directory of Open Access Journals (Sweden)

    Laughon Matthew M

    2011-07-01

    Background: To develop an accurate, proxy-reported bedside measurement tool for assessment of the severity of bronchopulmonary dysplasia (also called chronic lung disease) in preterm infants to supplement providers' current biometric measurements of the disease. Methods: We adapted Patient-Reported Outcomes Measurement Information System (PROMIS) methodology to develop the Proxy-Reported Pulmonary Outcomes Scale (PRPOS). A multidisciplinary group of registered nurses, nurse practitioners, neonatologists, developmental specialists, and feeding specialists at five academic medical centers participated in the PRPOS development, which included five phases: (1) identification of domains, items, and responses; (2) item classification and selection using a modified Delphi process; (3) focus group exploration of items and response options; (4) cognitive interviews on a preliminary scale; and (5) final revision before field testing. Results: Each phase of the process helped us to identify, classify, review, and revise possible domains, questions, and response options. The final items for field testing include 26 questions or observations that a nurse assesses before, during, and after routine care time and feeding. Conclusions: We successfully created a prototype scale using modified PROMIS methodology. This process can serve as a model for the development of proxy-reported outcomes scales in other pediatric populations.

  14. Validation of Patient-Reported Outcomes Measurement Information System Short Forms for Use in Childhood-Onset Systemic Lupus Erythematosus.

    Science.gov (United States)

    Jones, Jordan T; Carle, Adam C; Wootton, Janet; Liberio, Brianna; Lee, Jiha; Schanberg, Laura E; Ying, Jun; Morgan DeWitt, Esi; Brunner, Hermine I

    2017-01-01

    To validate the pediatric Patient-Reported Outcomes Measurement Information System short forms (PROMIS-SFs) in childhood-onset systemic lupus erythematosus (SLE) in a clinical setting. At 3 study visits, childhood-onset SLE patients completed the PROMIS-SFs (anger, anxiety, depressive symptoms, fatigue, physical function-mobility, physical function-upper extremity, pain interference, and peer relationships) using the PROMIS assessment center, and health-related quality of life (HRQoL) legacy measures (Pediatric Quality of Life Inventory, Childhood Health Assessment Questionnaire, Simple Measure of Impact of Lupus Erythematosus in Youngsters [SMILEY], and visual analog scales [VAS] of pain and well-being). Physicians rated childhood-onset SLE activity on a VAS and completed the Systemic Lupus Erythematosus Disease Activity Index 2000. Using a global rating scale of change (GRC) between study visits, physicians rated change of childhood-onset SLE activity (GRC-MD1: better/same/worse) and change of patient overall health (GRC-MD2: better/same/worse). Questionnaire scores were compared in support of validity and responsiveness to change (external standards: GRC-MD1, GRC-MD2). In this population-based cohort (n = 100) with a mean age of 15.8 years (range 10-20 years), the PROMIS-SFs were completed in less than 5 minutes in a clinical setting. The PROMIS-SF scores correlated at least moderately (Pearson's r ≥ 0.5) with those of legacy HRQoL measures, except for the SMILEY. Measures of childhood-onset SLE activity did not correlate with the PROMIS-SFs. Responsiveness to change of the PROMIS-SFs was supported by path, mixed-model, and correlation analyses. To assess HRQoL in childhood-onset SLE, the PROMIS-SFs demonstrated feasibility, internal consistency, construct validity, and responsiveness to change in a clinical setting. © 2016, American College of Rheumatology.

  15. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Applied Cognition - General Concerns, Short Forms in Ethnically Diverse Groups.

    Science.gov (United States)

    Fieo, Robert; Ocepek-Welikson, Katja; Kleinman, Marjorie; Eimicke, Joseph P; Crane, Paul K; Cella, David; Teresi, Jeanne A

    2016-01-01

    The goals of these analyses were to examine the psychometric properties and measurement equivalence of a self-reported cognition measure, the Patient Reported Outcome Measurement Information System® (PROMIS®) Applied Cognition - General Concerns short form. These items are also found in the PROMIS Cognitive Function (version 2) item bank. This scale consists of eight items related to subjective cognitive concerns. Differential item functioning (DIF) analyses of gender, education, race, age, and (Spanish) language were performed using an ethnically diverse sample (n = 5,477) of individuals with cancer. This is the first analysis examining DIF in this item set across ethnic and racial groups. DIF hypotheses were derived by asking content experts to indicate whether they posited DIF for each item and to specify the direction. The principal DIF analytic model was item response theory (IRT) using the graded response model for polytomous data, with accompanying Wald tests and measures of magnitude. Sensitivity analyses were conducted using ordinal logistic regression (OLR) with a latent conditioning variable. IRT-based reliability, precision and information indices were estimated. DIF was identified consistently only for the item "brain not working as well as usual". After correction for multiple comparisons, this item showed significant DIF for both the primary and sensitivity analyses. Black respondents and Hispanics in comparison to White non-Hispanic respondents evidenced a lower conditional probability of endorsing the item "brain not working as well as usual". The same pattern was observed for the education grouping variable: as compared to those with a graduate degree, conditioning on overall level of subjective cognitive concerns, those with less than high school education also had a lower probability of endorsing this item. DIF was also observed for age for two items after correction for multiple comparisons for both the IRT and OLR-based models: "I have had

  16. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find item validity and the item discrimination index (D) calculated separately, each with a different formula. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between the testee’s score on a particular item and the testee’s score on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts may overlap, because both reflect the quality of the test in measuring the examinees’ ability. In this paper, examples of data-processing results for item validity and the item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses that include test analyses.
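
    The two quantities contrasted in this record can be computed side by side. The sketch below is illustrative only and is not taken from the paper: it assumes dichotomous (0/1) item scores, uses the uncorrected item-total Pearson correlation as "item validity", and uses the upper-lower group difference (with a conventional 27% split) as D. The function names and the data are hypothetical.

```python
from statistics import mean, pstdev


def item_total_correlation(item, total):
    """Pearson correlation between item scores and total test scores."""
    mi, mt = mean(item), mean(total)
    cov = mean([(i - mi) * (t - mt) for i, t in zip(item, total)])
    return cov / (pstdev(item) * pstdev(total))


def discrimination_index(item, total, fraction=0.27):
    """Upper-lower D: proportion correct in the top group minus the bottom group."""
    n = max(1, round(len(total) * fraction))
    order = sorted(range(len(total)), key=lambda k: total[k])
    low, high = order[:n], order[-n:]
    return mean([item[k] for k in high]) - mean([item[k] for k in low])


# Hypothetical data: one item (0 = wrong, 1 = right) and total scores for 8 examinees.
item = [1, 1, 1, 0, 1, 0, 0, 0]
total = [9, 8, 7, 7, 6, 4, 3, 2]

r_it = item_total_correlation(item, total)
d_index = discrimination_index(item, total)
print(f"item-total r = {r_it:.3f}, D = {d_index:.3f}")
```

    Both statistics rise when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract asks about; they are not numerically identical, because D uses only the extreme groups.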

  17. Sources of interference in item and associative recognition memory.

    Science.gov (United States)

    Osth, Adam F; Dennis, Simon

    2015-04-01

    A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved.

  18. [Evaluation of the Charing Cross Venous Ulcer Questionnaire in patients with chronic venous ulcers in Uruguay].

    Science.gov (United States)

    Tafernaberry, Gabriela; Otero, Gabriela; Agorio, Caroline; Dapueto, Juan J

    2016-01-01

    Chronic venous ulcers (CVU) represent a frequent condition, with difficult therapeutic approaches, that impact on patients’ quality of life, and generate an economic burden to patients and health systems. To perform the cultural adaptation and initial evaluation of the Charing Cross Venous Ulcer Questionnaire (CCVUQ) for Uruguay, and to study the health-related quality of life (HRQL) of patients with CVU. The translated and culturally adapted version of the CCVUQ was applied to a convenience sample of 50 patients. In addition, the PROMIS Global Health Survey was included in the assessment. Both questionnaires showed good internal consistency (Cronbach's alpha > 0.70). A statistically significant association was observed between the CCVUQ total scores, its subscales and both dimensions of the PROMIS: Global Physical (GPH) and Global Mental Health (GMH) (rho ≥ 0.40). The CCVUQ mean score was 54.9 ± 42 points while GPH and GMH mean scores were 37.9 ± 29 points, and 43.1 ± 35.1 points respectively. Simple linear regression showed that patients with higher income reported better emotional well-being, while in younger patients, ulcers had a higher impact on Emotional Status and Cosmetics. The translated and adapted version of the CCVUQ was easy to comprehend and apply, showing good psychometric properties. When used in association with the PROMIS Global Health Measure it provides complementary information. HRQL was severely affected in the study sample.
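
    Internal consistency in this record is reported as Cronbach's alpha. As a reminder of what that statistic measures, the sketch below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small response matrix. The data and the function name are made up for illustration and have nothing to do with the CCVUQ sample.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

    Population and sample variances give the same alpha here, because the ratio of variances is unchanged when both use the same denominator.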

  19. Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

    Science.gov (United States)

    Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

    2012-09-01

    The complexity of health information often exceeds patients' skills to understand and use it. To develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were estimated to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: PLiteracy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.

  20. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
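
    The third stage described in this record, generating items from an item model by systematically combining content, can be pictured with a template and a set of allowed values. The sketch below is a toy illustration under stated assumptions: the stem, the element names, and their values are invented here, and it omits the cognitive model, the constraints between elements, and the answer options that a real item model would carry.

```python
from itertools import product

# Hypothetical item model: a stem template plus the values each element may take.
STEM = (
    "A {age}-year-old patient presents with {symptom} after {procedure}. "
    "What is the best next step?"
)
ELEMENTS = {
    "age": [25, 45, 70],
    "symptom": ["fever", "abdominal pain"],
    "procedure": ["appendectomy", "hernia repair"],
}


def generate_items(stem, elements):
    """Yield one item stem per combination of element values."""
    keys = list(elements)
    for combo in product(*(elements[k] for k in keys)):
        yield stem.format(**dict(zip(keys, combo)))


items = list(generate_items(STEM, ELEMENTS))
print(len(items), "items generated")
print(items[0])
```

    The number of generated items is the product of the element counts, which is how a single item model can yield a large item pool.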

  1. Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

    Science.gov (United States)

    Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

    2016-03-12

    Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.
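
    For readers unfamiliar with the graded response model named in this record, the sketch below shows how it turns a latent trait value into category probabilities: each threshold defines a logistic curve for "responding in this category or higher", and adjacent curves are differenced. This is a generic textbook formulation, not the flexMIRT analysis from the study; the discrimination and threshold values are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a graded response model.

    theta: latent trait value; a: item discrimination;
    thresholds: increasing list of K-1 threshold parameters for K categories.
    """
    def at_or_above(b):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # P(X >= 0) = 1 and P(X >= K) = 0 bracket the cumulative curves.
    cumulative = [1.0] + [at_or_above(b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical four-category item evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.2])
print([round(p, 3) for p in probs])
```

    With ordered thresholds every category probability is positive and the probabilities sum to one, which is the property that makes the model suitable for ordered response categories such as recoded day counts.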

  2. [The effect of adding the bio-activators EM4 and Promi in the production of liquid organic fertilizer from household organic waste]

    Directory of Open Access Journals (Sweden)

    Marlinda

    2015-10-01

    The utilization of household organic waste is increasing every year because of the various problems the waste can cause, such as air pollution, disease, and the danger of flooding. Usable waste includes organic waste such as leftover vegetables, fruit, dried leaves, and twigs. Household organic waste arises mostly from daily food needs, so the quantity produced keeps growing and accumulates because the ground cannot degrade it in significant amounts; this damages the environment through air pollution (odor) and can cause disease. Given this impact, organic waste should be treated in the household and turned into more useful forms such as liquid organic fertilizer. Liquid fertilizers are more easily absorbed by plants and, being concentrated, are more economical because they can be diluted. This research aims to turn household organic waste into liquid fertilizer and to examine the effect of the bio-activators EM4 and Promi on the organic C content of the liquid fertilizer produced. In the method used, vegetable waste such as kale, mustard greens, spinach, and carrots, as well as dried leaves (300 g), was cleaned, cut into small pieces, placed in the composter, lightly misted or moistened with a bio-activator beforehand, and then fermented for 7 days. The fermentation was carried out with bio-activator volumes of 2.5 mL, 5 mL, 7.5 mL, 10 mL, and 12.5 mL. Both EM4 and Promi can be used as bio-activators for producing organic liquid fertilizer, but EM4 was more effective in degrading organic waste, yielding an organic C content of approximately 23% compared with approximately 18% for Promi; the same holds for the other components: for EM4, nitrogen 3.8%, P2O5 3.0%, and K2O 4.2%; for Promi, nitrogen 3.2%, levels of 2

  3. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  4. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  5. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made.

  6. IRT-Estimated Reliability for Tests Containing Mixed Item Formats

    Science.gov (United States)

    Shu, Lianghua; Schwarz, Richard D.

    2014-01-01

    As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…

  7. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
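
    The unimodality condition quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the standard partial credit model with unit discrimination and categories scored 0, 1, 2, for which the item information equals the conditional variance of the item score. The function names and the two example threshold gaps are illustrative assumptions.

```python
import numpy as np

def pcm_item_information(theta, delta1, delta2):
    """Item information of a three-category partial credit item.

    Assumes unit discrimination, so the information at each theta
    equals the variance of the item score (0, 1 or 2) given theta.
    """
    theta = np.asarray(theta, dtype=float)
    # Cumulative logits for categories 0, 1 and 2.
    logits = np.stack([
        np.zeros_like(theta),
        theta - delta1,
        2.0 * theta - delta1 - delta2,
    ])
    logits -= logits.max(axis=0)          # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=0)
    scores = np.array([0.0, 1.0, 2.0])[:, None]
    mean = (scores * probs).sum(axis=0)
    return (scores ** 2 * probs).sum(axis=0) - mean ** 2

def count_modes(values):
    """Count strict interior local maxima of a sampled curve."""
    interior = values[1:-1]
    return int(np.sum((interior > values[:-2]) & (interior > values[2:])))

theta_grid = np.linspace(-8.0, 8.0, 1601)
threshold = 4.0 * np.log(2.0)             # about 2.77

# Hypothetical items, centred at 0, with gaps below and above the threshold.
narrow = pcm_item_information(theta_grid, -1.0, 1.0)   # gap 2.0 < 4 ln 2
wide = pcm_item_information(theta_grid, -2.0, 2.0)     # gap 4.0 > 4 ln 2
print(count_modes(narrow), count_modes(wide))
```

    With the gap below 4 ln 2 the sampled curve has a single peak at the midpoint of the two thresholds; with the larger gap the midpoint becomes a local minimum flanked by two peaks.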

  8. Quantitative Analysis of Complex Multiple-Choice Items in Science Technology and Society: Item Scaling

    Directory of Open Access Journals (Sweden)

    Ángel Vázquez Alonso

    2005-05-01

    The scarce attention to assessment and evaluation in science education research has been especially harmful for Science-Technology-Society (STS) education, due to the dialectic, tentative, value-laden, and controversial nature of most STS topics. To overcome the methodological pitfalls of the STS assessment instruments used in the past, an empirically developed instrument (VOSTS, Views on Science-Technology-Society) has been suggested. Some methodological proposals, namely the multiple response models and the computing of a global attitudinal index, were suggested to improve the item implementation. The final step of these methodological proposals requires the categorization of STS statements. This paper describes the process of categorization through a scaling procedure ruled by a panel of experts, acting as judges, according to the body of knowledge from history, epistemology, and sociology of science. The statement categorization allows for the sound foundation of STS items, which is useful in educational assessment and science education research, and may also increase teachers’ self-confidence in the development of the STS curriculum for science classrooms.

  9. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  10. An item response theory analysis of Harter's Self-Perception Profile for children or why strong clinical scales should be distrusted.

    Science.gov (United States)

    Egberink, Iris J L; Meijer, Rob R

    2011-06-01

    The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343) were compared. The authors found that most scales formed weak scales and that measurement precision was relatively low and only present for latent trait values indicating low self-perception. The subscales Physical Appearance and Global Self-Worth formed one strong scale. Children seem to interpret Global Self-Worth items as if they measure Physical Appearance. Furthermore, the authors found that strong Mokken scales (such as Global Self-Worth) consisted mostly of items that repeat the same item content. They conclude that researchers should be very careful in interpreting the total scores on the different Self-Perception Profile for Children scales. Finally, implications for further research are discussed.

  11. Bifactor and Item Response Theory Analyses of Interviewer Report Scales of Cognitive Impairment in Schizophrenia

    Science.gov (United States)

    Reise, Steven P.; Ventura, Joseph; Keefe, Richard S. E.; Baade, Lyle E.; Gold, James M.; Green, Michael F.; Kern, Robert S.; Mesholam-Gately, Raquelle; Nuechterlein, Keith H.; Seidman, Larry J.; Bilder, Robert

    2011-01-01

    A psychometric analysis of 2 interview-based measures of cognitive deficits was conducted: the 21-item Clinical Global Impression of Cognition in Schizophrenia (CGI-CogS; Ventura et al., 2008), and the 20-item Schizophrenia Cognition Rating Scale (SCoRS; Keefe et al., 2006), which were administered on 2 occasions to a sample of people with…

  12. The development of a clinical outcomes survey research application: Assessment Center.

    Science.gov (United States)

    Gershon, Richard; Rothrock, Nan E; Hanrahan, Rachel T; Jansky, Liz J; Harniss, Mark; Riley, William

    2010-06-01

    The National Institutes of Health sponsored Patient-Reported Outcome Measurement Information System (PROMIS) aimed to create item banks and computerized adaptive tests (CATs) across multiple domains for individuals with a range of chronic diseases. Web-based software was created to enable a researcher to create study-specific Websites that could administer PROMIS CATs and other instruments to research participants or clinical samples. This paper outlines the process used to develop a user-friendly, free, Web-based resource (Assessment Center) for storage, retrieval, organization, sharing, and administration of patient-reported outcomes (PRO) instruments. Joint Application Design (JAD) sessions were conducted with representatives from numerous institutions in order to supply a general wish list of features. Use Cases were then written to ensure that end user expectations matched programmer specifications. Program development included daily programmer "scrum" sessions, weekly Usability Acceptability Testing (UAT) and continuous Quality Assurance (QA) activities pre- and post-release. Assessment Center includes features that promote instrument development including item histories, data management, and storage of statistical analysis results. This case study of software development highlights the collection and incorporation of user input throughout the development process. Potential future applications of Assessment Center in clinical research are discussed.

  13. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  14. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  15. Piecewise Polynomial Fitting with Trend Item Removal and Its Application in a Cab Vibration Test

    Directory of Open Access Journals (Sweden)

    Wu Ren

    2018-01-01

    The trend item of a long-term vibration signal is difficult to remove. This paper proposes a piecewise integration method to remove trend items. Examples of direct integration without trend item removal, global integration after piecewise polynomial fitting with trend item removal, and direct integration after piecewise polynomial fitting with trend item removal were simulated. The results showed that direct integration of the fitted piecewise polynomial provided greater acceleration and displacement precision than the other two integration methods. A vibration test was then performed on a special equipment cab. The results indicated that direct integration by piecewise polynomial fitting with trend item removal was highly consistent with the measured signal data. However, the direct integration method without trend item removal resulted in signal distortion. The proposed method can help with frequency domain analysis of vibration signals and modal parameter identification for such equipment.
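
    As a rough illustration of the general idea of piecewise polynomial trend removal, the sketch below splits a signal into segments, fits a low-order polynomial to each segment by least squares, and subtracts the fit. It is not the paper's algorithm (which also covers the integration step); the function name, segment count, polynomial degree, and the simulated signal are all assumptions chosen for illustration.

```python
import numpy as np

def remove_piecewise_trend(t, x, n_segments=4, degree=2):
    """Subtract a separately fitted polynomial trend from each segment."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    detrended = np.empty_like(x)
    for idx in np.array_split(np.arange(len(x)), n_segments):
        # Centre the time axis within the segment to keep the fit well conditioned.
        t_centered = t[idx] - t[idx].mean()
        coeffs = np.polyfit(t_centered, x[idx], degree)
        detrended[idx] = x[idx] - np.polyval(coeffs, t_centered)
    return detrended

# Simulated example: a 50 Hz vibration riding on a slow quadratic drift.
t = np.linspace(0.0, 4.0, 4000)
vibration = np.sin(2.0 * np.pi * 50.0 * t)
drift = 0.5 * t ** 2 - t + 3.0
recovered = remove_piecewise_trend(t, vibration + drift)
print(np.max(np.abs(recovered - vibration)))
```

    The residual error comes from the small part of the oscillation that the per-segment polynomial absorbs; it shrinks as the vibration frequency grows relative to the segment length.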

  16. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    A four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implies a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. A graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30 % of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic [...] items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p [...] Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  17. Hand-related physical function in rheumatic hand conditions

    DEFF Research Database (Denmark)

    Klokker, Louise; Terwee, Caroline B; Wæhrens, Eva Ejlersen

    2016-01-01

    INTRODUCTION: There is no consensus about what constitutes the most appropriate patient-reported outcome measurement (PROM) instrument for measuring physical function in patients with rheumatic hand conditions. Existing instruments lack psychometric testing and vary in feasibility and their psychometric qualities. We aim to develop a PROM instrument to assess hand-related physical function in rheumatic hand conditions. METHODS AND ANALYSIS: We will perform a systematic search to identify existing PROMs to rheumatic hand conditions, and select items relevant for hand-related physical function as well as those items from the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank that are relevant to patients with rheumatic hand conditions. Selection will be based on consensus among reviewers. Content validity of selected items will be established...

  18. Hand-related physical function in rheumatic hand conditions

    DEFF Research Database (Denmark)

    Klokker, Louise; Terwee, Caroline; Wæhrens, Eva Elisabet Ejlersen

    2016-01-01

    INTRODUCTION: There is no consensus about what constitutes the most appropriate patient-reported outcome measurement (PROM) instrument for measuring physical function in patients with rheumatic hand conditions. Existing instruments lack psychometric testing and vary in feasibility and their psychometric qualities. We aim to develop a PROM instrument to assess hand-related physical function in rheumatic hand conditions. METHODS AND ANALYSIS: We will perform a systematic search to identify existing PROMs to rheumatic hand conditions, and select items relevant for hand-related physical function as well as those items from the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank that are relevant to patients with rheumatic hand conditions. Selection will be based on consensus among reviewers. Content validity of selected items will be established...

  19. Validation of the MOS Social Support Survey 6-item (MOS-SSS-6) measure with two large population-based samples of Australian women.

    Science.gov (United States)

    Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J

    2014-12-01

    This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.
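
    Internal consistency of the kind reported above is commonly summarised with Cronbach's alpha, computed from the item variances and the variance of the total score. The sketch below shows that textbook formula on a tiny made-up response matrix; the data and the function name are illustrative assumptions and have no connection to the study's cohorts.

```python
import numpy as np

def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    responses = np.asarray(responses, dtype=float)
    n_items = responses.shape[1]
    item_variances = responses.var(axis=0, ddof=1)
    total_variance = responses.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical data: three respondents answering two items.
example = [[1, 2],
           [2, 1],
           [3, 3]]
print(cronbach_alpha(example))
```

    For this toy matrix each item has variance 1 and the total score has variance 3, so alpha is 2 × (1 − 2/3) = 2/3. When all items are identical the coefficient reaches its maximum of 1.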

  20. Development and validation of the functional assessment of cancer therapy-antiangiogenesis subscale.

    Science.gov (United States)

    Kaiser, Karen; Beaumont, Jennifer L; Webster, Kimberly; Yount, Susan E; Wagner, Lynne I; Kuzel, Timothy M; Cella, David

    2015-05-01

    The Functional Assessment of Cancer Therapy (FACT)-Antiangiogenesis (AntiA) Subscale was developed and validated to enhance treatment decision-making and side effect management for patients receiving anti-angiogenesis therapies. Side effects related to anti-angiogenesis therapies were identified from the literature, clinician input, and patient input. Fifty-nine possible patient expressions of side effects were generated. Patient and clinician ratings of the importance of these expressions led us to develop a 24-item questionnaire with clinical and research potential. To assess the scale's reliability and validity, 167 patients completed the AntiA Subscale, the Functional Assessment of Cancer Therapy-general (FACT-G), the FACT-Kidney Symptom Index (FKSI), the FACIT-Fatigue Subscale, the Global Rating of Change Scale (GRC), and the PROMIS Global Health Scale. Patient responses to the AntiA were analyzed for internal consistency, test-retest reliability, convergent and discriminant validity, and responsiveness to change in clinical status. All tested scales were found to have good internal consistency reliability (Cronbach's alpha 0.70-0.92). Test-retest reliability was also good (0.72-0.88) for total and subscale scores and lower for individual items. The total score, subscale scores, and all single items (except nosebleeds) significantly differentiated between groups defined by level of side effect bother. Evaluation of responsiveness to change in this study was not conclusive, suggesting an area for further research. The AntiA is a reliable and valid measure of side effects from anti-angiogenesis therapy. © 2014 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

  1. Neural Global Pattern Similarity Underlies True and False Memories.

    Science.gov (United States)

    Ye, Zhifang; Zhu, Bi; Zhuang, Liping; Lu, Zhonglin; Chen, Chuansheng; Xue, Gui

    2016-06-22

    The neural processes giving rise to human memory strength signals remain poorly understood. Inspired by formal computational models that posit a central role of global matching in memory strength, we tested a novel hypothesis that the strengths of both true and false memories arise from the global similarity of an item's neural activation pattern during retrieval to that of all the studied items during encoding (i.e., the encoding-retrieval neural global pattern similarity [ER-nGPS]). We revealed multiple ER-nGPS signals that carried distinct information and contributed differentially to true and false memories: Whereas the ER-nGPS in the parietal regions reflected semantic similarity and was scaled with the recognition strengths of both true and false memories, ER-nGPS in the visual cortex contributed solely to true memory. Moreover, ER-nGPS differences between the parietal and visual cortices were correlated with frontal monitoring processes. By combining computational and neuroimaging approaches, our results advance a mechanistic understanding of memory strength in recognition. What neural processes give rise to memory strength signals, and lead to our conscious feelings of familiarity? Using fMRI, we found that the memory strength of a given item depends not only on how it was encoded during learning, but also on the similarity of its neural representation with other studied items. The global neural matching signal, mainly in the parietal lobule, could account for the memory strengths of both studied and unstudied items. Interestingly, a different global matching signal, originated from the visual cortex, could distinguish true from false memories. The findings reveal multiple neural mechanisms underlying the memory strengths of events registered in the brain. Copyright © 2016 the authors 0270-6474/16/366792-11$15.00/0.

  2. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  3. The role of attention in item-item binding in visual working memory.

    Science.gov (United States)

    Peterson, Dwight J; Naveh-Benjamin, Moshe

    2017-09-01

    An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  4. Item analysis of ADAS-Cog: effect of baseline cognitive impairment in a clinical AD trial.

    Science.gov (United States)

    Sevigny, Jeffrey J; Peng, Yahong; Liu, Lian; Lines, Christopher R

    2010-03-01

    We explored the association of Alzheimer's disease (AD) Assessment Scale (ADAS-Cog) item scores with AD severity using cross-sectional and longitudinal data from the same study. Post hoc analyses were performed using placebo data from a 12-month trial of patients with mild-to-moderate AD (N = 281 randomized, N = 209 completed). Baseline distributions of ADAS-Cog item scores by Mini-Mental State Examination (MMSE) score and Clinical Dementia Rating (CDR) sum of boxes score (measures of dementia severity) were estimated using local and nonparametric regressions. Mixed-effect models were used to characterize ADAS-Cog item score changes over time by dementia severity (MMSE: mild = 21-26, moderate = 14-20; global CDR: mild = 0.5-1, moderate = 2). In the cross-sectional analysis of baseline ADAS-Cog item scores, orientation was the most sensitive item to differentiate patients across levels of cognitive impairment. Several items showed a ceiling effect, particularly in milder AD. In the longitudinal analysis of change scores over 12 months, orientation was the only item with noticeable decline (8%-10%) in mild AD. Most items showed modest declines (5%-20%) in moderate AD.

  5. A Construction of Global Literacy Indicators for Undergraduates

    Directory of Open Access Journals (Sweden)

    Lung-Sheng Lee

    2017-06-01

    In the era of glocalization and loglobalization, only university graduates who are globally literate can effectively deal with international affairs or work overseas. Therefore, this study aimed to construct a set of global literacy indicators for undergraduates in Taiwan. The global literacy indicators can be used as a guide to assess undergraduates’ global literacy level and serve as the foundation for developing global education curriculums. Employing a theoretical framework, this study drafted global literacy dimensions and indicators from reviewing related literature, and invited 18 practitioners with international experience to participate in this study. During the item development process, the fuzzy Delphi method (FDM) and the analytic hierarchy process (AHP) were applied to select and weight global literacy indicators, respectively. Consequently, a set of global literacy indicators for undergraduates was constructed, which includes the following four dimensions: communication, context, career development, and culture. “Communication” is the most important dimension among them, while “communicate with foreign languages” and “use information and communication technology (ICT) to communicate with others” are the most important indicators and items at the second and third hierarchical levels, respectively.
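
    To make the AHP weighting step concrete, the sketch below derives priority weights from a pairwise comparison matrix using the principal-eigenvector method that is standard for AHP, together with the usual consistency index. It is only an illustration: the comparison values are invented, the function name is hypothetical, and nothing here reproduces the study's expert judgements or results.

```python
import numpy as np

def ahp_weights(pairwise):
    """Priority weights and consistency index from a pairwise comparison matrix."""
    matrix = np.asarray(pairwise, dtype=float)
    n = matrix.shape[0]
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    principal = np.argmax(eigenvalues.real)
    # The principal eigenvector has entries of one sign; normalise it to sum to 1.
    weights = np.abs(eigenvectors[:, principal].real)
    weights /= weights.sum()
    lambda_max = eigenvalues[principal].real
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index

# Made-up, perfectly consistent comparisons among four dimensions,
# generated from assumed weights so that entry (i, j) is w_i / w_j.
assumed = np.array([0.4, 0.3, 0.2, 0.1])
comparisons = assumed[:, None] / assumed[None, :]
weights, ci = ahp_weights(comparisons)
print(weights, ci)
```

    Because the invented matrix is perfectly consistent, the recovered weights match the assumed ones and the consistency index is numerically zero; real expert judgements would normally give a small positive index.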

  6. A single-item global job satisfaction measure is associated with quantitative blood immune indices in white-collar employees.

    Science.gov (United States)

    Nakata, Akinori; Irie, Masahiro; Takahashi, Masaya

    2013-01-01

    Although a single-item job satisfaction measure has been shown to be as reliable and inclusive as multiple-item scales in relation to health, studies including immunological data are few. The purpose of this study was to evaluate the validity of single-item job and family life satisfaction based on its association with immune indices. A total of 189 white-collar employees (70% men) underwent a blood draw for the measurement of natural killer (NK), total T, and B cell counts as well as plasma immunoglobulin (Ig) G concentrations and completed single-item job and family life satisfaction measures, respectively. The response options for satisfaction measures were 'dissatisfied' (coded 1) to 'satisfied' (coded 4). Spearman's partial correlations controlling for cofactors revealed that increased job satisfaction was positively associated with NK cells (rsp=0.201, p=0.007) and IgG (rsp=0.178, p=0.018), while family life satisfaction was unrelated to immune indices. Those who reported a combination of low job/low family life satisfaction had significantly lower NK and higher B cell counts than those with a high job/high family life satisfaction. Our study suggests that the single-item summary measure of job satisfaction, but not family life satisfaction, may be a valid tool to evaluate immune status in healthy white-collar employees.

  7. Measuring Environmental Factors: Unique and Overlapping International Classification of Functioning, Disability and Health Coverage of 5 Instruments.

    Science.gov (United States)

    Heinemann, Allen W; Miskovic, Ana; Semik, Patrick; Wong, Alex; Dashner, Jessica; Baum, Carolyn; Magasi, Susan; Hammel, Joy; Tulsky, David S; Garcia, Sofia F; Jerousek, Sara; Lai, Jin-Shei; Carlozzi, Noelle E; Gray, David B

    2016-12-01

    OBJECTIVE: To describe the unique and overlapping content of the newly developed Environmental Factors Item Banks (EFIB) and 7 legacy environmental factor instruments, and to evaluate the EFIB's construct validity by examining associations with legacy instruments. DESIGN: Cross-sectional, observational cohort. SETTING: Community. PARTICIPANTS: A sample of community-dwelling adults with stroke, spinal cord injury, and traumatic brain injury (N=568). INTERVENTIONS: None. MAIN OUTCOME MEASURES: EFIB covering domains of the built and natural environment; systems, services, and policies; social environment; and access to information and technology; the Craig Hospital Inventory of Environmental Factors (CHIEF) short form; the Facilitators and Barriers Survey/Mobility (FABS/M) short form; the Home and Community Environment Instrument (HACE); the Measure of the Quality of the Environment (MQE) short form; and 3 of the Patient Reported Outcomes Measurement Information System's (PROMIS) Quality of Social Support measures. RESULTS: The EFIB and legacy instruments assess most of the International Classification of Functioning, Disability and Health (ICF) environmental factors chapters, including chapter 1 (products and technology; 75 items corresponding to 11 codes), chapter 2 (natural environment and human-made changes; 31 items corresponding to 7 codes), chapter 3 (support and relationships; 74 items corresponding to 7 codes), chapter 4 (attitudes; 83 items corresponding to 8 codes), and chapter 5 (services, systems, and policies; 72 items corresponding to 16 codes). Construct validity is provided by moderate correlations between EFIB measures and the CHIEF, MQE barriers, HACE technology mobility, FABS/M community built features, and PROMIS item banks and by small correlations with other legacy instruments. Only 5 of the 66 legacy instrument correlation coefficients are moderate, suggesting they measure unique aspects of the environment, whereas all intra-EFIB correlations were at least moderate. CONCLUSIONS: The EFIB measures provide a brief and focused assessment of ICF

  8. Global World: A Problem of Governance

    Science.gov (United States)

    Chumakov, Alexander Nikolayevich

    2014-01-01

    Purpose: The purpose of this paper is to include the following items: to show the absolute necessity of managing the international community, to explore the fundamental possibility of managing the global world, to prove or disprove such a possibility, to determine the real background of global governance in modern conditions and to show the…

  9. Attitudes towards Internationalism through the Lens of Cognitive Effort, Global Mindset, and Cultural Intelligence

    Science.gov (United States)

    Romano, Joan; Platania, Judith

    2014-01-01

    In the current study we examine attitudes towards internationalism through the lens of a specific set of constructs necessary in defining an effective global leader. One hundred fifty-nine undergraduates responded to items measuring need for cognition, cultural intelligence, and a set of items measuring the correlates of global mindset. In…

  10. Shaping Collaboration 2006: action items for the LHC

    Energy Technology Data Exchange (ETDEWEB)

    Goldfarb, S [CERN-PH, 1211 Geneva 23 (Switzerland)]; Herr, J; Neal, H A [Assistant Research Scientist, University of Michigan (United States); Research Process Manager, University of Michigan (United States); Professor of Physics, University of Michigan (United States)], E-mail: steven.goldfarb@cern.ch

    2008-07-15

    Shaping Collaboration 2006 [1] was a workshop held in Geneva, on December 11-13, 2006, to examine the status and future of collaborative tool technology and its usage for large global scientific collaborations, such as those of the CERN LHC [2]. The workshop brought together some of the leading experts in the field of collaborative tools (WACE 2006) [3] with physicists and developers of the LHC collaborations and HENP (High-Energy and Nuclear Physics). We highlight important presentations and key discussions held during the workshop, then focus on a large and aggressive set of goals and specific action items targeted at institutes from all levels of the LHC organization. This list of action items, assembled during a panel discussion at the close of the LHC sessions, includes recommendations for the LHC Users, their Universities, Project Managers, Spokespersons, National Funding Agencies and Host Laboratories. We present this list, along with suggestions for priorities in addressing the immediate and long-term needs of HENP.

  11. Shaping Collaboration 2006: action items for the LHC

    International Nuclear Information System (INIS)

    Goldfarb, S; Herr, J; Neal, H A

    2008-01-01

    Shaping Collaboration 2006 [1] was a workshop held in Geneva, on December 11-13, 2006, to examine the status and future of collaborative tool technology and its usage for large global scientific collaborations, such as those of the CERN LHC [2]. The workshop brought together some of the leading experts in the field of collaborative tools (WACE 2006) [3] with physicists and developers of the LHC collaborations and HENP (High-Energy and Nuclear Physics). We highlight important presentations and key discussions held during the workshop, then focus on a large and aggressive set of goals and specific action items targeted at institutes from all levels of the LHC organization. This list of action items, assembled during a panel discussion at the close of the LHC sessions, includes recommendations for the LHC Users, their Universities, Project Managers, Spokespersons, National Funding Agencies and Host Laboratories. We present this list, along with suggestions for priorities in addressing the immediate and long-term needs of HENP.

  12. Differential item functioning magnitude and impact measures from item response theory models.

    Science.gov (United States)

    Kleinman, Marjorie; Teresi, Jeanne A

    2016-01-01

    Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages are described that provide magnitude and impact measures, and new software is presented that computes all of the available statistics conveniently in one program, with explanations of their relationships to one another.

  13. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  14. Sleep disturbances in systemic sclerosis: evidence for the role of gastrointestinal symptoms, pain and pruritus.

    Science.gov (United States)

    Milette, Katherine; Hudson, Marie; Körner, Annett; Baron, Murray; Thombs, Brett D

    2013-09-01

    SSc is a rare autoimmune CTD characterized by thickening and fibrosis of skin and internal organs. There is significant mortality and no cure. Sleep disturbance has been identified as an important contributor to poor quality of life. The objective was to investigate socio-demographic and medical factors potentially associated with sleep disturbance in SSc. The sample consisted of patients from the Canadian Scleroderma Research Group's (CSRG) 15-centre, pan-Canadian Registry assessed with the 8-item Patient-Reported Outcome Measurement Information System (PROMIS) sleep disturbance scale short form, version 1.0. Pearson's correlations were used to assess bivariate association of socio-demographic and medical variables with PROMIS sleep scores. The independent association of PROMIS sleep disturbance scores and factors previously identified as associated with sleep disturbance in the general population, in SSc and other rheumatic diseases, was assessed using multiple linear regression. Among 397 patients in the study (88% female, mean age 57.5 years), 25% (n = 98) had diffuse cutaneous SSc. Mean duration since onset of non-RP symptoms was 10.6 years. Number of gastrointestinal symptoms (standardized regression coefficient β = 0.19, P = 0.001), pain severity (β = 0.21, P […] sleep disturbance. Gastrointestinal symptoms, pain and pruritus were associated with sleep disturbance in SSc. Additional research is needed on sleep in SSc so that well-informed sleep interventions can be developed and tested.

  15. Using item response theory to address vulnerabilities in FFQ.

    Science.gov (United States)

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  16. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  17. Re-Examining Test Item Issues in the TIMSS Mathematics and Science Assessments

    Science.gov (United States)

    Wang, Jianjun

    2011-01-01

    As the largest international study ever undertaken, the Trends in International Mathematics and Science Study (TIMSS) has been held as a benchmark to measure U.S. student performance in the global context. In-depth analyses of the TIMSS project are conducted in this study to examine key issues of the comparative investigation: (1) item flaws in mathematics…

  18. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  19. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  20. Teoria da Resposta ao Item / Teoria de la respuesta al item / Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is an old one, and many studies and proposed methods have been developed to achieve this goal. Among the proposed methods, Item Response Theory (IRT) stands out; it initially emerged to address limitations of Classical Test Theory, which is still widely used today in the measurement of psychological traits. The main point of IRT is that it takes each item into account individually, without emphasizing total scores; therefore, the findings depend not only on the test or questionnaire but on each item that composes it. This article aims to present this theory, which revolutionized measurement theory.

  1. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  2. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  3. Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

    Science.gov (United States)

    Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

    2015-07-01

    The Geriatric Anxiety Scale (GAS; Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.

  4. Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

    Science.gov (United States)

    Cher Wong, Cheow

    2015-01-01

    Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

  5. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  6. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  7. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    Science.gov (United States)

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  8. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  9. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  10. A Method for the Comparison of Item Selection Rules in Computerized Adaptive Testing

    Science.gov (United States)

    Barrada, Juan Ramon; Olea, Julio; Ponsoda, Vicente; Abad, Francisco Jose

    2010-01-01

    In a typical study comparing the relative efficiency of two item selection rules in computerized adaptive testing, the common result is that they simultaneously differ in accuracy and security, making it difficult to reach a conclusion on which is the more appropriate rule. This study proposes a strategy to conduct a global comparison of two or…

  11. Assessment of the Item Selection and Weighting in the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis

    Science.gov (United States)

    MAHR, ALFRED D.; NEOGI, TUHINA; LAVALLEY, MICHAEL P.; DAVIS, JOHN C.; HOFFMAN, GARY S.; MCCUNE, W. JOSEPH; SPECKS, ULRICH; SPIERA, ROBERT F.; ST.CLAIR, E. WILLIAM; STONE, JOHN H.; MERKEL, PETER A.

    2013-01-01

    Objective: To assess the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis (BVAS/WG) with respect to its selection and weighting of items. Methods: This study used the BVAS/WG data from the Wegener's Granulomatosis Etanercept Trial. The scoring frequencies of the 34 predefined items and any “other” items added by clinicians were calculated. Using linear regression with generalized estimating equations in which the physician global assessment (PGA) of disease activity was the dependent variable, we computed weights for all predefined items. We also created variables for clinical manifestations frequently added as other items, and computed weights for these as well. We searched for the model that included the items and their generated weights yielding an activity score with the highest R² to predict the PGA. Results: We analyzed 2,044 BVAS/WG assessments from 180 patients; 734 assessments were scored during active disease. The highest R² with the PGA was obtained by scoring WG activity based on the following items: the 25 predefined items rated on ≥5 visits, the 2 newly created fatigue and weight loss variables, the remaining minor other and major other items, and a variable that signified whether new or worse items were present at a specific visit. The weights assigned to the items ranged from 1 to 21. Compared with the original BVAS/WG, this modified score correlated significantly more strongly with the PGA. Conclusion: This study suggests possibilities to enhance the item selection and weighting of the BVAS/WG. These changes may increase this instrument's ability to capture the continuum of disease activity in WG. PMID: 18512722

  12. P2-19: The Effect of item Repetition on Item-Context Association Depends on the Prior Exposure of Items

    Directory of Open Access Journals (Sweden)

    Hongmi Lee

    2012-10-01

    Previous studies have reported conflicting findings on whether item repetition has beneficial or detrimental effects on source memory. To reconcile such contradictions, we investigated whether the degree of pre-exposure of items can be a potential modulating factor. The experimental procedures spanned two consecutive days. On Day 1, participants were exposed to a set of unfamiliar faces. On Day 2, the same faces presented on the previous day were used again for half of the participants, whereas novel faces were used for the other half. Day 2 procedures consisted of three successive phases: item repetition, source association, and source memory test. In the item repetition phase, half of the face stimuli were repeatedly presented while participants were making male/female judgments. During the source association phase, both the repeated and the unrepeated faces appeared in one of the four locations on the screen. Finally, participants were tested on the location in which a given face was presented during the previous phase and reported the confidence of their memory. Source memory accuracy was measured as the percentage of correct non-guess trials. We found a significant interaction between prior exposure and repetition. Repetition impaired source memory when the items had been pre-exposed on Day 1, while it led to greater accuracy for novel ones. These results show that pre-experimental exposure can modulate the effects of repetition on associative binding between an item and its contextual information, suggesting that pre-existing representation and novelty signal interact to form new episodic memory.

  13. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo ePark

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  14. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  15. The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings.

    Science.gov (United States)

    Grigg, Kaine; Manderson, Lenore

    2016-03-17

    Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed. Here, we examine RACES' psychometric properties, including the latent structure, utilising Item Response Theory (IRT). Unidimensional and Multidimensional Rating Scale Model (RSM) Rasch analyses were utilised with 296 Victorian primary school students and 182 adolescents and 220 adults from the Australian community. RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults. RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.

  16. Credit financing for deteriorating imperfect quality items with allowable shortages

    Directory of Open Access Journals (Sweden)

    Aditi Khanna

    2016-01-01

    The advent of new technologies, systems and applications in the manufacturing sector has no doubt lightened our workload, yet the chance causes of variation in a production system cannot be eliminated completely. Every produced or ordered lot may contain some fraction of defectives, which may vary from process to process. The situation is more delicate still when the items are deteriorating in nature. However, the defective items can be separated from the good-quality lot through a careful inspection process. Thus, a screening process is obligatory in today’s technology-driven industry, which has customer satisfaction as its only motto. Moreover, in order to survive in current global markets, credit financing has proven to be a very influential promotional tool to attract new customers and a good inducement policy for retailers. With this scenario in mind, the present paper investigates an inventory model for a retailer dealing with imperfect-quality deteriorating items under permissible delay in payments. Shortages are allowed and fully backlogged. The model jointly optimizes the order quantity and shortages by maximizing the expected total profit. A mathematical model is developed to depict this scenario. Results have been validated with the help of a numerical example. A comprehensive sensitivity analysis is also presented.

  17. The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

    Science.gov (United States)

    Siskind, Theresa G.; Anderson, Lorin W.

    The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…

  18. Self-Esteem and Method Effects Associated with Negatively Worded Items: Investigating Factorial Invariance by Sex

    Science.gov (United States)

    DiStefano, Christine; Motl, Robert W.

    2009-01-01

    The Rosenberg Self-Esteem scale (RSE) has been widely used in examinations of sex differences in global self-esteem. However, previous examinations of sex differences have not accounted for method effects associated with item wording, which have consistently been reported by researchers using the RSE. Accordingly, this study examined the…

  19. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a

  20. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.

  1. Purchasing behaviour on aesthetic items in online video games with real currency : The case of Counter Strike: Global Offensive

    OpenAIRE

    Rodríguez, Bruno

    2017-01-01

    Over the last decade, buying in-game content with real money has become a more common practice among players in order to unlock exclusive content in video games. Prior research has mainly focused on those functional digital items that provide an advantage to the buyer. This thesis aims to determine the underlying factors that influence video game players to purchase purely aesthetic virtual items. Prior studies on the field of video games, gaming business models and purchasing behaviour were r...

  2. Dissociating the neural correlates of intra-item and inter-item working-memory binding.

    Directory of Open Access Journals (Sweden)

    Carinne Piekema

    BACKGROUND: Integration of information streams into a unitary representation is an important task of our cognitive system. Within working memory, the medial temporal lobe (MTL) has been conceptually linked to the maintenance of bound representations. In a previous fMRI study, we have shown that the MTL is indeed more active during working-memory maintenance of spatial associations as compared to non-spatial associations or single items. There are two explanations for this result: the mere presence of the spatial component activates the MTL, or the MTL is recruited to bind associations between neurally non-overlapping representations. METHODOLOGY/PRINCIPAL FINDINGS: The current fMRI study investigates this issue further by directly comparing intrinsic intra-item binding (object/colour), extrinsic intra-item binding (object/location), and inter-item binding (object/object). The three binding conditions resulted in differential activation of brain regions. Specifically, we show that the MTL is important for establishing extrinsic intra-item associations and inter-item associations, in line with the notion that binding of information processed in different brain regions depends on the MTL. CONCLUSIONS/SIGNIFICANCE: Our findings indicate that different forms of working-memory binding rely on specific neural structures. In addition, these results extend previous reports indicating that the MTL is implicated in working-memory maintenance, challenging the classic distinction between short-term and long-term memory systems.

  3. Generalizability theory and item response theory

    OpenAIRE

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...

  4. The 4-Item Negative Symptom Assessment (NSA-4) Instrument: A Simple Tool for Evaluating Negative Symptoms in Schizophrenia Following Brief Training.

    Science.gov (United States)

    Alphs, Larry; Morlock, Robert; Coon, Cheryl; van Willigenburg, Arjen; Panagides, John

    2010-07-01

    Objective. To assess the ability of mental health professionals to use the 4-item Negative Symptom Assessment instrument, derived from the Negative Symptom Assessment-16, to rapidly determine the severity of negative symptoms of schizophrenia. Design. Open participation. Setting. Medical education conferences. Participants. Attendees at two international psychiatry conferences. Measurements. Participants read a brief set of the 4-item Negative Symptom Assessment instructions and viewed a videotape of a patient with schizophrenia. Using the 1 to 6 4-item Negative Symptom Assessment severity rating scale, they rated four negative symptom items and the overall global negative symptoms. These ratings were compared with a consensus rating determination using frequency distributions and Chi-square tests for the proportion of participant ratings that were within one point of the expert rating. Results. More than 400 medical professionals (293 physicians, 50% with a European practice, and 55% who reported past utilization of schizophrenia rating scales) participated. Between 82.1 and 91.1 percent of the 4-items and the global rating determinations by the participants were within one rating point of the consensus expert ratings. The differences between the percentage of participant rating scores that were within one point versus the percentage that were greater than one point different from those by the consensus experts was significant (p… negative symptoms using the 4-item Negative Symptom Assessment did not generally differ among the geographic regions of practice, the professional credentialing, or their familiarity with the use of schizophrenia symptom rating instruments. Conclusion. These findings suggest that clinicians from a variety of geographic practices can, after brief training, use the 4-item Negative Symptom Assessment effectively to rapidly assess negative symptoms in patients with schizophrenia.

  5. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (kernel smoothing), implemented with the TestGraf software, to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to determine whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
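
    The kernel-smoothing idea behind nonparametric item characteristic curves can be sketched as follows. This is a minimal illustration and not the TestGraf implementation: the Gaussian kernel, the fixed bandwidth, the helper name `kernel_icc`, and the data are all assumptions made for the example.

```python
import math


def kernel_icc(trait, responses, grid, bandwidth=0.5):
    """Nadaraya-Watson estimate of P(endorse | trait level) at each grid point.

    trait     -- one trait-level proxy per respondent (e.g. a standardized score)
    responses -- 0/1 responses to a single item, same length as trait
    grid      -- trait levels at which the curve is evaluated
    The Gaussian kernel and the bandwidth value are illustrative assumptions.
    """
    curve = []
    for g in grid:
        # Respondents whose trait level is close to g get more weight.
        weights = [math.exp(-0.5 * ((t - g) / bandwidth) ** 2) for t in trait]
        weighted_hits = sum(w * r for w, r in zip(weights, responses))
        curve.append(weighted_hits / sum(weights))
    return curve


# Made-up data: endorsement becomes more frequent as the trait level rises.
trait = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
responses = [0, 0, 0, 1, 0, 1, 1, 1, 1]
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
icc = kernel_icc(trait, responses, grid)
```

    A well-discriminating item in this sense is one whose smoothed curve rises clearly across the grid; a flat curve would indicate that the item does not separate low from high trait levels.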

  6. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  7. The randomly renewed general item and the randomly inspected item with exponential life distribution

    International Nuclear Information System (INIS)

    Schneeweiss, W.G.

    1979-01-01

    For a randomly renewed item the probability distributions of the time to failure and of the duration of down time and the expectations of these random variables are determined. Moreover, it is shown that the same theory applies to randomly checked items with exponential probability distribution of life such as electronic items. The case of periodic renewals is treated as an example. (orig.) [de]

  8. Contributions of physical function and satisfaction with social roles to emotional distress in chronic pain: a Collaborative Health Outcomes Information Registry (CHOIR) study.

    Science.gov (United States)

    Sturgeon, John A; Dixon, Eric A; Darnall, Beth D; Mackey, Sean C

    2015-12-01

    Individuals with chronic pain show greater vulnerability to depression or anger than those without chronic pain, and also show greater interpersonal difficulties and physical disability. The present study examined data from 675 individuals with chronic pain during their initial visits to a tertiary care pain clinic using assessments from Stanford University's Collaborative Health Outcomes Information Registry (CHOIR). Using a path modeling analysis, the mediating roles of Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function and PROMIS Satisfaction with Social Roles and Activities were tested between pain intensity and PROMIS Depression and Anger. Pain intensity significantly predicted both depression and anger, and both physical function and satisfaction with social roles mediated these relationships when modeled in separate 1-mediator models. Notably, however, when modeled together, ratings of satisfaction with social roles mediated the relationship between physical function and both anger and depression. Our results suggest that the process by which chronic pain disrupts emotional well-being involves both physical function and disrupted social functioning. However, the more salient factor in determining pain-related emotional distress seems to be disruption of social relationships rather than global physical impairment. These results highlight the particular importance of social factors to pain-related distress, and highlight social functioning as an important target for clinical intervention in chronic pain.

  9. The Lifespan Self-Esteem Scale: Initial Validation of a New Measure of Global Self-Esteem.

    Science.gov (United States)

    Harris, Michelle A; Donnellan, M Brent; Trzesniewski, Kali H

    2018-01-01

    This article introduces the Lifespan Self-Esteem Scale (LSE), a short measure of global self-esteem suitable for populations drawn from across the lifespan. Many existing measures of global self-esteem cannot be used across multiple developmental periods due to changes in item content, response formats, and other scale characteristics. This creates a need for a new lifespan scale so that changes in global self-esteem over time can be studied without confounding maturational changes with alterations in the measure. The LSE is a 4-item measure with a 5-point response format using items inspired by established self-esteem scales. The scale is essentially unidimensional and internally consistent, and it converges with existing self-esteem measures across ages 5 to 93 (N = 2,714). Thus, the LSE appears to be a useful measure of global self-esteem suitable for use across the lifespan as well as contexts where a short measure is desirable, such as populations with short attention spans or large projects assessing multiple constructs. Moreover, the LSE is one of the first global self-esteem scales to be validated for children younger than age 8, which provides the opportunity to broaden the field to include research on early formation and development of global self-esteem, an area that has previously been limited.

  10. 17 CFR 260.7a-16 - Inclusion of items, differentiation between items and answers, omission of instructions.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 3 2010-04-01 2010-04-01 false Inclusion of items, differentiation between items and answers, omission of instructions. 260.7a-16 Section 260.7a-16 Commodity and... INDENTURE ACT OF 1939 Formal Requirements § 260.7a-16 Inclusion of items, differentiation between items and...

  11. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    Science.gov (United States)

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  12. Approximation Preserving Reductions among Item Pricing Problems

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

    When a store sells items to customers, the store wishes to determine the prices of the items to maximize its profit. Intuitively, if the store sells the items with low (resp. high) prices, the customers buy more (resp. less) items, which provides less profit to the store. So it would be hard for the store to decide the prices of items. Assume that the store has a set V of n items and there is a set E of m customers who wish to buy those items, and also assume that each item i ∈ V has the production cost d_i and each customer e_j ∈ E has the valuation v_j on the bundle e_j ⊆ V of items. When the store sells an item i ∈ V at the price r_i, the profit for the item i is p_i = r_i - d_i. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. In most of the previous works, the item pricing problem was considered under the assumption that p_i ≥ 0 for each i ∈ V; however, Balcan, et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of “loss-leader,” and showed that the seller can get more total profit in the case that p_i < 0 is allowed than in the case that p_i < 0 is not allowed. In this paper, we derive approximation preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratios.
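
    The objective of the item pricing problem can be made concrete with a small sketch. The abstract does not spell out the purchase rule, so the code assumes the common single-minded model: a customer buys the whole bundle when its total price does not exceed the valuation, and supply is unlimited. The function name `total_profit` and all numbers are hypothetical.

```python
def total_profit(prices, costs, customers):
    """Total profit of a price vector under an assumed purchase rule.

    prices, costs -- dicts mapping item -> price r_i and production cost d_i
    customers     -- list of (bundle, valuation) pairs, bundle being a set of items
    Assumption: a customer buys the whole bundle iff its total price is at
    most the valuation; items are available in unlimited supply.
    """
    profit = 0.0
    for bundle, valuation in customers:
        if sum(prices[i] for i in bundle) <= valuation:
            # Per-item profit is p_i = r_i - d_i, summed over the bundle.
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit


# Hypothetical instance with two items and three customers.
costs = {"a": 2.0, "b": 3.0}
customers = [({"a"}, 5.0), ({"a", "b"}, 9.0), ({"b"}, 4.0)]

profit_low = total_profit({"a": 5.0, "b": 4.0}, costs, customers)
profit_high = total_profit({"a": 6.0, "b": 4.0}, costs, customers)
```

    Raising the price of item "a" from 5 to 6 loses two of the three customers in this instance, which illustrates why choosing prices is a genuine optimization problem. Allowing a price below cost (p_i < 0) only changes which price vectors are admissible, not how profit is computed.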

  13. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    Science.gov (United States)

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  14. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  15. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for a handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere, based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.

  16. Item Modeling Concept Based on Multimedia Authoring

    Directory of Open Access Journals (Sweden)

    Janez Stergar

    2008-09-01

    In this paper, a modern item design framework for computer-based assessment built on the Flash authoring environment is introduced. Question design is discussed, with emphasis on the multimedia authoring environment used for item modeling. Item type templates are a structured means of collecting and storing item information that can be used to improve the efficiency and security of the innovative item design process. Templates can modernize item design and enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The introduced item design template is based on a taxonomy of innovative items, which have great potential for expanding the content areas and construct coverage of an assessment. The presented item design approach is based on two GUIs: one for question design based on the implemented item design templates and one for user interaction tracking/retrieval. The concept of user interfaces based on Flash technology is discussed, as well as the implementation of the item design forms with multimedia authoring. An innovative method for user interaction storage/retrieval, based on PHP extending Flash capabilities in the proposed framework, is also introduced.

  17. The medial temporal lobes distinguish between within-item and item-context relations during autobiographical memory retrieval.

    Science.gov (United States)

    Sheldon, Signy; Levine, Brian

    2015-12-01

    During autobiographical memory retrieval, the medial temporal lobes (MTL) relate together multiple event elements, including object (within-item relations) and context (item-context relations) information, to create a cohesive memory. There is consistent support for a functional specialization within the MTL according to these relational processes, much of which comes from recognition memory experiments. In this study, we compared brain activation patterns associated with retrieving within-item relations (i.e., associating conceptual and sensory-perceptual object features) and item-context relations (i.e., spatial relations among objects) with respect to naturalistic autobiographical retrieval. We developed a novel paradigm that cued participants to retrieve information about past autobiographical events, non-episodic within-item relations, and non-episodic item-context relations with the perceptuomotor aspects of retrieval equated across these conditions. We used multivariate analysis techniques to extract common and distinct patterns of activity among these conditions within the MTL and across the whole brain, both in terms of spatial and temporal patterns of activity. The anterior MTL (perirhinal cortex and anterior hippocampus) was preferentially recruited for generating within-item relations later in retrieval whereas the posterior MTL (posterior parahippocampal cortex and posterior hippocampus) was preferentially recruited for generating item-context relations across the retrieval phase. These findings provide novel evidence for functional specialization within the MTL with respect to naturalistic memory retrieval. © 2015 Wiley Periodicals, Inc.

  18. A strategy for optimizing item-pool management

    NARCIS (Netherlands)

    Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

    2006-01-01

    Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item

  19. A psychometric comparison of three scales and a single-item measure to assess sexual satisfaction.

    Science.gov (United States)

    Mark, Kristen P; Herbenick, Debby; Fortenberry, J Dennis; Sanders, Stephanie; Reece, Michael

    2014-01-01

    This study was designed to systematically compare and contrast the psychometric properties of three scales developed to measure sexual satisfaction and a single-item measure of sexual satisfaction. The Index of Sexual Satisfaction (ISS), Global Measure of Sexual Satisfaction (GMSEX), and the New Sexual Satisfaction Scale-Short (NSSS-S) were compared to one another and to a single-item measure of sexual satisfaction. Conceptualization of the constructs, distribution of scores, internal consistency, convergent validity, test-retest reliability, and factor structure were compared between the measures. A total of 211 men and 214 women completed the scales and a measure of relationship satisfaction, with 33% (n = 139) of the sample reassessed two months later. All scales demonstrated appropriate distribution of scores and adequate internal consistency. The GMSEX, NSSS-S, and the single-item measure demonstrated convergent validity. Test-retest reliability was demonstrated by the ISS, GMSEX, and NSSS-S, but not the single-item measure. Taken together, the GMSEX received the strongest psychometric support in this sample for a unidimensional measure of sexual satisfaction and the NSSS-S received the strongest psychometric support in this sample for a bidimensional measure of sexual satisfaction.

  20. NASA: Black soot fuels global warming

    CERN Multimedia

    2003-01-01

    New research from NASA's Goddard Space Center scientists suggests emissions of black soot have been altering the way sunlight reflects off Earth's snow. The research indicates the soot could be responsible for as much as 25 percent of global warming over the past century (assorted news items, 1 paragraph each).

  1. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (Van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
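
    The one-, two- and three-parameter logistic models named in the abstract can be written down in a few lines. This is a sketch in the standard textbook parameterization, with discrimination a, difficulty b and pseudo-guessing c; the function names are illustrative, and the optional scaling constant D = 1.7 used in some formulations is omitted.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Three-parameter logistic item response function.

    P(correct | theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    theta -- ability of the respondent
    a     -- item discrimination
    b     -- item difficulty
    c     -- pseudo-guessing (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_2pl(theta, a, b):
    """Two-parameter model: the 3PL with no guessing (c = 0)."""
    return irf_3pl(theta, a=a, b=b, c=0.0)


def irf_1pl(theta, b):
    """One-parameter model: the 2PL with a common discrimination (here a = 1)."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)


# When ability equals difficulty, the 1PL and 2PL give probability 0.5,
# while the 3PL gives c + (1 - c) / 2.
p_1pl = irf_1pl(0.0, b=0.0)
p_3pl = irf_3pl(1.0, a=2.0, b=1.0, c=0.2)
```

    Writing the simpler models as special cases of the 3PL makes the nesting explicit: each added parameter relaxes one constraint of the model below it.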

  2. Human dietary δ(15)N intake: representative data for principle food items.

    Science.gov (United States)

    Huelsemann, F; Koehler, K; Braun, H; Schaenzer, W; Flenker, U

    2013-09-01

    Dietary analysis using δ(15)N values of human remains such as bone and hair is usually based on general principles and limited data sets. Even for modern humans, the direct ascertainment of dietary δ(15)N is difficult and laborious, due to the complexity of metabolism and nitrogen fractionation, differing dietary habits and variation of δ(15)N values of food items. The objective of this study was to summarize contemporary regional experimental and global literature data to ascertain mean representative δ(15)N values for distinct food categories. A comprehensive data set of more than 12,000 analyzed food samples was summarized from the literature. Data originated from studies dealing with (1) authenticity tracing or origin control of food items, and (2) effects of fertilization or nutrition on δ(15)N values of plants or animals. Regional German food δ(15)N values revealed no major differences compared with the mean global values derived from the literature. We found that, in contrast to other food categories, historical faunal remains of pig and poultry are significantly enriched in (15)N compared to modern samples. This difference may be due to modern industrialized breeding practices. In some food categories variations in agricultural and feeding regimens cause significant differences in δ(15)N values that may lead to misinterpretations when only limited information is available. Copyright © 2013 Wiley Periodicals, Inc.

  3. Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

    Science.gov (United States)

    Arce-Ferrer, Alvaro J.; Bulut, Okan

    2017-01-01

    This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…
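
    For readers unfamiliar with scale transformation coefficients, one standard way to obtain them from common items is the mean/sigma method, sketched below. The truncated abstract does not say which linking method the study uses, so this is a swapped-in illustration; the function names and the difficulty values are made up.

```python
from statistics import mean, stdev


def mean_sigma(b_from, b_to):
    """Mean/sigma linking coefficients (A, B) from common-item difficulties.

    b_from -- difficulty estimates of the common items on the source scale
    b_to   -- difficulty estimates of the same items on the target scale
    The returned coefficients satisfy b_to ~ A * b_from + B.
    """
    A = stdev(b_to) / stdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def rescale_item(a, b, A, B):
    """Place one item's discrimination and difficulty on the target scale."""
    return a / A, A * b + B


# Made-up common-item difficulties from two separate calibrations.
b_from = [-1.0, 0.0, 1.0]
b_to = [-1.5, 0.5, 2.5]
A, B = mean_sigma(b_from, b_to)
a_new, b_new = rescale_item(1.2, 1.0, A, B)
```

    Because A and B depend on the means and standard deviations of the common-item estimates, a single common item whose parameters have drifted can shift both coefficients, which is why drift detection and linking interact.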

  4. ITEM Project: Risk Communication on Exposure to Electromagnetic Radiation from Mobile Communications

    International Nuclear Information System (INIS)

    Oliveira, Carla; Carpinteiro, Goncalo; Correia, Luis M.; Fernandes, Carlos A.; Serralha, Afonso; Marques, Nuno

    2004-01-01

    The ITEM Project is a pioneering project in Portugal, providing public information on exposure to electromagnetic radiation, essentially due to mobile communication systems. The motivation, the main goals and the Project description are presented in this paper, as well as the website that provides the public dissemination of results and further significant information (www.lx.it.pt/item). This site provides information on different issues related to exposure to radiation, namely results of measurement campaigns conducted by a team at several locations in Portugal, and results of continuous measurements performed by autonomous stations located in public places in collaboration with municipal authorities. The global overview of the results from the measurement campaigns carried out to date shows that all the analysed locations are in compliance with the radiation thresholds, i.e., all the measured electric field values are below the most restrictive threshold established at European level. (author)

  5. Dutch Patient-Generated Subjective Global Assessment (PG-SGA): training improves scores for comprehensibility and difficulty

    NARCIS (Netherlands)

    Danique Haven; Martine J. Sealy; Jan Roodenburg; Dr. C.P. van der Schans; Dr. Harriët Jager-Wittenaar; Anne van der Braak; Faith Ottery

    2015-01-01

    The Patient-Generated Subjective Global Assessment (PG-SGA) is a validated instrument to assess and monitor malnutrition. The PG-SGA consists of both patient-reported and professional-reported items. A professional should be able to correctly interpret all items. Untrained professionals may

  6. 76 FR 60474 - Commercial Item Handbook

    Science.gov (United States)

    2011-09-29

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System Commercial Item Handbook AGENCY.... SUMMARY: DoD has updated its Commercial Item Handbook. The purpose of the Handbook is to help acquisition personnel develop sound business strategies for procuring commercial items. DoD is seeking industry input on...

  7. Spare Items validation

    International Nuclear Information System (INIS)

    Fernandez Carratala, L.

    1998-01-01

    It is increasingly difficult to purchase safety-related spare items with manufacturer certifications that maintain the original qualification of the destination equipment. The main reasons, in addition to the logical evolution of the technology applied to newly manufactured components, are the discontinuation of nuclear-specific production lines and the evolution of manufacturers' quality systems from nuclear codes and standards to conventional industry standards. To face this problem, different dedication processes have been implemented for many years to verify whether a commercial-grade element is acceptable for use in safety-related applications. In the same way, due to our particular position regarding spare part supplies, mainly from markets other than the American one, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are performed on the spare item itself, its design control, its fabrication and its supply, to allow its use in destinations with specific requirements. The scope of application is not limited to safety-related items; it also covers complex-design, high-cost or plant-reliability-related components. The implementation at C.N. Trillo has mainly been carried out by merging, modifying and making the most of processes and activities that were already being performed in the company. (Author)

  8. Item Analysis in Introductory Economics Testing.

    Science.gov (United States)

    Tinari, Frank D.

    1979-01-01

    Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)
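
    The abstract does not list the statistics it computes. A minimal sketch of two commonly used classical item statistics, the difficulty index (proportion correct) and an item-rest correlation as a discrimination index, is given below; the function names and the scored responses are assumptions for illustration.

```python
def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def item_analysis(scored):
    """Per-item difficulty and discrimination from a 0/1 scored response matrix.

    scored -- list of rows, one per examinee, each a list of 0/1 item scores
    difficulty     = proportion answering correctly (higher means easier)
    discrimination = correlation of the item with the score on the other items
    """
    results = []
    for j in range(len(scored[0])):
        item = [row[j] for row in scored]
        rest = [sum(row) - row[j] for row in scored]
        results.append({
            "difficulty": sum(item) / len(item),
            "discrimination": pearson(item, rest),
        })
    return results


# Made-up scored responses: six examinees, three multiple-choice items.
scored = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
stats = item_analysis(scored)
```

    Items with a very high or very low difficulty index, or with a discrimination near zero or negative, are the usual candidates for revision.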

  9. A review of the effects on IRT item parameter estimates with a focus on misbehaving common items in test equating

    Directory of Open Access Journals (Sweden)

    Michalis P Michaelides

    2010-10-01

    Full Text Available Many studies have investigated the topic of change or drift in item parameter estimates in the context of Item Response Theory. Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  10. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

    Science.gov (United States)

    Michaelides, Michalis P

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  11. A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

    Science.gov (United States)

    Khawaja, Nigar G.; Yu, Lai Ngo Heidi

    2010-01-01

    The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…

  12. Instruments to measure anxiety in children, adolescents, and young adults with cancer: a systematic review.

    Science.gov (United States)

    Lazor, Tanya; Tigelaar, Leonie; Pole, Jason D; De Souza, Claire; Tomlinson, Deborah; Sung, Lillian

    2017-09-01

    The primary objective was to describe anxiety measurement instruments used in children and adolescents with cancer or undergoing hematopoietic stem cell transplantation (HSCT) and summarize their content and psychometric properties. We conducted searches of MEDLINE, Embase, PsycINFO, HAPI, and CINAHL. We included studies that used at least one instrument to measure anxiety quantitatively in children or adolescents with cancer or undergoing HSCT. Two authors independently identified studies and abstracted study demographics and instrument characteristics. Twenty-seven instruments, 14 multi-item and 13 single-item, were used across 78 studies. The most commonly used instrument was the State-Trait Anxiety Inventory, used in 46 studies. Three multi-item instruments (Children's Manifest Anxiety Scale-Mandarin version, PROMIS Pediatric Anxiety Short Form, and the State-Trait Anxiety Inventory) and two single-item instruments (Faces Pain Scale-Revised and 10-cm Visual Analogue Scale, both adapted for anxiety) were found to be reliable and valid in children with cancer. We identified 14 different multi-item and 13 different single-item anxiety measurement instruments that have been used in pediatric cancer or HSCT. Only three multi-item and two single-item instruments were identified as being reliable and valid among pediatric cancer or HSCT patients and would therefore be appropriate to measure anxiety in this population.

  13. More is not Always Better: The Relation between Item Response and Item Response Time in Raven’s Matrices

    Directory of Open Access Journals (Sweden)

    Frank Goldhammer

    2015-03-01

    Full Text Available The role of response time in completing an item can have very different interpretations. Responding more slowly could be positively related to success as the item is answered more carefully. However, the association may be negative if working faster indicates higher ability. The objective of this study was to clarify the validity of each assumption for reasoning items considering the mode of processing. A total of 230 persons completed a computerized version of Raven’s Advanced Progressive Matrices test. Results revealed that response time overall had a negative effect. However, this effect was moderated by items and persons. For easy items and able persons the effect was strongly negative, for difficult items and less able persons it was less negative or even positive. The number of rules involved in a matrix problem proved to explain item difficulty significantly. Most importantly, a positive interaction effect between the number of rules and item response time indicated that the response time effect became less negative with an increasing number of rules. Moreover, exploratory analyses suggested that the error type influenced the response time effect.

  14. Negative effects of item repetition on source memory.

    Science.gov (United States)

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L; Johnson, Marcia K

    2012-08-01

    In the present study, we explored how item repetition affects source memory for new item-feature associations (picture-location or picture-color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item repetition also had a negative effect on source memory when different source dimensions were used in Phases 1 and 2 (Experiment 3) and when participants were explicitly instructed to learn source information in Phase 2 (Experiments 4 and 5). Importantly, when the order between Phases 1 and 2 was reversed, such that item repetition occurred after the encoding of critical item-source combinations, item repetition no longer affected source memory (Experiment 6). Overall, our findings did not support predictions based on item predifferentiation, within-dimension source interference, or general interference from multiple traces of an item. Rather, the findings were consistent with the idea that prior item repetition reduces attention to subsequent presentations of the item, decreasing the likelihood that critical item-source associations will be encoded.

  15. Psychometric Consequences of Subpopulation Item Parameter Drift

    Science.gov (United States)

    Huggins-Manley, Anne Corinne

    2017-01-01

    This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

  16. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    Science.gov (United States)

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  17. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning.

    Science.gov (United States)

    Watt, Torquil; Groenvold, Mogens; Hegedüs, Laszlo; Bonnema, Steen Joop; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

    2014-02-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis. A total of 838 patients with benign thyroid diseases completed the ThyPRO questionnaire (84 five-point items, 13 scales). Uniform and nonuniform DIF were investigated using ordinal logistic regression, testing for both statistical significance and magnitude (ΔR² > 0.02). Scale level was estimated by the sum score, after purification. Twenty instances of DIF in 17 of the 84 items were found. Eight according to diagnosis, where the goiter scale was the one most affected, possibly due to differing perceptions in patients with auto-immune thyroid diseases compared to patients with simple goiter. Eight DIFs according to age were found, of which 5 were in positively worded items, which younger patients were more likely to endorse; one according to gender: women were more likely to report crying, and three according to educational level. The vast majority of DIF had only minor influence on the scale scores (0.1-2.3 points on the 0-100 scales), but two DIF corresponded to a difference of 4.6 and 9.8, respectively. Ordinal logistic regression identified DIF in 17 of 84 items. The potential impact of this on the present scales was low, but items displaying DIF could be avoided when developing abbreviated scales, where the potential impact of DIF (due to fewer items) will be larger.

  18. Loglinear multidimensional IRT models for polytomously scored Items

    NARCIS (Netherlands)

    Kelderman, Henk

    1988-01-01

    A loglinear item response theory (IRT) model is proposed that relates polytomously scored item responses to a multidimensional latent space. Each item may have a different response function where each item response may be explained by one or more latent traits. Item response functions may follow a

  19. 48 CFR 852.214-72 - Alternate item(s).

    Science.gov (United States)

    2010-10-01

    ... AND FORMS SOLICITATION PROVISIONS AND CONTRACT CLAUSES Texts of Provisions and Clauses 852.214-72... 2008) Bids on []* will be given equal consideration along with bids on []** and any such bids received... [].** * Contracting officer will insert an alternate item that is considered acceptable. ** Contracting officer will...

  20. Macrostructural Treatment of Multi-word Lexical Items

    Directory of Open Access Journals (Sweden)

    Alenka Vrbinc

    2011-05-01

    Full Text Available The paper discusses the macrostructural treatment of multi-word lexical items in mono- and bilingual dictionaries. First, the classification of multi-word lexical items is presented, and special attention is paid to the discussion of compounds – a specific group of multi-word lexical items that is most commonly afforded headword status but whose inclusion in the headword list may also depend on spelling. Then the inclusion of multi-word lexical items in monolingual dictionaries is dealt with in greater detail, while the results of a short survey on the inclusion of five randomly chosen multi-word lexical items in seven English monolingual dictionaries are presented. The proposals as to how to treat these five multi-word lexical items in bilingual dictionaries are presented in the section about the inclusion of multi-word lexical items in bilingual dictionaries. The conclusion is that it is most important to take the users’ needs into consideration and to make any dictionary as user friendly as possible.

  1. Losing Items in the Psychogeriatric Nursing Home

    Directory of Open Access Journals (Sweden)

    J. van Hoof PhD

    2016-09-01

    Full Text Available Introduction: Losing items is a time-consuming occurrence in nursing homes that is ill described. An explorative study was conducted to investigate which items got lost by nursing home residents, and how this affects the residents and family caregivers. Method: Semi-structured interviews and card sorting tasks were conducted with 12 residents with early-stage dementia and 12 family caregivers. Thematic analysis was applied to the outcomes of the sessions. Results: The participants stated that numerous personal items and assistive devices get lost in the nursing home environment, which had various emotional, practical, and financial implications. Significant amounts of time are spent on trying to find items, varying from 1 hr up to a couple of weeks. Numerous potential solutions were identified by the interviewees. Discussion: Losing items often goes together with limitations to the participation of residents. Many family caregivers are reluctant to replace lost items, as these items may get lost again.

  2. ‘Forget me (not)?’ – Remembering forget-items versus un-cued items in directed forgetting

    Directory of Open Access Journals (Sweden)

    Bastian Zwissler

    2015-11-01

    Full Text Available Humans need to be able to selectively control their memories. Here, we investigate the underlying processes in item-method directed forgetting and compare the classic active memory cues in this paradigm with a passive instruction. Typically, individual items are presented and each is followed by either a forget- or remember-instruction. On a surprise test of all items, memory is then worse for to-be-forgotten items (TBF) compared to to-be-remembered items (TBR). This is thought to result from selective rehearsal of TBR, or from active inhibition of TBF, or from both. However, evidence suggests that if a forget instruction initiates active processing, paradoxical effects may also arise. To investigate the underlying mechanisms, four experiments were conducted where un-cued items (UI) were introduced and recognition performance was compared between TBR, TBF and UI stimuli. Accuracy was encouraged via a performance-dependent monetary bonus. Across all experiments, including perceptually fully matched variants, memory accuracy for TBF was reduced compared to TBR, but better than for UI. Moreover, participants used a more conservative response criterion when responding to TBF stimuli. Thus, ironically, the F cue results in active processing, but this does not have inhibitory effects that would impair recognition memory beyond an un-cued baseline condition. This casts doubts on inhibitory accounts of item-method directed forgetting and is also difficult to reconcile with pure selective rehearsal of TBR. While the F-cue does induce active processing, this does not result in particularly successful forgetting. The pattern seems most consistent with the notion of ironic processing.

  3. Irrational Delay Revisited: Examining Five Procrastination Scales in a Global Sample.

    Science.gov (United States)

    Svartdal, Frode; Steel, Piers

    2017-01-01

    Scales attempting to measure procrastination focus on different facets of the phenomenon, yet they share a common understanding of procrastination as an unnecessary, unwanted, and disadvantageous delay. The present paper examines in a global sample (N = 4,169) five different procrastination scales - Decisional Procrastination Scale (DPS), Irrational Procrastination Scale (IPS), Pure Procrastination Scale (PPS), Adult Inventory of Procrastination Scale (AIP), and General Procrastination Scale (GPS), focusing on factor structures and item functioning using Confirmatory Factor Analysis and Item Response Theory. The results indicated that the PPS (12 items selected from DPS, AIP, and GPS) measures different facets of procrastination even better than the three scales it is based on. An even shorter version of the PPS (5 items focusing on irrational delay) corresponds well to the nine-item IPS. Both scales demonstrate good psychometric properties and appear to be better measures of core procrastination attributes than alternative procrastination scales.

  4. Do animals and furniture items elicit different brain responses in human infants?

    Science.gov (United States)

    Jeschonek, Susanna; Marinovic, Vesna; Hoehl, Stefanie; Elsner, Birgit; Pauen, Sabina

    2010-11-01

    One of the earliest categorical distinctions to be made by preverbal infants is the animate-inanimate distinction. To explore the neural basis for this distinction in 7-8-month-olds, an equal number of animal and furniture pictures was presented in an ERP-paradigm. The total of 118 pictures, all looking different from each other, were presented in a semi-randomized order for 1000ms each. Infants' brain responses to exemplars from both categories differed systematically regarding the negative central component (Nc: 400-600ms) at anterior channels. More specifically, the Nc was enhanced for animals in one subgroup of infants, and for furniture items in another subgroup of infants. Explorative analyses related to categorical priming further revealed category-specific differences in brain responses in the late time window (650-1550ms) at right frontal channels: Unprimed stimuli (preceded by a different-category item) elicited a more positive response as compared to primed stimuli (preceded by a same-category item). In sum, these findings suggest that the infant's brain discriminates exemplars from both global domains. Given the design of our task, we conclude that processes of category identification are more likely to account for our findings than processes of on-line category formation during the experimental session. Copyright © 2009 Elsevier B.V. All rights reserved.

  5. Item selection via Bayesian IRT models.

    Science.gov (United States)

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
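
    To make the graded response model mentioned above concrete, the sketch below computes category probabilities for one item in the standard logistic form of the model: the probability of responding in category k or higher is a logistic function of the latent trait, and category probabilities are differences of adjacent cumulative curves. The parameter names (`theta`, `a`, `thresholds`) and the example values are assumptions for illustration; the mixture prior on item parameters and the Bayesian estimation described in the abstract are not shown.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities P(X = k | theta), k = 0..K, for one item
    under the logistic graded response model (illustrative sketch).

    theta:      latent trait value of the respondent
    a:          item discrimination (assumed positive)
    thresholds: K ordered difficulty (threshold) parameters b_1 <= ... <= b_K
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in non-decreasing order")

    # Cumulative curves P(X >= k | theta); P(X >= 0) = 1 and P(X >= K + 1) = 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical four-category item evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

    In this parameterization, items with similar discrimination and threshold values give nearly identical response curves, which is the sense in which items in the same mixture component could be treated as interchangeable when shortening a questionnaire.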

  6. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

    Directory of Open Access Journals (Sweden)

    Knol Dirk L

    2011-09-01

    Full Text Available Background: For the Low Vision Quality Of Life questionnaire (LVQOL), it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods: Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results: All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions: The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.

  7. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  8. Software Note: Using BILOG for Fixed-Anchor Item Calibration

    Science.gov (United States)

    DeMars, Christine E.; Jurich, Daniel P.

    2012-01-01

    The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…

  9. Inventions on presenting textual items in Graphical User Interface

    OpenAIRE

    Mishra, Umakant

    2014-01-01

    Although a GUI largely replaces textual descriptions by graphical icons, the textual items are not completely removed. The textual items are inevitably used in window titles, message boxes, help items, menu items and popup items. Textual items are necessary for communicating messages that are beyond the limitation of graphical messages. However, it is necessary to harness the textual items on the graphical interface in such a way that they complement each other to produce the best effect. One...

  10. Most efficient questionnaires to measure quality of life, physical function, and pain in patients with metastatic spine disease: a cross-sectional prospective survey study.

    Science.gov (United States)

    Paulino Pereira, Nuno Rui; Janssen, Stein J; Raskin, Kevin A; Hornicek, Francis J; Ferrone, Marco L; Shin, John H; Bramer, Jos A M; van Dijk, Cornelis Nicolaas; Schwab, Joseph H

    2017-07-01

    Assessing quality of life, functional outcome, and pain has become important in assessing the effectiveness of treatment for metastatic spine disease. Many questionnaires are able to measure these outcomes; few are validated in patients with metastatic spine disease. As a result, there is no consensus on the ideal questionnaire to use in these patients. Our study aim was to assess whether certain questionnaires measuring quality of life, functional outcome, and pain (1) correlated with each other, (2) measured the construct they claim to measure, (3) had good coverage (floor and ceiling effects), (4) were reliable, and (5) whether there were differences in completion time between them. This is a prospective cross-sectional survey study from three outpatient clinics (two orthopedic oncology clinics and one neurosurgery clinic) from two affiliated tertiary hospital care centers. We included 100 consecutive patients with metastatic spine disease between July 2014 and February 2016. We excluded non-English-speaking patients. The following questionnaires were given in random order: Oswestry Disability Index (ODI) or Neck Disability Index (NDI), Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function, PROMIS Pain Intensity, EuroQol-5 Dimensions (EQ-5D), and the Spine Oncology Study Group Outcome Questionnaire (SOSG-OQ). We used exploratory factor analysis (correlating questionnaires with an underlying mathematically derived trait) to assess if questionnaires measured the same concept. Coverage was assessed by floor and ceiling effects, and reliability was assessed by standard error of measurement as a function of ability. Differences in completion times were tested using the Friedman test. Questionnaires measured the construct they were developed for, as demonstrated with high correlations (>0.7) with the underlying trait. A floor effect was present in the PROMIS Pain Intensity (7.0%), ODI or NDI (4.0%), and the PROMIS Physical Function (1

  11. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  12. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  13. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  14. Feed mechanism and method for feeding minute items

    Science.gov (United States)

    Stringer, Timothy Kent [Bucyrus, KS; Yerganian, Simon Scott [Lee's Summit, MO

    2009-10-20

    A feeding mechanism and method for feeding minute items, such as capacitors, resistors, or solder preforms. The mechanism is adapted to receive a plurality of the randomly-positioned and randomly-oriented extremely small or minute items, and to isolate, orient, and position one or more of the items in a specific repeatable pickup location wherefrom they may be removed for use by, for example, a computer-controlled automated assembly machine. The mechanism comprises a sliding shelf adapted to receive and support the items; a wiper arm adapted to achieve a single even layer of the items; and a pushing arm adapted to push the items into the pickup location. The mechanism can be adapted for providing the items with a more exact orientation, and can also be adapted for use in a liquid environment.

  15. Memory for Items and Relationships among Items Embedded in Realistic Scenes: Disproportionate Relational Memory Impairments in Amnesia

    Science.gov (United States)

    Hannula, Deborah E.; Tranel, Daniel; Allen, John S.; Kirchhoff, Brenda A.; Nickel, Allison E.; Cohen, Neal J.

    2014-01-01

    Objective: The objective of this study was to examine the dependence of item memory and relational memory on medial temporal lobe (MTL) structures. Patients with amnesia, who either had extensive MTL damage or damage that was relatively restricted to the hippocampus, were tested, as was a matched comparison group. Disproportionate relational memory impairments were predicted for both patient groups, and those with extensive MTL damage were also expected to have impaired item memory. Method: Participants studied scenes, and were tested with interleaved two-alternative forced-choice probe trials. Probe trials were either presented immediately after the corresponding study trial (lag 1), five trials later (lag 5), or nine trials later (lag 9) and consisted of the studied scene along with a manipulated version of that scene in which one item was replaced with a different exemplar (item memory test) or was moved to a new location (relational memory test). Participants were to identify the exact match of the studied scene. Results: As predicted, patients were disproportionately impaired on the test of relational memory. Item memory performance was marginally poorer among patients with extensive MTL damage, but both groups were impaired relative to matched comparison participants. Impaired performance was evident at all lags, including the shortest possible lag (lag 1). Conclusions: The results are consistent with the proposed role of the hippocampus in relational memory binding and representation, even at short delays, and suggest that the hippocampus may also contribute to successful item memory when items are embedded in complex scenes. PMID: 25068665

  16. Applying Hierarchical Model Calibration to Automatically Generated Items.

    Science.gov (United States)

    Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I.

    This study explored the application of hierarchical model calibration as a means of reducing, if not eliminating, the need for pretesting of automatically generated items from a common item model prior to operational use. Ultimately the successful development of automatic item generation (AIG) systems capable of producing items with highly similar…

  17. 41 CFR 101-27.404 - Review of items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Review of items. 101-27.404 Section 101-27.404 Public Contracts and Property Management Federal Property Management...-Elimination of Items From Inventory § 101-27.404 Review of items. Except for standby or reserve stocks, items...

  18. Towards an authoring system for item construction

    NARCIS (Netherlands)

    Rikers, Jos H.A.N.

    1988-01-01

    The process of writing test items is analyzed, and a blueprint is presented for an authoring system for test item writing to reduce invalidity and to structure the process of item writing. The developmental methodology is introduced, and the first steps in the process are reported. A historical

  19. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    Science.gov (United States)

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  20. 10 CFR 835.605 - Labeling items and containers.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Labeling items and containers. 835.605 Section 835.605... items and containers. Except as provided at § 835.606, each item or container of radioactive material... information to permit individuals handling, using, or working in the vicinity of the items or containers to...

  1. Development of the movement domain in the global body examination.

    Science.gov (United States)

    Kvåle, Alice; Bunkan, Berit Heir; Opjordsmoen, Stein; Friis, Svein

    2012-01-01

    The purpose of this study was to develop a new Movement domain, based on 16 items from the Global Physiotherapy Examination-52 (GPE-52) and 18 items from the Comprehensive Body Examination (CBE). Furthermore, we examined how well the new domain and its scales would discriminate between healthy individuals and different groups of patients, compared to the original methods. Two physiotherapists, each using one method, independently examined 132 individuals (34 healthy, 32 with localized pain, 32 with generalized pain, and 34 with psychoses). The number of items was reduced by means of correlational and exploratory factor analysis. Internal consistency was examined with Cronbach's alpha. For examination of discriminative validity, Mann-Whitney U-test and Area under the Curve (AUC) were used. The initial 34 items were reduced to two subscales with 13 items: one for range of movement and balance and one for flexibility. Cronbach's alpha was 0.84 and 0.87 for the two subscales. The new subscales showed very good to excellent discriminating ability between healthy persons and the different patient groups (p movement aberrations than the other patient groups. The new Movement domain had fewer items than the GPE-52 and CBE, without losing discriminative validity.

  2. Obtaining a Proportional Allocation by Deleting Items

    NARCIS (Netherlands)

    Dorn, B.; de Haan, R.; Schlotter, I.; Röthe, J.

    2017-01-01

    We consider the following control problem on fair allocation of indivisible goods. Given a set I of items and a set of agents, each having strict linear preference over the items, we ask for a minimum subset of the items whose deletion guarantees the existence of a proportional allocation in the

  3. Item-Based Top-N Recommendation Algorithms

    Science.gov (United States)

    2003-01-20

    basket of items, utilized by many e-commerce sites, cannot take advantage of pre-computed user-to-user similarities. Finally, even though the...not discriminate between items that are present in frequent itemsets and items that are not, while still maintaining the computational advantages of...453219 0.02% 7.74 ccard 42629 68793 398619 0.01% 9.35 ecommerce 6667 17491 91222 0.08% 13.68 em 8002 1648 769311 5.83% 96.14 ml 943 1682 100000 6.31

  4. Normative data for the 12 item WHO Disability Assessment Schedule 2.0.

    Directory of Open Access Journals (Sweden)

    Gavin Andrews

    BACKGROUND: The World Health Organization Disability Assessment Schedule (WHODAS 2.0) measures disability due to health conditions including diseases, illnesses, injuries, mental or emotional problems, and problems with alcohol or drugs. METHOD: The 12 Item WHODAS 2.0 was used in the second Australian Survey of Mental Health and Well-being. We report the overall factor structure and the distribution of scores and normative data (means and SDs) for people with any physical disorder, any mental disorder and for people with neither. FINDINGS: A single second order factor justifies the use of the scale as a measure of global disability. People with mental disorders had high scores (mean 6.3, SD 7.1), people with physical disorders had lower scores (mean 4.3, SD 6.1). People with no disorder covered by the survey had low scores (mean 1.4, SD 3.6). INTERPRETATION: The provision of normative data from a population sample of adults will facilitate use of the WHODAS 2.0 12 item scale in clinical and epidemiological research.

  5. A Review of Classical Methods of Item Analysis.

    Science.gov (United States)

    French, Christine L.

    Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…
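    As a rough illustration of two of the characteristics named above, the sketch below computes classical item difficulty (the proportion answering correctly) and an upper-lower discrimination index. The data, the function names, and the 27% group fraction are assumptions chosen for illustration and are not taken from the paper; distractor analysis is not shown.

```python
def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (classical p-value)."""
    return sum(row[item] for row in responses) / len(responses)


def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower D index: p(correct) in the top group minus p(correct) in the bottom group.

    The 27% group size is a common convention, used here as an assumption.
    """
    ranked = sorted(responses, key=sum)              # ascending total score
    k = max(1, round(len(ranked) * fraction))        # size of each extreme group
    lower, upper = ranked[:k], ranked[-k:]
    p_upper = sum(r[item] for r in upper) / k
    p_lower = sum(r[item] for r in lower) / k
    return p_upper - p_lower


# Hypothetical data: 6 examinees (rows) x 3 dichotomous items (columns)
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]

for j in range(3):
    print(j, round(item_difficulty(scores, j), 3), discrimination_index(scores, j))
```

    In this toy data set, item 0 is easy (5 of 6 correct) and separates the extreme groups less sharply than items 1 and 2.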

  6. Electronics. Criterion-Referenced Test (CRT) Item Bank.

    Science.gov (United States)

    Davis, Diane, Ed.

    This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…

  7. 26 CFR 301.6501(o)-3 - Partnership items.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 18 2010-04-01 2010-04-01 false Partnership items. 301.6501(o)-3 Section 301... § 301.6501(o)-3 Partnership items. (a) Partnership item defined. For purposes of section 6501(o) (as it..., and § 301.6511(g)-1, the term “partnership item” means— (1) Any item required to be taken into account...

  8. Tree-Based Global Model Tests for Polytomous Rasch Models

    Science.gov (United States)

    Komboz, Basil; Strobl, Carolin; Zeileis, Achim

    2018-01-01

    Psychometric measurement models are only valid if measurement invariance holds between test takers of different groups. Global model tests, such as the well-established likelihood ratio (LR) test, are sensitive to violations of measurement invariance, such as differential item functioning and differential step functioning. However, these…

  9. A Balance Sheet for Educational Item Banking.

    Science.gov (United States)

    Hiscox, Michael D.

    Educational item banking presents observers with a considerable paradox. The development of test items from scratch is viewed as wasteful, a luxury in times of declining resources. On the other hand, item banking has failed to become a mature technology despite large amounts of money and the efforts of talented professionals. The question of which…

  10. Promoting cold-start items in recommender systems.

    Science.gov (United States)

    Liu, Jin-Hu; Zhou, Tao; Zhang, Zi-Ke; Yang, Zimo; Liu, Chuang; Li, Wei-Min

    2014-01-01

    As one of the major challenges, cold-start problem plagues nearly all recommender systems. In particular, new items will be overlooked, impeding the development of new products online. Given limited resources, how to utilize the knowledge of recommender systems and design efficient marketing strategy for new items is extremely important. In this paper, we convert this ticklish issue into a clear mathematical problem based on a bipartite network representation. Under the most widely used algorithm in real e-commerce recommender systems, the so-called item-based collaborative filtering, we show that to simply push new items to active users is not a good strategy. Interestingly, experiments on real recommender systems indicate that to connect new items with some less active users will statistically yield better performance, namely, these new items will have more chance to appear in other users' recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to such observation. In a word, getting in-depth understanding on recommender systems could pave the way for the owners to popularize their cold-start products with low costs.
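    The item-based collaborative filtering mentioned in this abstract can be sketched on a small user-item bipartite network. The cosine similarity, the scoring rule, and all names and data below are illustrative assumptions, not the authors' exact formulation.

```python
from math import sqrt


def item_similarity(users_by_item, a, b):
    """Cosine similarity between two items, each represented by its set of users."""
    ua, ub = users_by_item[a], users_by_item[b]
    if not ua or not ub:
        return 0.0
    return len(ua & ub) / sqrt(len(ua) * len(ub))


def recommend(items_by_user, user, top_n=2):
    """Rank items the user has not collected by summed similarity to the user's items."""
    # Invert the user -> items map into an item -> users map.
    users_by_item = {}
    for u, items in items_by_user.items():
        for i in items:
            users_by_item.setdefault(i, set()).add(u)

    owned = items_by_user[user]
    scores = {
        candidate: sum(item_similarity(users_by_item, candidate, i) for i in owned)
        for candidate in users_by_item
        if candidate not in owned
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical purchase data; "D" plays the role of a cold-start item with a single link.
purchases = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"B", "C"},
    "u4": {"D"},
    "u5": {"A"},
}

print(recommend(purchases, "u5"))  # ['B', 'C']
```

    Item D shares no users with anything u5 owns, so its score is zero and it ranks last. This is the cold-start effect the abstract describes: an item with few links rarely reaches other users' recommendation lists.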

  11. Promoting Cold-Start Items in Recommender Systems

    Science.gov (United States)

    Liu, Jin-Hu; Zhou, Tao; Zhang, Zi-Ke; Yang, Zimo; Liu, Chuang; Li, Wei-Min

    2014-01-01

    As one of the major challenges, cold-start problem plagues nearly all recommender systems. In particular, new items will be overlooked, impeding the development of new products online. Given limited resources, how to utilize the knowledge of recommender systems and design efficient marketing strategy for new items is extremely important. In this paper, we convert this ticklish issue into a clear mathematical problem based on a bipartite network representation. Under the most widely used algorithm in real e-commerce recommender systems, the so-called item-based collaborative filtering, we show that to simply push new items to active users is not a good strategy. Interestingly, experiments on real recommender systems indicate that to connect new items with some less active users will statistically yield better performance, namely, these new items will have more chance to appear in other users' recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to such observation. In a word, getting in-depth understanding on recommender systems could pave the way for the owners to popularize their cold-start products with low costs. PMID:25479013

  12. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    Science.gov (United States)

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  13. Negative affect impairs associative memory but not item memory.

    Science.gov (United States)

    Bisby, James A; Burgess, Neil

    2013-12-17

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 demonstrated that item memory was facilitated by emotional affect, whereas memory for an associated context was reduced. In Experiment 2, arousal was manipulated independently of the memoranda, by a threat of shock, whereby encoding trials occurred under conditions of threat or safety. Memory for context was equally impaired by the presence of negative affect, whether induced by threat of shock or a negative item, relative to retrieval of the context of a neutral item in safety. In Experiment 3, participants were presented with neutral and negative items as paired associates, including all combinations of neutral and negative items. The results showed both above effects: compared to a neutral item, memory for the associate of a negative item (a second item here, context in Experiments 1 and 2) is impaired, whereas retrieval of the item itself is enhanced. Our findings suggest that negative affect impairs associative memory while recognition of a negative item is enhanced. They support dual-processing models in which negative affect or stress impairs hippocampal-dependent associative memory while the storage of negative sensory/perceptual representations is spared or even strengthened.

  14. Non-ignorable missingness item response theory models for choice effects in examinee-selected items.

    Science.gov (United States)

    Liu, Chen-Wei; Wang, Wen-Chung

    2017-11-01

    Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.

  15. A Case Study on an Item Writing Process: Use of Test Specifications, Nature of Group Dynamics, and Individual Item Writers' Characteristics

    Science.gov (United States)

    Kim, Jiyoung; Chi, Youngshin; Huensch, Amanda; Jun, Heesung; Li, Hongli; Roullion, Vanessa

    2010-01-01

    This article discusses a case study on an item writing process that reflects on our practical experience in an item development project. The purpose of the article is to share our lessons from the experience aiming to demystify item writing process. The study investigated three issues that naturally emerged during the project: how item writers use…

  16. Automated Item Generation with Recurrent Neural Networks.

    Science.gov (United States)

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google brain and Amazon Alexa use for language processing and generation.

  17. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  18. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  19. Does remembering emotional items impair recall of same-emotion items?

    Science.gov (United States)

    Sison, Jo Ann G; Mather, Mara

    2007-04-01

    In the part-set cuing effect, cuing a subset of previously studied items impairs recall of the remaining noncued items. This experiment reveals that cuing participants with previously-studied emotional pictures (e.g., fear-evoking pictures of people) can impair recall of pictures involving the same emotion but different content (e.g., fear-evoking pictures of animals). This indicates that new events can be organized in memory using emotion as a grouping function to create associations. However, whether new information is organized in memory along emotional or nonemotional lines appears to be a flexible process that depends on people's current focus. Mentioning in the instructions that the pictures were either amusement- or fear-related led to memory impairment for pictures with the same emotion as cued pictures, whereas mentioning that the pictures depicted either animals or people led to memory impairment for pictures with the same type of actor.

  20. Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

    Science.gov (United States)

    Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

    2012-01-01

    Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

  1. Item Information in the Rasch Model

    NARCIS (Netherlands)

    Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

    1988-01-01

    Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling

  2. Work ability as prognostic risk marker of disability pension : Single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; Rhenen, van W.; Groothoff, J.W.; Klink, van der J.J.L.; Twisk, W.R.; Heymans, M.W.

    2014-01-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP.

  3. Memory deficit in patients with schizophrenia and posttraumatic stress disorder: relational vs item-specific memory

    Directory of Open Access Journals (Sweden)

    Jung W

    2016-05-01

    Wookyoung Jung (1), Seung-Hwan Lee (1, 2). (1) Clinical Emotions and Cognition Research Laboratory, Department of Psychiatry, Inje University, Ilsan-Paik Hospital; (2) Department of Psychiatry, Inje University, Ilsan-Paik Hospital, Goyang, Korea. Abstract: It has been well established that patients with schizophrenia have impairments in cognitive functioning and also that patients who experienced traumatic events suffer from cognitive deficits. Of the cognitive deficits revealed in schizophrenia or posttraumatic stress disorder (PTSD) patients, the current article provides a brief review of deficit in episodic memory, which is highly predictive of patients’ quality of life and global functioning. In particular, we have focused on studies that compared relational and item-specific memory performance in schizophrenia and PTSD, because measures of relational and item-specific memory are considered the most promising constructs for immediate tangible development of clinical trial paradigm. The behavioral findings of schizophrenia are based on the tasks developed by the Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia (CNTRICS) initiative and the Cognitive Neuroscience Test Reliability and Clinical Applications for Schizophrenia (CNTRACS) Consortium. The findings we reviewed consistently showed that schizophrenia and PTSD are closely associated with more severe impairments in relational memory compared to item-specific memory. Candidate brain regions involved in relational memory impairment in schizophrenia and PTSD are also discussed. Keywords: schizophrenia, posttraumatic stress disorder, episodic memory deficit, relational memory, item-specific memory, prefrontal cortex, hippocampus

  4. CERN Running Club – Sale of Items

    CERN Multimedia

    CERN Running club

    2018-01-01

    The CERN Running Club is organising a sale of items on 26 June from 11:30 to 13:00 in the entry area of Restaurant 2 (504 R-202). The items for sale are souvenir prizes from past Relay Races and comprise backpacks, thermoses, towels, gloves and caps, lamps, long-sleeve winter shirts, and windproof vests. All items will be sold at 5 CHF.

  5. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; van Rhenen, W.; Groothoff, J.W.; van der Klink, J.J.L.; Twisk, J.W.R.; Heymans, M.W.

    2014-01-01

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  6. Work ability as prognostic risk marker of disability pension : single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, Corne A. M.; van Rhenen, Willem; Groothoff, Johan W.; van der Klink, Jac J. L.; Twisk, Jos W. R.; Heymans, Martijn W.

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  7. Explaining Method Effects Associated with Negatively Worded Items in Trait and State Global and Domain-Specific Self-Esteem Scales

    Science.gov (United States)

    Tomas, Jose M.; Oliver, Amparo; Galiana, Laura; Sancho, Patricia; Lila, Marisol

    2013-01-01

    Several investigators have interpreted method effects associated with negatively worded items in a substantive way. This research extends those studies in different ways: (a) it establishes the presence of methods effects in further populations and particular scales, and (b) it examines the possible relations between a method factor associated…

  8. Globalization in a Religiously Pluralistic Environment: The Nigerian ...

    African Journals Online (AJOL)

    FIRST LADY

    Ways of assimilating the positive effects of globalization were considered while the .... appreciation of foreign goods including food items to the scorn of what they have. .... is verily not new among men but the current wind is also verily very fast that .... Like bad doctors, they could be said to treat the symptoms of the disease.

  9. Contextual cueing by global features

    Science.gov (United States)

    Kunar, Melina A.; Flusberg, Stephen J.; Wolfe, Jeremy M.

    2008-01-01

    In visual search tasks, attention can be guided to a target item, appearing amidst distractors, on the basis of simple features (e.g. find the red letter among green). Chun and Jiang’s (1998) “contextual cueing” effect shows that RTs are also speeded if the spatial configuration of items in a scene is repeated over time. In these studies we ask if global properties of the scene can speed search (e.g. if the display is mostly red, then the target is at location X). In Experiment 1a, the overall background color of the display predicted the target location. Here the predictive color could appear 0, 400 or 800 msec in advance of the search array. Mean RTs are faster in predictive than in non-predictive conditions. However, there is little improvement in search slopes. The global color cue did not improve search efficiency. Experiments 1b-1f replicate this effect using different predictive properties (e.g. background orientation/texture, stimuli color etc.). The results show a strong RT effect of predictive background but (at best) only a weak improvement in search efficiency. A strong improvement in efficiency was found, however, when the informative background was presented 1500 msec prior to the onset of the search stimuli and when observers were given explicit instructions to use the cue (Experiment 2). PMID:17355043

  10. Binomial test models and item difficulty

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1979-01-01

    In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally

  11. Vegetable parenting practices scale: Item response modeling analyses

    Science.gov (United States)

    Our objective was to evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We al...

  12. Efficient Algorithms for Segmentation of Item-Set Time Series

    Science.gov (United States)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
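    The dynamic-programming scheme outlined in this abstract can be sketched as follows. The abstract does not give the measure functions or the exact segment difference, so the union measure function and the symmetric-difference cost used here are assumptions for illustration only, as are the function names and the toy data.

```python
def segment_difference(series, start, end):
    """Cost of treating series[start:end] as one segment (assumed definition).

    The segment's item set is the union of its time points' item sets (assumed
    measure function); the cost sums the symmetric differences to each point.
    """
    segment_set = set().union(*series[start:end])
    return sum(len(segment_set ^ s) for s in series[start:end])


def optimal_segmentation(series, k):
    """Split the series into k contiguous segments with minimal total difference."""
    n = len(series)
    inf = float("inf")
    # best[j][i]: minimal cost of splitting the first i time points into j segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for p in range(j - 1, i):            # p = start of the last segment
                cost = best[j - 1][p] + segment_difference(series, p, i)
                if cost < best[j][i]:
                    best[j][i], cut[j][i] = cost, p
    # Walk the cut table backwards to recover the segment boundaries.
    bounds, i = [], n
    for j in range(k, 0, -1):
        p = cut[j][i]
        bounds.append((p, i))
        i = p
    return best[k][n], bounds[::-1]


# Hypothetical item-set time series with an obvious change after the third point.
series = [{"a", "b"}, {"a", "b"}, {"a"}, {"x", "y"}, {"x", "y"}, {"y"}]

print(optimal_segmentation(series, 2))  # (2, [(0, 3), (3, 6)])
```

    This naive version recomputes each segment difference from scratch. The paper's contribution, per the abstract, is computing those values efficiently, which this sketch does not attempt.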

  13. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    Science.gov (United States)

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  14. CTTITEM: SAS macro and SPSS syntax for classical item analysis.

    Science.gov (United States)

    Lei, Pui-Wa; Wu, Qiong

    2007-08-01

    This article describes the functions of a SAS macro and an SPSS syntax that produce common statistics for conventional item analysis including Cronbach's alpha, item difficulty index (p-value or item mean), and item discrimination indices (D-index, point biserial and biserial correlations for dichotomous items and item-total correlation for polytomous items). These programs represent an improvement over the existing SAS and SPSS item analysis routines in terms of completeness and user-friendliness. To promote routine evaluations of item qualities in instrument development of any scale, the programs are available at no charge for interested users. The program codes along with a brief user's manual that contains instructions and examples are downloadable from suen.ed.psu.edu/-pwlei/plei.htm.
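    For readers without SAS or SPSS, some of the statistics listed above can be approximated in a few lines of Python. This is a minimal sketch and not a port of CTTITEM: the function names and data are hypothetical, and the item-total correlation shown is the corrected (item-rest) variant, which is an assumption about which variant a reader wants.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(data):
    """Cronbach's alpha from a respondents-by-items score matrix (sample variances)."""
    k = len(data[0])
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_difficulty(data, item):
    """Item mean; for 0/1 items this is the proportion correct (p-value)."""
    return sum(row[item] for row in data) / len(data)


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_rest_correlation(data, item):
    """Correlation of an item with the total of the remaining items.

    For a dichotomous item this Pearson coefficient is the point-biserial correlation.
    """
    x = [row[item] for row in data]
    rest = [sum(row) - row[item] for row in data]
    return pearson(x, rest)


# Hypothetical data: 5 respondents x 4 dichotomous items
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

print(round(cronbach_alpha(responses), 3))
for j in range(4):
    print(j, item_difficulty(responses, j), round(item_rest_correlation(responses, j), 3))
```

    The D-index and biserial correlation mentioned in the abstract are left out to keep the sketch short.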

  15. Negative effects of item repetition on source memory

    OpenAIRE

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L.; Johnson, Marcia K.

    2012-01-01

    In the present study, we explored how item repetition affects source memory for new item–feature associations (picture–location or picture–color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item re...

  16. Measuring participation in patients with chronic back pain-the 5-Item Pain Disability Index.

    Science.gov (United States)

    McKillop, Ashley B; Carroll, Linda J; Dick, Bruce D; Battié, Michele C

    2018-02-01

    Of the three broad outcome domains of body functions and structures, activities, and participation (eg, engaging in valued social roles) outlined in the World Health Organization's (WHO) International Classification of Functioning, Disability and Health (ICF), it has been argued that participation is the most important to individuals, particularly those with chronic health problems. Yet, participation is not commonly measured in back pain research. The aim of this study was to investigate the construct validity of a modified 5-Item Pain Disability Index (PDI) score as a measure of participation in people with chronic back pain. A validation study was conducted using cross-sectional data. Participants with chronic back pain were recruited from a multidisciplinary pain center in Alberta, Canada. The outcome measure of interest is the 5-Item PDI. Each study participant was given a questionnaire package containing measures of participation, resilience, anxiety and depression, pain intensity, and pain-related disability, in addition to the PDI. The first five items of the PDI deal with social roles involving family responsibilities, recreation, social activities with friends, work, and sexual behavior, and comprised the 5-Item PDI seeking to measure participation. The last two items of the PDI deal with self-care and life support functions and were excluded. Construct validity of the 5-Item PDI as a measure of participation was examined using Pearson correlations or point-biserial correlations to test each hypothesized association. Participants were 70 people with chronic back pain and a mean age of 48.1 years. Forty-four (62.9%) were women. As hypothesized, the 5-Item PDI was associated with all measures of participation, including the Participation Assessment with Recombined Tools-Objective (r=-0.61), Late-Life Function and Disability Instrument: Disability Component (frequency: r=-0.66; limitation: r=-0.65), Work and Social Adjustment Scale (r=0.85), a global

  17. Three controversies over item disclosure in medical licensure examinations

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2015-09-01

    In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and their potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations – (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure – by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large-scale testing. These issues require ongoing and careful consideration.

  18. Method of data mining including determining multidimensional coordinates of each item using a predetermined scalar similarity value for each item pair

    Science.gov (United States)

    Meyers, Charles E.; Davidson, George S.; Johnson, David K.; Hendrickson, Bruce A.; Wylie, Brian N.

    1999-01-01

    A method of data mining represents related items in a multidimensional space. Distance between items in the multidimensional space corresponds to the extent of relationship between the items. The user can select portions of the space to perceive. The user also can interact with and control the communication of the space, focusing attention on aspects of the space of most interest. The multidimensional spatial representation allows more ready comprehension of the structure of the relationships among the items.

  19. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  20. Asia's growing role in the global energy markets

    International Nuclear Information System (INIS)

    Anon.

    1995-01-01

    Three articles are drawn together in this special Petroleum Economist survey on the growing role played by Asian countries in global energy markets, both as world gas suppliers and as important markets for various oil products. The first looks at independent storage in the Asian Pacific area; the second describes the growth of Asia's natural gas industry; and the third celebrates the product surplus produced by recent refinery expansion programs. (UK)

  1. 38 CFR 3.1606 - Transportation items.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Transportation items. 3... Burial Benefits § 3.1606 Transportation items. The transportation costs of those persons who come within... shipment. (6) Cost of transportation by common carrier including amounts paid as Federal taxes. (7) Cost of...

  2. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  3. The basics of item response theory using R

    CERN Document Server

    Baker, Frank B

    2017-01-01

    This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...

  4. Science Library of Test Items. Volume Twenty-One. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 2.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  5. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning

    DEFF Research Database (Denmark)

    Watt, Torquil; Grønvold, Mogens; Hegedüs, Laszlo

    2014-01-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis.

  6. Effect of Differential Item Functioning on Test Equating

    Science.gov (United States)

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  7. Item bias detection in the Hospital Anxiety and Depression Scale using structural equation modeling: comparison with other item bias detection methods

    NARCIS (Netherlands)

    Verdam, M.G.E.; Oort, F.J.; Sprangers, M.A.G.

    Purpose Comparison of patient-reported outcomes may be invalidated by the occurrence of item bias, also known as differential item functioning. We show two ways of using structural equation modeling (SEM) to detect item bias: (1) multigroup SEM, which enables the detection of both uniform and

  8. Calibration of Automatically Generated Items Using Bayesian Hierarchical Modeling.

    Science.gov (United States)

    Johnson, Matthew S.; Sinharay, Sandip

    For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…

  9. ACER Chemistry Test Item Collection. ACER Chemtic Year 12.

    Science.gov (United States)

    Australian Council for Educational Research, Hawthorn.

    The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; an answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…

  10. Counterfeit and Fraudulent Items - Mitigating the risk

    International Nuclear Information System (INIS)

    Tannenbaum, Marc

    2011-01-01

    This presentation (slides) provides an overview of the industry's challenges and activities. Firstly, it outlines the differences between counterfeit, fraudulent, suspect, and also substandard items. Notice is given that items could be found not to meet the standard, but the difference in the intent to deceive with counterfeit and fraudulent items is the critical element. Examples from other industries are used which also rely heavily on the assurance of quality for safety. It also informs that EPRI has just completed a report in October 2009 in coordination with other US government agencies and industry organizations; this report, entitled Counterfeit, Substandard and Fraudulent Items, number 1019163, is available for free on the EPRI web site. As a follow-up to this report, EPRI is developing a CFSI Database; any country interested in a collaborative agreement is invited to use and contribute to the database information. Finally, it stresses the importance of the oversight of contractors, training to raise the awareness of the employees and the inspectors, and having a response plan for identified items

  11. Development and validation of the Psychological Adaptation Scale (PAS): use in six studies of adaptation to a health condition or risk.

    Science.gov (United States)

    Biesecker, Barbara B; Erby, Lori H; Woolford, Samuel; Adcock, Jessica Young; Cohen, Julie S; Lamb, Amanda; Lewis, Katie V; Truitt, Megan; Turriff, Amy; Reeve, Bryce B

    2013-11-01

    We introduce The Psychological Adaptation Scale (PAS) for assessing adaptation to a chronic condition or risk and present validity data from six studies of genetic conditions. Informed by theory, we identified four domains of adaptation: effective coping, self-esteem, social integration, and spiritual/existential meaning. Items were selected from the PROMIS "positive illness impact" item bank and adapted from the Rosenberg self-esteem scale to create a 20-item scale. Each domain included five items, with four sub-scale scores. Data from studies of six populations: adults affected with or at risk for genetic conditions (N=3) and caregivers of children with genetic conditions (N=3) were analyzed using confirmatory factor analyses (CFA). CFA suggested that all but five posited items converge on the domains as designed. Invariance of the PAS amongst the studies further suggested it is a valid and reliable tool to facilitate comparisons of adaptation across conditions. Use of the PAS will standardize assessments of adaptation and foster understanding of the relationships among related health outcomes, such as quality of life and psychological well-being. Clinical interventions can be designed based on PAS data to enhance dimensions of psychological adaptation to a chronic health condition or risk. Published by Elsevier Ireland Ltd.

  12. Global facilitation of attended features is obligatory and restricts divided attention.

    Science.gov (United States)

    Andersen, Søren K; Hillyard, Steven A; Müller, Matthias M

    2013-11-13

    In many common situations such as driving an automobile it is advantageous to attend concurrently to events at different locations (e.g., the car in front, the pedestrian to the side). While spatial attention can be divided effectively between separate locations, studies investigating attention to nonspatial features have often reported a "global effect", whereby items having the attended feature may be preferentially processed throughout the entire visual field. These findings suggest that spatial and feature-based attention may at times act in direct opposition: spatially divided foci of attention cannot be truly independent if feature attention is spatially global and thereby affects all foci equally. In two experiments, human observers attended concurrently to one of two overlapping fields of dots of different colors presented in both the left and right visual fields. When the same color or two different colors were attended on the two sides, deviant targets were detected accurately, and visual-cortical potentials elicited by attended dots were enhanced. However, when the attended color on one side matched the ignored color on the opposite side, attentional modulation of cortical potentials was abolished. This loss of feature selectivity could be attributed to enhanced processing of unattended items that shared the color of the attended items in the opposite field. Thus, while it is possible to attend to two different colors at the same time, this ability is fundamentally constrained by spatially global feature enhancement in early visual-cortical areas, which is obligatory and persists even when it explicitly conflicts with task demands.

  13. An approach for estimating item sensitivity to within-person change over time: An illustration using the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog).

    Science.gov (United States)

    Dowling, N Maritza; Bolt, Daniel M; Deng, Sien

    2016-12-01

    When assessments are primarily used to measure change over time, it is important to evaluate items according to their sensitivity to change, specifically. Items that demonstrate good sensitivity to between-person differences at baseline may not show good sensitivity to change over time, and vice versa. In this study, we applied a longitudinal factor model of change to a widely used cognitive test designed to assess global cognitive status in dementia, and contrasted the relative sensitivity of items to change. Statistically nested models were estimated introducing distinct latent factors related to initial status differences between test-takers and within-person latent change across successive time points of measurement. Models were estimated using all available longitudinal item-level data from the Alzheimer's Disease Assessment Scale-Cognitive subscale, including participants representing the full-spectrum of disease status who were enrolled in the multisite Alzheimer's Disease Neuroimaging Initiative. Five of the 13 Alzheimer's Disease Assessment Scale-Cognitive items demonstrated noticeably higher loadings with respect to sensitivity to change. Attending to performance change on only these 5 items yielded a clearer picture of cognitive decline more consistent with theoretical expectations in comparison to the full 13-item scale. Items that show good psychometric properties in cross-sectional studies are not necessarily the best items at measuring change over time, such as cognitive decline. Applications of the methodological approach described and illustrated in this study can advance our understanding regarding the types of items that best detect fine-grained early pathological changes in cognition. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  14. Utilizing Response Time Distributions for Item Selection in CAT

    Science.gov (United States)

    Fan, Zhewen; Wang, Chun; Chang, Hua-Hua; Douglas, Jeffrey

    2012-01-01

    Traditional methods for item selection in computerized adaptive testing only focus on item information without taking into consideration the time required to answer an item. As a result, some examinees may receive a set of items that take a very long time to finish, and information is not accrued as efficiently as possible. The authors propose two…

  15. Monitoring population disability: Evaluation of a new Global Activity Limitation Indicator (GALI)

    NARCIS (Netherlands)

    Oyen, H. van; Heyden, J.; Perenboom, R.; Jagger, C.

    2006-01-01

    Objective: To evaluate a single item instrument, the Global Activity Limitation Indicator (GALI), to measure long-standing health related activity limitations, against several health indicators: a composite morbidity indicator, instruments measuring mental health (SCL-90R, GHQ-12), physical

  16. Item analysis and evaluation in the examinations in the faculty of ...

    African Journals Online (AJOL)

    2014-11-05

    Nov 5, 2014 ... Key words: Classical test theory, item analysis, item difficulty, item discrimination, item response theory, reliability ... the probability of answering an item correctly or of attaining ..... A Monte Carlo comparison of item and person.

  17. Are great apes able to reason from multi-item samples to populations of food items?

    Science.gov (United States)

    Eckert, Johanna; Rakoczy, Hannes; Call, Josep

    2017-10-01

    Inductive learning from limited observations is a cognitive capacity of fundamental importance. In humans, it is underwritten by our intuitive statistics, the ability to draw systematic inferences from populations to randomly drawn samples and vice versa. According to recent research in cognitive development, human intuitive statistics develops early in infancy. Recent work in comparative psychology has produced first evidence for analogous cognitive capacities in great apes who flexibly drew inferences from populations to samples. In the present study, we investigated whether great apes (Pongo abelii, Pan troglodytes, Pan paniscus, Gorilla gorilla) also draw inductive inferences in the opposite direction, from samples to populations. In two experiments, apes saw an experimenter randomly drawing one multi-item sample from each of two populations of food items. The populations differed in their proportion of preferred to neutral items (24:6 vs. 6:24) but apes saw only the distribution of food items in the samples that reflected the distribution of the respective populations (e.g., 4:1 vs. 1:4). Based on this observation they were then allowed to choose between the two populations. Results show that apes seemed to make inferences from samples to populations and thus chose the population from which the more favorable (4:1) sample was drawn in Experiment 1. In this experiment, the more attractive sample not only contained proportionally but also absolutely more preferred food items than the less attractive sample. Experiment 2, however, revealed that when absolute and relative frequencies were disentangled, apes performed at chance level. Whether these limitations in apes' performance reflect true limits of cognitive competence or merely performance limitations due to accessory task demands is still an open question. © 2017 Wiley Periodicals, Inc.

  18. An NCME Instructional Module on Polytomous Item Response Theory Models

    Science.gov (United States)

    Penfield, Randall David

    2014-01-01

    A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

  19. 41 CFR 101-27.204 - Types of shelf-life items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Types of shelf-life items...-Management of Shelf-Life Materials § 101-27.204 Types of shelf-life items. Shelf-life items are classified as nonextendable (Type I) and extendable (Type II). Type I items have a definite storage life after which the item...

  20. Constructing the 32-item Fitness-to-Drive Screening Measure.

    Science.gov (United States)

    Medhizadah, Shabnam; Classen, Sherrilene; Johnson, Andrew M

    2018-04-01

    The Fitness-to-Drive Screening Measure © (FTDS) enables proxies to identify at-risk older drivers via 54 driving-related items, but may be too lengthy for widespread uptake. We reduced the number of items in the FTDS and validated the shorter measure, using 200 caregiver responses. Exploratory factor analysis and classical test theory techniques were used to determine the most interpretable factor model and the minimum number of items to be used for predicting fitness to drive. The extent to which the shorter FTDS predicted the results of the 54-item FTDS was evaluated through correlational analysis. A three-factor model best represented the empirical data. Classical test theory techniques led to the development of the 32-item FTDS. The 32-item FTDS was highly correlated (r = .99, p = .05) with the 54-item FTDS. The 32-item FTDS may provide raters with a faster and more efficient way to identify at-risk older drivers.

  1. Tailored Cloze: Improved with Classical Item Analysis Techniques.

    Science.gov (United States)

    Brown, James Dean

    1988-01-01

    The reliability and validity of a cloze procedure used as an English-as-a-second-language (ESL) test in China were improved by applying traditional item analysis and selection techniques. The 'best' test items were chosen on the basis of item facility and discrimination indices, and were administered as a 'tailored cloze.' 29 references listed.…

  2. Global Self-Esteem: Cognitive Interpretation in an Academic Setting.

    Science.gov (United States)

    Yeung, Alexander Seeshing

    Researchers have assumed that global self-esteem (often labeled as general self-concept), being a general aggregate of perceptions of the self, is content free. Recent research has, however, shown that responses to self-esteem survey items are influenced by the context in which the respondents are asked to make their responses--a chameleon effect.…

  3. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Directory of Open Access Journals (Sweden)

    Suttida Rakkapao

    2016-10-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test’s distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.

  4. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
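
    The three-parameter logistic model named above gives the probability of a correct answer as a function of ability, with a discrimination, a difficulty, and a guessing parameter per item. A minimal sketch follows; the parameter values are hypothetical, and the scaling constant D (often set to 1.7 in practice) is exposed as an assumption, since the abstract does not state the setting the authors used.

```python
import math


def p_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: guessing parameter (lower asymptote)
    D: scaling constant (assumed 1.0 here; 1.7 approximates the normal ogive)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical item parameters (illustrative only, not TUV estimates).
a, b, c = 1.2, 0.5, 0.2
for theta in (-3.0, 0.5, 3.0):
    print(f"theta = {theta:+.1f}  P(correct) = {p_3pl(theta, a, b, c):.3f}")
```

    At theta equal to b the probability is halfway between the guessing level c and 1, which is one way to read the difficulty parameter off an item characteristic curve.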

  5. Using Likert-type and ipsative/forced choice items in sequence to generate a preference.

    Science.gov (United States)

    Ried, L Douglas

    2014-01-01

    Collaboration and implementation of a minimum, standardized set of core global educational and professional competencies seems appropriate given the expanding international evolution of pharmacy practice. However, winnowing down hundreds of competencies from a plethora of local, national and international competency frameworks to select the most highly preferred to be included in the core set is a daunting task. The objective of this paper is to describe a combination of strategies used to ascertain the most highly preferred items among a large number of disparate items. In this case, the items were >100 educational and professional competencies that might be incorporated as the core components of new and existing competency frameworks. Panelists (n = 30) from the European Union (EU) and United States (USA) were chosen to reflect a variety of practice settings. Each panelist completed two electronic surveys. The first survey presented competencies in a Likert-type format and the second survey presented many of the same competencies in an ipsative/forced choice format. Item mean scores were calculated for each competency, the competencies were ranked, and non-parametric statistical tests were used to ascertain the consistency in the rankings achieved by the two strategies. This exploratory study presented over 100 competencies to the panelists in the beginning. The two methods provided similar results, as indicated by the significant correlation between the rankings (Spearman's rho = 0.30, P < 0.09). A two-step strategy using Likert-type and ipsative/forced choice formats in sequence, appears to be useful in a situation where a clear preference is required from among a large number of choices. The ipsative/forced choice format resulted in some differences in the competency preferences because the panelists could not rate them equally by design. While this strategy was used for the selection of professional educational competencies in this exploratory study, it is
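
    The consistency check described above, comparing the competency rankings produced by the two survey formats, rests on a rank correlation. The sketch below computes Spearman's rho as the Pearson correlation of average ranks; the function names and the five competency scores are hypothetical and are not the study's data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical mean scores for five competencies under each format.
likert_means = [4.5, 3.9, 4.1, 2.8, 3.3]
forced_choice_shares = [0.60, 0.55, 0.40, 0.20, 0.35]
print("Spearman's rho:", round(spearman_rho(likert_means, forced_choice_shares), 3))
```

    Because only ranks enter the calculation, the two formats can be compared even though their raw scores are on different scales.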

  6. 41 CFR 101-26.605 - Items other than petroleum products and electronic items available from the Defense Logistics...

    Science.gov (United States)

    2010-07-01

    ... petroleum products and electronic items available from the Defense Logistics Agency. 101-26.605 Section 101... available from the Defense Logistics Agency. Agencies required to use GSA supply sources should also use... Logistics Agency, the catalog will contain only those items in Federal supply classification classes which...

  7. Extending item response theory to online homework

    Directory of Open Access Journals (Sweden)

    Gerd Kortemeyer

    2014-05-01

    Item response theory (IRT) becomes an increasingly important tool when analyzing “big data” gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over a wide range with respect to model assumptions and introduced noise. Item difficulty is also robust, but over a narrower range.

  8. Editorial Changes and Item Performance: Implications for Calibration and Pretesting

    Directory of Open Access Journals (Sweden)

    Heather Stoffel

    2014-11-01

    Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.

  9. On multidimensional item response theory -- a coordinate free approach

    OpenAIRE

    Antal, Tamás

    2007-01-01

    A coordinate system free definition of complex structure multidimensional item response theory (MIRT) for dichotomously scored items is presented. The point of view taken emphasizes the possibilities and subtleties of understanding MIRT as a multidimensional extension of the “classical” unidimensional item response theory models. The main theorem of the paper is that every monotonic MIRT model looks the same; they are all trivial extensions of univariate item response theory.

  10. Verification of Differential Item Functioning (DIF) Status of West ...

    African Journals Online (AJOL)

    This study investigated test item bias and Differential Item Functioning (DIF) of West African ... items in chemistry function differentially with respect to gender and location. In Aba education zone of Abia, 50 secondary schools were purposively ...

  11. Mixture Item Response Theory-MIMIC Model: Simultaneous Estimation of Differential Item Functioning for Manifest Groups and Latent Classes

    Science.gov (United States)

    Bilir, Mustafa Kuzey

    2009-01-01

    This study uses a new psychometric model (mixture item response theory-MIMIC model) that simultaneously estimates differential item functioning (DIF) across manifest groups and latent classes. Current DIF detection methods investigate DIF from only one side, either across manifest groups (e.g., gender, ethnicity, etc.), or across latent classes…

  12. Wrong-Site Surgery, Retained Surgical Items, and Surgical Fires : A Systematic Review of Surgical Never Events.

    Science.gov (United States)

    Hempel, Susanne; Maggard-Gibbons, Melinda; Nguyen, David K; Dawes, Aaron J; Miake-Lye, Isomi; Beroes, Jessica M; Booth, Marika J; Miles, Jeremy N V; Shanman, Roberta; Shekelle, Paul G

    2015-08-01

    Serious, preventable surgical events, termed never events, continue to occur despite considerable patient safety efforts. To examine the incidence and root causes of and interventions to prevent wrong-site surgery, retained surgical items, and surgical fires in the era after the implementation of the Universal Protocol in 2004. We searched 9 electronic databases for entries from 2004 through June 30, 2014, screened references, and consulted experts. Two independent reviewers identified relevant publications in June 2014. One reviewer used a standardized form to extract data and a second reviewer checked the data. Strength of evidence was established by the review team. Data extraction was completed in January 2015. Incidence of wrong-site surgery, retained surgical items, and surgical fires. We found 138 empirical studies that met our inclusion criteria. Incidence estimates for wrong-site surgery in US settings varied by data source and procedure (median estimate, 0.09 events per 10,000 surgical procedures). The median estimate for retained surgical items was 1.32 events per 10,000 procedures, but estimates varied by item and procedure. The per-procedure surgical fire incidence is unknown. A frequently reported root cause was inadequate communication. Methodologic challenges associated with investigating changes in rare events limit the conclusions of 78 intervention evaluations. Limited evidence supported the Universal Protocol (5 studies), education (4 studies), and team training (4 studies) interventions to prevent wrong-site surgery. Limited evidence exists to prevent retained surgical items by using data-matrix-coded sponge-counting systems (5 pertinent studies). Evidence for preventing surgical fires was insufficient, and intervention effects were not estimable. Current estimates for wrong-site surgery and retained surgical items are 1 event per 100,000 and 1 event per 10,000 procedures, respectively, but the precision is uncertain, and the per

  13. Understanding and quantifying cognitive complexity level in mathematical problem solving items

    Directory of Open Access Journals (Sweden)

    SUSAN E. EMBRETSON

    2008-09-01

    The linear logistic test model (LLTM; Fischer, 1973) has been applied to a wide variety of new tests. When the LLTM application involves item complexity variables that are both theoretically interesting and empirically supported, several advantages can result. These advantages include elaborating construct validity at the item level, defining variables for test design, predicting parameters of new items, item banking by sources of complexity and providing a basis for item design and item generation. However, despite the many advantages of applying LLTM to test items, it has been applied less often to understand the sources of complexity for large-scale operational test items. Instead, previously calibrated item parameters are modeled using regression techniques because raw item response data often cannot be made available. In the current study, both LLTM and regression modeling are applied to mathematical problem solving items from a widely used test. The findings from the two methods are compared and contrasted for their implications for continued development of ability and achievement tests based on mathematical problem solving items.

  14. Improved Approximation Algorithms for Item Pricing with Bounded Degree and Valuation

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya

    When a store sells items to customers, the store wishes to decide the prices of the items to maximize its profit. If the store sells the items with low (resp. high) prices, the customers buy more (resp. less) items, which provides less profit to the store. It would be hard for the store to decide the prices of items. Assume that a store has a set V of n items and there is a set C of m customers who wish to buy those items. The goal of the store is to decide the price of each item to maximize its profit. We refer to this maximization problem as an item pricing problem. We classify the item pricing problems according to how many items the store can sell or how the customers valuate the items. If the store can sell every item i with unlimited (resp. limited) amount, we refer to this as unlimited supply (resp. limited supply). We say that the item pricing problem is single-minded if each customer j ∈ C wishes to buy a set e_j ⊆ V of items and assigns valuation w(e_j) ≥ 0. For the single-minded item pricing problems (in unlimited supply), Balcan and Blum regarded them as weighted k-hypergraphs and gave several approximation algorithms. In this paper, we focus on the (pseudo) degree of k-hypergraphs and the valuation ratio, i.e., the ratio between the smallest and the largest valuations. Then for the single-minded item pricing problems (in unlimited supply), we show improved approximation algorithms (for k-hypergraphs, general graphs, bipartite graphs, etc.) with respect to the maximum (pseudo) degree and the valuation ratio.
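
The single-minded, unlimited-supply setting described above can be illustrated with a small profit evaluator. This is only a sketch under stated assumptions: the customer data and price vectors are hypothetical, and it assumes the usual convention that a customer buys the whole bundle exactly when its total price does not exceed the valuation (including ties). It evaluates a given price vector; it is not the approximation algorithm of the paper.

```python
from typing import Dict, FrozenSet, List, Tuple

# A single-minded customer j: (bundle e_j, valuation w(e_j)).
Customer = Tuple[FrozenSet[str], float]


def profit(prices: Dict[str, float], customers: List[Customer]) -> float:
    """Store revenue under unlimited supply.

    Assumption: a customer buys the whole bundle if and only if the sum
    of its item prices is at most the customer's valuation.
    """
    total = 0.0
    for bundle, valuation in customers:
        cost = sum(prices[item] for item in bundle)
        if cost <= valuation:
            total += cost
    return total


# Hypothetical instance: three items, three customers.
customers = [
    (frozenset({"a", "b"}), 10.0),
    (frozenset({"b", "c"}), 6.0),
    (frozenset({"a"}), 4.0),
]
low = {"a": 1.0, "b": 1.0, "c": 1.0}    # everyone buys, little revenue
mid = {"a": 4.0, "b": 3.0, "c": 3.0}    # everyone still buys
high = {"a": 9.0, "b": 9.0, "c": 9.0}   # nobody buys

for name, p in (("low", low), ("mid", mid), ("high", high)):
    print(name, profit(p, customers))
```

Both very low and very high prices yield little profit in this toy instance, which is the tension the abstract describes.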

  15. Item-level factor analysis of the Self-Efficacy Scale.

    Science.gov (United States)

    Bunketorp Käll, Lina

    2014-03-01

    This study explores the internal structure of the Self-Efficacy Scale (SES) using item response analysis. The SES was previously translated into Swedish and modified to encompass all types of pain, not exclusively back pain. Data on perceived self-efficacy in 47 patients with subacute whiplash-associated disorders were derived from a previously conducted randomized-controlled trial. The item-level factor analysis was carried out using a six-step procedure. To further study the item inter-relationships and to determine the underlying structure empirically, the 20 items of the SES were also subjected to principal component analysis with varimax rotation. The analyses showed two underlying factors, named 'social activities' and 'physical activities', with seven items loading on each factor. The remaining six items of the SES appeared to measure somewhat different constructs and need to be analysed further.

  16. Negative affect impairs associative memory but not item memory.

    OpenAIRE

    Bisby, J. A.; Burgess, N.

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 ...

  17. Hazardous metals in yellow items used in RCAs

    International Nuclear Information System (INIS)

    Brown, K.F.; Rankin, W.N.

    1992-01-01

    Yellow items used in Radiologically Controlled Areas (RCAs) that could contain hazardous metals were identified. X-ray fluorescence analyses indicated that thirty of the fifty-two items do contain hazardous metals. It is important to minimize the hazardous metals put into the wastes. The authors recommend that the specifications for all yellow items stocked in Stores be changed to specify that they contain no hazardous metals.

  18. Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André

    2016-01-01

    Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

  19. The item level psychometrics of the behaviour rating inventory of executive function-adult (BRIEF-A) in a TBI sample.

    Science.gov (United States)

    Waid-Ebbs, J Kay; Wen, Pey-Shan; Heaton, Shelley C; Donovan, Neila J; Velozo, Craig

    2012-01-01

    To determine whether the psychometrics of the BRIEF-A are adequate for individuals diagnosed with TBI. A prospective observational study in which the BRIEF-A was collected as part of a larger study. Informant ratings of the 75-item BRIEF-A on 89 individuals diagnosed with TBI were examined to determine item-level psychometrics for each of the two BRIEF-A indexes: Behaviour Rating Index (BRI) and Metacognitive Index (MI). Patients were either outpatients or at least 1 year post-injury. Each index measured a latent trait, separating individuals into five-to-six ability levels and demonstrated good reliability (0.94 and 0.96). Four items were identified that did not meet the infit criteria. The results provide support for the use of the BRIEF-A as a supplemental assessment of executive function in TBI populations. However, further validation is needed with other measures of executive function. Recommendations include use of the index scores over the Global Executive Composite score and use of the difficulty hierarchy for setting therapy goals.

  20. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    Science.gov (United States)

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  1. Method using a density field for locating related items for data mining

    Science.gov (United States)

    Wylie, Brian N.

    2002-01-01

    A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method makes use of numeric values as a measure of similarity between each pairing of items. The items are given initial coordinates in the space. An energy is then determined for each item from the item's distance and similarity to other items, and from the density of items assigned coordinates near the item. The distance and similarity component can act to draw items with high similarities close together, while the density component can act to force all items apart. If a terminal condition is not yet reached, then new coordinates can be determined for one or more items, and the energy determination repeated. The iteration can terminate, for example, when the total energy reaches a threshold, when each item's energy is below a threshold, after a certain amount of time or iterations.
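
The iteration described above can be sketched roughly as follows. Everything specific here is an assumption for illustration and not the patented method: the energy is taken as similarity-weighted squared distance plus a count of neighbours within a hypothetical `radius`, new coordinates are random perturbations kept only when they lower the item's energy, and the terminal condition is a fixed number of iterations.

```python
import math
import random


def item_energy(i, coords, sim, radius=0.5, density_weight=1.0):
    """Energy of item i: attraction to similar items plus local crowding."""
    xi, yi = coords[i]
    attract = 0.0
    crowd = 0
    for j, (xj, yj) in enumerate(coords):
        if j == i:
            continue
        d = math.hypot(xi - xj, yi - yj)
        attract += sim[i][j] * d * d   # similar items far apart cost energy
        if d < radius:
            crowd += 1                 # nearby items raise the density term
    return attract + density_weight * crowd


def total_energy(coords, sim):
    return sum(item_energy(i, coords, sim) for i in range(len(coords)))


def layout(sim, steps=2000, step_size=0.3, seed=0):
    """Return (initial, final) coordinates after a fixed number of steps."""
    rng = random.Random(seed)
    n = len(sim)
    coords = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    initial = list(coords)
    for _ in range(steps):
        i = rng.randrange(n)
        old = coords[i]
        old_e = item_energy(i, coords, sim)
        coords[i] = (old[0] + rng.uniform(-step_size, step_size),
                     old[1] + rng.uniform(-step_size, step_size))
        if item_energy(i, coords, sim) >= old_e:
            coords[i] = old            # reject moves that do not lower energy
    return initial, coords


# Hypothetical symmetric similarity matrix for four items.
sim = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]
initial, final = layout(sim)
print(total_energy(initial, sim), total_energy(final, sim))
```

Because the similarity matrix and the neighbour count are both symmetric, a move that lowers one item's energy also lowers the total, so the total energy never increases in this sketch.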

  2. Maintenance of item and order information in verbal working memory.

    Science.gov (United States)

    Camos, Valérie; Lagner, Prune; Loaiza, Vanessa M

    2017-09-01

    Although verbal recall of item and order information is well-researched in short-term memory paradigms, there is relatively little research concerning item and order recall from working memory. The following study examined whether manipulating the opportunity for attentional refreshing and articulatory rehearsal in a complex span task differently affected the recall of item- and order-specific information of the memoranda. Five experiments varied the opportunity for articulatory rehearsal and attentional refreshing in a complex span task, but the type of recall was manipulated between experiments (item and order, order only, and item only recall). The results showed that impairing attentional refreshing and articulatory rehearsal similarly affected recall regardless of whether the scoring procedure (Experiments 1 and 4) or recall requirements (Experiments 2, 3, and 5) reflected item- or order-specific recall. This implies that both mechanisms sustain the maintenance of item and order information, and suggests that the common cumulative functioning of these two mechanisms to maintain items could be at the root of order maintenance.

  3. Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

    Science.gov (United States)

    LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

    2015-04-01

    Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
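
For readers unfamiliar with Mantel-Haenszel DIF statistics, a minimal sketch follows. It assumes the dichotomous-item case, which is a simplification (SHAI items may be scored on more than two levels and the study's exact procedure is not given here), and the counts are hypothetical. Each stratum holds a 2x2 table for examinees matched on total score: (reference endorsing, reference not, focal endorsing, focal not).

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched-score strata.

    Each stratum is (a, b, c, d): reference yes/no, focal yes/no.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def mh_delta(alpha):
    """ETS delta-scale transformation, -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


# Hypothetical counts for one item in two score strata.
strata = [(30, 10, 20, 20), (30, 10, 20, 20)]
alpha = mantel_haenszel_odds_ratio(strata)
print(alpha, mh_delta(alpha))
```

An odds ratio of 1 (delta of 0) indicates no DIF; values further from 1 indicate that matched groups respond differently to the item.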

  4. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...

  5. Using Item Response Theory to Develop a 60-Item Representation of the NEO PI-R Using the International Personality Item Pool: Development of the IPIP-NEO-60.

    Science.gov (United States)

    Maples-Keller, Jessica L; Williamson, Rachel L; Sleep, Chelsea E; Carter, Nathan T; Campbell, W Keith; Miller, Joshua D

    2017-10-31

    Given advantages of freely available and modifiable measures, an increase in the use of measures developed from the International Personality Item Pool (IPIP), including the 300-item representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a) has occurred. The focus of this study was to use item response theory to develop a 60-item, IPIP-based measure of the Five-Factor Model (FFM) that provides equal representation of the FFM facets and to test the reliability and convergent and criterion validity of this measure compared to the NEO Five Factor Inventory (NEO-FFI). In an undergraduate sample (n = 359), scores from the NEO-FFI and IPIP-NEO-60 demonstrated good reliability and convergent validity with the NEO PI-R and IPIP-NEO-300. Additionally, across criterion variables in the undergraduate sample as well as a community-based sample (n = 757), the NEO-FFI and IPIP-NEO-60 demonstrated similar nomological networks across a wide range of external variables (r_ICC = .96). Finally, as expected, in an MTurk sample the IPIP-NEO-60 demonstrated advantages over the Big Five Inventory-2 (Soto & John, 2017; n = 342) with regard to the Agreeableness domain content. The results suggest strong reliability and validity of the IPIP-NEO-60 scores.

  6. 16 CFR 304.6 - Marking requirements for imitation numismatic items.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 1 2010-01-01 2010-01-01 false Marking requirements for imitation... for imitation numismatic items. (a) An imitation numismatic item which is manufactured in the United... the item. (3) An imitation numismatic item of incusable material shall be incused with the word “COPY...

  7. A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents.

    Science.gov (United States)

    Erhart, M; Hagquist, C; Auquier, P; Rajmil, L; Power, M; Ravens-Sieberer, U

    2010-07-01

    This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.
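
The two classical test theory statistics used in approach (A) can be computed as in the sketch below. The response matrix is hypothetical and the function names are illustrative; the formulas are the standard ones (alpha from item and total-score variances, and the correlation of each item with the sum of the remaining items).

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same length each)."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(rows, i):
    """Correlation of item i with the sum of all other items."""
    item = [r[i] for r in rows]
    rest = [sum(r) - r[i] for r in rows]
    return pearson(item, rest)


# Hypothetical responses: five respondents, three items.
scores = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 1], [3, 3, 4]]
print(cronbach_alpha(scores))
print([corrected_item_total(scores, i) for i in range(3)])
```

In an alpha-maximizing reduction, items with low corrected item-total correlations are the usual candidates for removal.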

  8. Assessing the factor structures of the 55- and 22-item versions of the conformity to masculine norms inventory.

    Science.gov (United States)

    Owen, Jesse

    2011-03-01

    The current study examined the psychometric properties of the abbreviated versions, 55- and 22-items, of the Conformity to Masculine Norms Inventory (CMNI). The authors tested the factor structure for the 11 subscales of the CMNI-55 and the global masculinity factor for the CMNI-55 and the CMNI-22. In a clinical sample of men and women (n=522), the results supported the 11-factor model. Furthermore, the factor structure was invariant for men and women. The higher order model, which tested the utility of the global masculine score, demonstrated marginal fit. The factor structures for the global masculinity score for the CMNI-22 demonstrated poor fit. Collectively, the results suggest that the CMNI-55 is better represented in a multidimensional construct. The subscales' alpha levels and factor loadings were, generally, within acceptable limits. Gender and ethnic mean level differences are also reported. © The Author(s) 2011

  9. Dissociation between source and item memory in Parkinson's disease

    Institute of Scientific and Technical Information of China (English)

    Hu Panpan; Li Youhai; Ma Huijuan; Xi Chunhua; Chen Xianwen; Wang Kai

    2014-01-01

    Background Episodic memory includes information about item memory and source memory. Many studies support the hypothesis that these two memory systems are implemented by different brain structures. The aim of this study was to investigate the characteristics of item memory and source memory processing in patients with Parkinson's disease (PD), and to further verify the hypothesis of the dual-process model of source and item memory. Methods We established a neuropsychological battery to measure the performance of item memory and source memory. In total, 35 individuals with PD and 35 matched healthy controls (HC) were administered the battery. The item memory task consists of the learning and recognition of high-frequency national Chinese characters; the source memory task consists of the learning and recognition of three modes (character, picture, and image) of objects. Results Compared with the controls, the idiopathic PD patients showed impaired source memory (PD vs. HC: 0.65±0.06 vs. 0.72±0.09, P=0.001), but not impaired item memory (PD vs. HC: 0.65±0.07 vs. 0.67±0.08, P=0.240). Conclusions The present experiment provides evidence for dissociation between item and source memory in PD patients, thereby strengthening the claim that item and source memory rely on different brain structures. PD patients show poor source memory, in which dopamine plays a critical role.

  10. Collaborative Filtering Based on Sequential Extraction of User-Item Clusters

    Science.gov (United States)

    Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

    Collaborative filtering is a computational realization of “word-of-mouth” in a network community, in which the items preferred by “neighbors” are recommended. This paper proposes a new item-selection model for extracting user-item clusters from rectangular relation matrices, in which mutual relations between users and items are denoted in an alternative process of “liking or not”. A technique for sequential co-cluster extraction from rectangular relational data is given by combining the structural balancing-based user-item clustering method with a sequential fuzzy cluster extraction approach. Then, the technique is applied to the collaborative filtering problem, in which some items may be shared by several user clusters.

  11. An empirical comparison of Item Response Theory and Classical Test Theory

    Directory of Open Access Journals (Sweden)

    Špela Progar

    2008-11-01

    Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as much invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regards to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.
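
To make the contrast between the two frameworks concrete, the sketch below places a CTT item statistic next to an IRT item response function. The two-parameter logistic (2PL) model is an assumption chosen for illustration; the abstract does not say which IRT model the study fitted, and the response data are hypothetical.

```python
import math


def ctt_difficulty(responses):
    """CTT item difficulty: proportion of correct (1) responses."""
    return sum(responses) / len(responses)


def irt_2pl(theta, a, b):
    """2PL probability of a correct response.

    theta: person ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical responses of four examinees to one item.
print(ctt_difficulty([1, 0, 1, 1]))

# The same item looks easy or hard depending on ability.
for theta in (-2.0, 0.5, 2.0):
    print(theta, irt_2pl(theta, a=1.2, b=0.5))
```

The CTT value depends on which examinees happened to be sampled, whereas the IRT parameters a and b describe the item as a nonlinear function of ability, which is the basis of the invariance property examined in the study.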

  12. Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment

    Directory of Open Access Journals (Sweden)

    KLAUS D. KUBINGER

    2007-12-01

    Multiple choice response formats are problematical as an item is often scored as solved simply because the test-taker is a lucky guesser. Instead of applying pertinent IRT models which take guessing effects into account, a pragmatic approach of re-conceptualizing multiple choice response formats to reduce the chance of lucky guessing is considered. This paper compares the free response format with two different multiple choice formats. A common multiple choice format with a single correct response option and five distractors (“1 of 6”) is used, as well as a multiple choice format with five response options, of which any number of the five is correct and the item is only scored as mastered if all the correct response options and none of the wrong ones are marked (“x of 5”). An experiment was designed, using pairs of items with exactly the same content but different response formats. 173 test-takers were randomly assigned to two test booklets of 150 items altogether. Rasch model analyses adduced a fitting item pool, after the deletion of 39 items. The resulting item difficulty parameters were used for the comparison of the different formats. The multiple choice format “1 of 6” differs significantly from “x of 5”, with a relative effect of 1.63, while the multiple choice format “x of 5” does not significantly differ from the free response format. Therefore, the lower degree of difficulty of items with the “1 of 6” multiple choice format is an indicator of relevant guessing effects. In contrast the “x of 5” multiple choice format can be seen as an appropriate substitute for free response format.
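
The difference in guessing chances between the two formats can be worked out with a short calculation. This sketch assumes purely blind guessing with every marking pattern equally likely, and it assumes that all 2^5 patterns (including marking nothing) are possible in the “x of 5” format; neither assumption is stated in the abstract.

```python
from fractions import Fraction
from itertools import product


def p_guess_single(n_options):
    """Blind-guess success chance when exactly one option is correct."""
    return Fraction(1, n_options)


def p_guess_x_of_n(n_options):
    """Blind-guess success chance when each option is independently
    marked or left blank and only one exact pattern scores."""
    patterns = list(product([False, True], repeat=n_options))
    return Fraction(1, len(patterns))


print(p_guess_single(6))   # "1 of 6" format
print(p_guess_x_of_n(5))   # "x of 5" format
```

Under these assumptions a lucky guess is several times less likely in the “x of 5” format, which is consistent with the paper's conclusion that this format behaves more like free response.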

  13. Attention restores discrete items to visual short-term memory.

    Science.gov (United States)

    Murray, Alexandra M; Nobre, Anna C; Clark, Ian A; Cravo, André M; Stokes, Mark G

    2013-04-01

    When a memory is forgotten, is it lost forever? Our study shows that selective attention can restore forgotten items to visual short-term memory (VSTM). In our two experiments, all stimuli presented in a memory array were designed to be equally task relevant during encoding. During the retention interval, however, participants were sometimes given a cue predicting which of the memory items would be probed at the end of the delay. This shift in task relevance improved recall for that item. We found that this type of cuing improved recall for items that otherwise would have been irretrievable, providing critical evidence that attention can restore forgotten information to VSTM. Psychophysical modeling of memory performance has confirmed that restoration of information in VSTM increases the probability that the cued item is available for recall but does not improve the representational quality of the memory. We further suggest that attention can restore discrete items to VSTM.

  14. Health-related quality of life of African-American female breast cancer survivors, survivors of other cancers, and those without cancer.

    Science.gov (United States)

    Claridy, Mechelle D; Ansa, Benjamin; Damus, Francesca; Alema-Mensah, Ernest; Smith, Selina A

    2018-04-27

    The purpose of this study was to compare differences in health-related quality of life (HRQOL) between African-American female breast cancer survivors, African-American female survivors of other cancers, and African-American women with no history of cancer. Using data from the 2010 National Health Interview Survey (NHIS), the HRQOL of African-American women aged 35 years or older was compared by cancer status. Physical and mental health items from the Patient-Reported Outcomes Measurement Information System (PROMIS) global health scale were used to assess differences in HRQOL. For summary physical and mental health measures, no significant differences were found between breast cancer survivors and women with no history of cancer; survivors of other cancers reported poorer physical and mental health than did women with no history of cancer. Similar differences were found at the item level. When we examined the two African-American female cancer survivor groups, we found that cancer survivors whose cancer was being treated reported substantially poorer physical health and mental health than did those whose cancer was not being treated. Survivors who had private insurance and were cancer free reported better physical and mental health than did those who did not have private insurance and those who were not cancer free. Breast cancer survivors reported slightly better physical and mental health than did survivors of other cancers. Our findings highlight the need for public health agencies to adopt practices to improve the mental and physical health of African-American female survivors of cancer.

  15. Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.

    Science.gov (United States)

    Commons, C., Ed.; Martin, P., Ed.

    Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…

  16. 31 CFR 50.14 - Separate line item.

    Science.gov (United States)

    2010-07-01

    ....14 Money and Finance: Treasury Office of the Secretary of the Treasury TERRORISM RISK INSURANCE PROGRAM Disclosures as Conditions for Federal Payment § 50.14 Separate line item. An insurer is deemed to be in compliance with the requirement of providing disclosure on a “separate line item in the policy...

  17. Republic of Croatia's Experiences in the Implementation of the EU Directive About Dual-Use Items

    International Nuclear Information System (INIS)

    Vidas, Z.; Orehovec, Z.; Superina, V.

    2007-01-01

    The Republic of Croatia is undergoing a process of adjusting its own legislation to the legislation of EU. It is one of the most important obligations of the EU-Croatia Stabilization and Association Agreement. It is also a basic prerequisite for the practical realization of the modern, unique and integral Export and Import Control system of the Sensitive Items. At the same time, it is a very important step towards better understanding of real and great danger of the weapons of mass destruction (WMD) proliferation and their possible usage in terrorism. That means that Republic of Croatia will act along with EU in the complex activities to prevent and minimize the WMD proliferation, to participate in antiterrorism activities, and to maintain regional and global security. In the year 2004, along the lines of the EU Legislation, the Croatian Parliament adopted the basic legal act - Act on export of Dual-use Items and its accompanying rules and regulations. The existing act on dual-purpose items in Croatia is mostly in harmony with the 2000 and 2003 EU Decrees which regulate the regime of the dual-purpose items export control. Nevertheless, the EU legislation experiences constant amendments in the field, and the Croatian Government is committed to following the improvements of the system and adjusting its own. However, during this process, a series of vague wordings and inconsistencies were noticed in the WMD nonproliferation policy and in the legislation to control the export of high technology products which could be abused for the WMD development. In addition, there is neither regulation on import control system nor control on the export of knowledge through scientific and professional cooperation. The purpose of this article is to professionally elaborate the vague wordings and inconsistencies. It can be done on the basis of Croatia's experiences in the export and import control system of the dual-purpose items and knowledge and experience acquired through the

  18. Evaluating the quality of medical multiple-choice items created with automated processes.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis

    2013-07-01

    Computerised assessment raises formidable challenges because it requires large numbers of test items. Automatic item generation (AIG) can help address this test development problem because it yields large numbers of new items both quickly and efficiently. To date, however, the quality of the items produced using a generative approach has not been evaluated. The purpose of this study was to determine whether automatic processes yield items that meet standards of quality that are appropriate for medical testing. Quality was evaluated firstly by subjecting items created using both AIG and traditional processes to rating by a four-member expert medical panel using indicators of multiple-choice item quality, and secondly by asking the panellists to identify which items were developed using AIG in a blind review. Fifteen items from the domain of therapeutics were created in three different experimental test development conditions. The first 15 items were created by content specialists using traditional test development methods (Group 1 Traditional). The second 15 items were created by the same content specialists using AIG methods (Group 1 AIG). The third 15 items were created by a new group of content specialists using traditional methods (Group 2 Traditional). These 45 items were then evaluated for quality by a four-member panel of medical experts and were subsequently categorised as either Traditional or AIG items. Three outcomes were reported: (i) the items produced using traditional and AIG processes were comparable on seven of eight indicators of multiple-choice item quality; (ii) AIG items can be differentiated from Traditional items by the quality of their distractors, and (iii) the overall predictive accuracy of the four expert medical panellists was 42%. Items generated by AIG methods are, for the most part, equivalent to traditionally developed items from the perspective of expert medical reviewers. While the AIG method produced comparatively fewer plausible

  19. An Investigation of Item Type in a Standards-Based Assessment.

    Directory of Open Access Journals (Sweden)

    Liz Hollingworth

    2007-12-01

    Large-scale state assessment programs use both multiple-choice and open-ended items on tests for accountability purposes. Certainly, there is an intuitive belief among some educators and policy makers that open-ended items measure something different than multiple-choice items. This study examined two item formats in custom-built, standards-based tests of achievement in Reading and Mathematics at grades 3-8. In this paper, we raise questions about the value of including open-ended items, given scoring costs, time constraints, and the higher probability of missing data from test-takers.

  20. Instemmingsgeneigdheid en verskillende item- en responsformate in 'n gesommeerde selfbeoordelingskaal

    Directory of Open Access Journals (Sweden)

    Nadene Hanekom

    1998-06-01

    This study examines the degree of acquiescence present when the item and response formats of a summated rating scale are varied. It is often recommended that acquiescence response bias in rating scales may be controlled by using both positively and negatively worded items. Such items are generally worded in the Likert-type format of statements. The purpose of the study was to establish whether items in question format would result in a smaller degree of acquiescence than items worded as statements. The response format was also varied (five- and seven-point options) to determine whether this would influence the reliability and degree of acquiescence in the scales. A twenty-item Locus of Control (LC) questionnaire was used, but each item was complemented by its opposite, resulting in 40 items. The subjects, divided randomly into two groups, were second year students who had to complete four versions of the questionnaire, plus a shortened version of Bass's scale for measuring acquiescence. The LC versions were questions or statements, each combined with a five- or seven-point response format. Partial counterbalancing was introduced by testing on two separate occasions, presenting the tests to the two groups in the opposite order. The degree of acquiescence was assessed by correlating the items with their opposites, and by correlating scores on each version with scores on the acquiescence questionnaire. No major differences were found between the various item and response formats in relation to acquiescence. Summary: This investigation was carried out to determine whether the degree of acquiescence is influenced by the item and response format of a summated self-rating scale. It is often recommended that using both positively and negatively worded items in a questionnaire limits acquiescence. Such items are usually formulated as statements in the traditional Likert format. The purpose of the investigation was to determine whether items

  1. Development and psychometric evaluation of a clinical global impression for schizoaffective disorder scale.

    Science.gov (United States)

    Allen, Michael H; Daniel, David G; Revicki, Dennis A; Canuso, Carla M; Turkoz, Ibrahim; Fu, Dong-Jing; Alphs, Larry; Ishak, K Jack; Bartko, John J; Lindenmayer, Jean-Pierre

    2012-01-01

    The Clinical Global Impression for Schizoaffective Disorder scale is a new rating scale adapted from the Clinical Global Impression scale for use in patients with schizoaffective disorder. The psychometric characteristics of the Clinical Global Impression for Schizoaffective Disorder are described. Content validity was assessed using an investigator questionnaire. Inter-rater reliability was determined with 12 sets of videotaped interviews rated independently by two trained individuals. Test-retest reliability was assessed using 30 randomly selected raters from clinical trials who evaluated the same videos on separate occasions two weeks apart. Convergent and divergent validity and effect size were evaluated by comparing scores between the Clinical Global Impression for Schizoaffective Disorder and the Positive and Negative Syndrome Scale, 21-item Hamilton Rating Scale for Depression, and Young Mania Rating Scale using pooled patient data from two clinical trials. Clinical Global Impression for Schizoaffective Disorder scores were then linked to corresponding Positive and Negative Syndrome Scale scores. Content validity was strong. Inter-rater agreement was good to excellent for most scales and subscales (intra-class correlation coefficient ≥ 0.50). Test-retest showed good reproducibility, with intra-class correlation coefficients ranging from 0.444 to 0.898. Spearman correlations between Clinical Global Impression for Schizoaffective Disorder domains and corresponding symptom scales were 0.60 or greater, and effect sizes for Clinical Global Impression for Schizoaffective Disorder overall and domain scores were similar to Positive and Negative Syndrome Scale, Young Mania Rating Scale, and 21-item Hamilton Rating Scale for Depression scores. Raters anticipated that the scale might be less effective in distinguishing negative from depressive symptoms, and, in fact, the results here may reflect that clinical reality. Multiple lines of evidence support the

  2. Writing, Evaluating and Assessing Data Response Items in Economics.

    Science.gov (United States)

    Trotman-Dickenson, D. I.

    1989-01-01

    Describes some of the problems in writing data response items in economics for use by A Level and General Certificate of Secondary Education (GCSE) students. Examines the experience of two series of workshops on writing items, evaluating them and assessing responses from schools. Offers suggestions for producing packages of data response items as…

  3. Three Modeling Applications to Promote Automatic Item Generation for Examinations in Dentistry.

    Science.gov (United States)

    Lai, Hollis; Gierl, Mark J; Byrne, B Ellen; Spielman, Andrew I; Waldschmidt, David M

    2016-03-01

    Test items created for dentistry examinations are often individually written by content experts. This approach to item development is expensive because it requires the time and effort of many content experts but yields relatively few items. The aim of this study was to describe and illustrate how items can be generated using a systematic approach. Automatic item generation (AIG) is an alternative method that allows a small number of content experts to produce large numbers of items by integrating their domain expertise with computer technology. This article describes and illustrates how three modeling approaches to item content-item cloning, cognitive modeling, and image-anchored modeling-can be used to generate large numbers of multiple-choice test items for examinations in dentistry. Test items can be generated by combining the expertise of two content specialists with technology supported by AIG. A total of 5,467 new items were created during this study. From substitution of item content, to modeling appropriate responses based upon a cognitive model of correct responses, to generating items linked to specific graphical findings, AIG has the potential for meeting increasing demands for test items. Further, the methods described in this study can be generalized and applied to many other item types. Future research applications for AIG in dental education are discussed.

  4. Separating relational from item load effects in paired recognition: temporoparietal and middle frontal gyral activity with increased associates, but not items during encoding and retention.

    Science.gov (United States)

    Phillips, Steven; Niki, Kazuhisa

    2002-10-01

    Working memory is affected by items stored and the relations between them. However, separating these factors has been difficult, because increased items usually accompany increased associations/relations. Hence, some have argued, relational effects are reducible to item effects. We overcome this problem by manipulating index length: the fewest number of item positions at which there is a unique item, or tuple of items (if length >1), for every instance in the relational (memory) set. Longer indexes imply greater similarity (number of shared items) between instances and higher load on encoding processes. Subjects were given lists of study pairs and asked to make a recognition judgement. The number of unique items and index length in the three list conditions were: (1) AB, CD: four/one; (2) AB, CD, EF: six/one; and (3) AB, AD, CB: four/two, respectively. Japanese letters were used in Experiments 1 (kanji-ideograms) and 2 (hiragana-phonograms); numbers in Experiment 3; and shapes generated from Fourier descriptors in Experiment 4. Across all materials, right dominant temporoparietal and middle frontal gyral activity was found with increased index length, but not items during study. In Experiment 5, a longer delay was used to isolate retention effects in the absence of visual stimuli. Increased left hemispheric activity was observed in the precuneus, middle frontal gyrus, and superior temporal gyrus with increased index length for the delay period. These results show that relational load is not reducible to item load.
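
    The definition of index length given in this abstract can be checked against its three list conditions with a short brute-force search. This is a minimal sketch based on one reading of the abstract's definition (the smallest set of positions whose contents identify every pair); the function names are hypothetical and nothing here comes from the paper's own materials.

```python
from itertools import combinations

def index_length(instances):
    """Smallest number of positions whose values jointly identify every instance."""
    arity = len(instances[0])
    for k in range(1, arity + 1):
        for positions in combinations(range(arity), k):
            keys = [tuple(inst[p] for p in positions) for inst in instances]
            if len(set(keys)) == len(keys):
                return k
    return None  # duplicate instances: no choice of positions separates them

def unique_items(instances):
    """Number of distinct items appearing anywhere in the study list."""
    return len({item for inst in instances for item in inst})

conditions = {
    "(1)": ["AB", "CD"],
    "(2)": ["AB", "CD", "EF"],
    "(3)": ["AB", "AD", "CB"],
}
for name, pairs in conditions.items():
    print(name, unique_items(pairs), index_length(pairs))
# (1) 4 1
# (2) 6 1
# (3) 4 2
```

    The output reproduces the four/one, six/one and four/two values listed in the abstract: conditions (1) and (3) hold the item count fixed while index length changes, and conditions (1) and (2) do the reverse.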

  5. Rats Remember Items in Context Using Episodic Memory.

    Science.gov (United States)

    Panoz-Brown, Danielle; Corbin, Hannah E; Dalecki, Stefan J; Gentry, Meredith; Brotheridge, Sydney; Sluka, Christina M; Wu, Jie-En; Crystal, Jonathon D

    2016-10-24

    Vivid episodic memories in people have been characterized as the replay of unique events in sequential order [1-3]. Animal models of episodic memory have successfully documented episodic memory of a single event (e.g., [4-8]). However, a fundamental feature of episodic memory in people is that it involves multiple events, and notably, episodic memory impairments in human diseases are not limited to a single event. Critically, it is not known whether animals remember many unique events using episodic memory. Here, we show that rats remember many unique events and the contexts in which the events occurred using episodic memory. We used an olfactory memory assessment in which new (but not old) odors were rewarded using 32 items. Rats were presented with 16 odors in one context and the same odors in a second context. To attain high accuracy, the rats needed to remember item in context because each odor was rewarded as a new item in each context. The demands on item-in-context memory were varied by assessing memory with 2, 3, 5, or 15 unpredictable transitions between contexts, and item-in-context memory survived a 45 min retention interval challenge. When the memory of item in context was put in conflict with non-episodic familiarity cues, rats relied on item in context using episodic memory. Our findings suggest that rats remember multiple unique events and the contexts in which these events occurred using episodic memory and support the view that rats may be used to model fundamental aspects of human cognition. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. The effect of non-recurring items on analysts’ earnings forecasts

    Directory of Open Access Journals (Sweden)

    Nan Li

    2018-03-01

    This article discusses the effects of non-recurring profits and losses on statement users’ decision-making processes from the perspective of securities analysts. We examine the relationship between analysts’ forecast revisions and firms’ non-recurring earnings. We find that (1) non-recurring gains and losses can influence analysts’ earnings forecast revisions; (2) compared with non-recurring items resulting from policy changes, analysts are more concerned about those attributed to changes in business scope; (3) if listed companies use non-recurring items to turn losses into gains during earnings management, this weakens the effects of non-recurring items on analysts’ earnings forecast revisions. The results suggest that non-recurring items that result from changes in business scope incorporate information that users need for the future operation of the business. This article verifies the information relevance of non-recurring items and provides evidence for the necessity of non-recurring item disclosure. Keywords: Non-recurring items, Earnings forecasts, Revisions

  7. Mediate gamma radiation effects on some packaged food items

    International Nuclear Information System (INIS)

    Inamura, Patricia Y.; Uehara, Vanessa B.; Teixeira, Christian A.H.M.; Mastro, Nelida L. del

    2012-01-01

    For most prepackaged foods a 10 kGy radiation dose is considered the maximum dose needed; however, the commercially available and practically accepted packaging materials must be suitable for such application. This work describes the application of ionizing radiation on several packaged food items, using 5 dehydrated food items, 5 ready-to-eat meals and 5 ready-to-eat food items irradiated in a 60Co gamma source with a 3 kGy dose. The quality evaluation of the irradiated samples was performed 2 and 8 months after irradiation. Microbiological analysis (bacteria, fungus and yeast load) was performed. Sensory characteristics were also established for appearance, aroma, texture and flavor attributes. From these data, the acceptability of all irradiated items was obtained. All ready-to-eat food items assayed, such as manioc flour, some pâtés and blocks of raw brown sugar, and most ready-to-eat meals, such as sausages and chicken with legumes, were considered acceptable for microbial and sensory characteristics. On the other hand, the dehydrated food items chosen for this study, such as dehydrated bacon potatoes or pea soups, were not accepted by the sensory analysis. A careful dose choice and special irradiation conditions must be used in order to achieve the sensory acceptability needed for the commercialization of specific irradiated food items. - Highlights: ► We applied gamma radiation on several kinds of packaged food items. ► Microbiological and sensory analyses were performed 2 and 8 months after irradiation. ► All ready-to-eat food items assayed were approved for microbial and sensory characteristics. ► Most ready-to-eat meals like sausages and chicken with legumes were also acceptable. ► Dehydrated bacon potatoes or pea soups were considered not acceptable.

  8. 41 CFR 101-27.209-1 - GSA stock items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true GSA stock items. 101-27.209-1 Section 101-27.209-1 Public Contracts and Property Management Federal Property Management...-Management of Shelf-Life Materials § 101-27.209-1 GSA stock items. Shelf-life items that meet the criteria...

  9. 12 CFR 210.8 - Presenting noncash items for acceptance.

    Science.gov (United States)

    2010-01-01

    ... for acceptance. (a) A Reserve Bank or a subsequent collecting bank may, if instructed by the sender, present a noncash item for acceptance in any manner authorized by law if— (1) The item provides that it... 12 Banks and Banking 2 2010-01-01 2010-01-01 false Presenting noncash items for acceptance. 210.8...

  10. Brief Sensation Seeking Scale: Latent structure of 8-item and 4-item versions in Peruvian adolescents.

    Science.gov (United States)

    Merino-Soto, Cesar; Salas Blas, Edwin

    2018-01-01

    This research sought to validate two brief sensation seeking scales with Peruvian adolescents: the eight-item scale (BSSS8; Hoyle, Stephenson, Palmgreen, Lorch, & Donohew, 2002) and the four-item scale (BSSS4; Stephenson, Hoyle, Slater, & Palmgreen, 2003). Questionnaires were administered to 618 voluntary participants, with an average age of 13.6 years, from different levels of high school, in state and private schools in a district in the south of Lima. The internal structure of both short versions was analyzed using three models: a) unidimensional (M1), b) oblique or related dimensions (M2), and c) the bifactor model (M3). Results show that both instruments have a single dimension which best represents the variability of the items; a fact that can be explained both by the complexity of the concept and by the small number of items representing each factor, which is more noticeable in the BSSS4. Reliability is within levels found by previous studies: alpha coefficient: .745 in BSSS8 and .643 in BSSS4; omega coefficient: .747 in BSSS8 and .651 in BSSS4. These are considered suitable for the type of instruments studied. Based on the correlation between the two instruments, it was found that there are satisfactory levels of equivalence between the BSSS8 and BSSS4. However, it is recommended that the BSSS4 is mainly used for research and for the purpose of describing populations.
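
    For readers unfamiliar with the alpha coefficient reported above, the standard Cronbach's alpha formula can be sketched as follows. The function name and the response data are hypothetical, made up purely for illustration; they are not the study's data and will not reproduce its coefficients.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of four people to a four-item scale (illustration only).
toy_responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 3],
    [4, 3, 4, 4],
    [5, 5, 4, 5],
]
print(round(cronbach_alpha(toy_responses), 3))
```

    With the same average inter-item correlation, alpha rises with the number of items k, which is one general reason a four-item scale such as the BSSS4 tends to show a lower alpha than its eight-item counterpart.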

  11. Item response theory analysis of the mechanics baseline test

    Science.gov (United States)

    Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

    2012-02-01

    Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
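
    The role of the two item parameters named in this abstract can be illustrated with the two-parameter logistic (2PL) model. The abstract does not say which item response model the authors fit, so the 2PL form is an assumption here, and the parameter values are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL model: probability that a student of skill theta answers correctly.

    a is the item's discrimination, b its difficulty (same scale as theta).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items of equal difficulty but different discrimination.
sharp_item = {"a": 2.0, "b": 0.0}
flat_item = {"a": 0.2, "b": 0.0}

for theta in (-1.0, 0.0, 1.0):
    print(theta,
          round(p_correct(theta, **sharp_item), 3),
          round(p_correct(theta, **flat_item), 3))
```

    For the low-discrimination item the probability of a correct answer barely changes between weaker and stronger students, which is the sense in which such a question is "not effective in discriminating" between students of different abilities.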

  12. Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.

    Science.gov (United States)

    Eichenbaum, Alexander E; Marcus, David K; French, Brian F

    2017-06-01

    This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.

  13. Structure and validity of sluggish cognitive tempo using an expanded item pool in children with attention-deficit/hyperactivity disorder.

    Science.gov (United States)

    McBurnett, Keith; Villodas, Miguel; Burns, G Leonard; Hinshaw, Stephen P; Beaulieu, Allyson; Pfiffner, Linda J

    2014-01-01

    We evaluated the latent structure and validity of an expanded pool of Sluggish Cognitive Tempo (SCT) items. An experimental rating scale with 44 candidate SCT items was administered to parents and teachers of 165 children in grades 2-5 (ages 7-11) recruited for a randomized clinical trial of a psychosocial intervention for Attention-Deficit/Hyperactivity Disorder, Predominantly Inattentive Type. Exploratory factor analyses (EFA) were used to extract items with high loadings (>0.59) on primary factors of SCT and low cross-loadings (0.30 or lower) on other SCT factors and on the Inattention factor of ADHD. Items were required to meet these criteria for both informants. This procedure reduced the pool to 15 items. Generally, items representing slowness and low initiative failed these criteria. SCT factors (termed Daydreaming, Working Memory Problems, and Sleepy/Tired) showed good convergent and discriminant validity in EFA and in a confirmatory model with ADHD factors. Simultaneous regressions of impairment and comorbidity on SCT and ADHD factors found that Daydreams was associated with global impairment, and Sleepy/Tired was associated with organizational problems and depression ratings, across both informants. For teachers, Daydreams also predicted ODD (inversely); Sleepy/Tired also predicted poor academic behavior, low social skills, and problem social behavior; and Working Memory Problems predicted organizational problems and anxiety. When depression, rather than ADHD, was included among the predictors, the only SCT-related associations rendered insignificant were the teacher-reported associations of Daydreams with ODD; Working Memory Problems with anxiety, and Sleepy/Tired with poor social skills. SCT appears to be meaningfully associated with impairment, even when controlling for depression. Common behaviors resembling Working Memory problems may represent a previously undescribed factor of SCT.

  14. Language barriers in Hispanic patients: relation to upper-extremity disability.

    Science.gov (United States)

    Menendez, Mariano E; Eberlin, Kyle R; Mudgal, Chaitanya S; Ring, David

    2015-06-01

    Although upper-extremity disability has been shown to correlate highly with various psychosocial aspects of illness (e.g., self-efficacy, depression, kinesiophobia, and pain catastrophizing), the role of language in musculoskeletal health status is less certain. In an English-speaking outpatient hand surgery office setting, we sought to determine (1) whether a patient's primary native language (English or Spanish) is an independent predictor of upper-extremity disability and (2) whether there are any differences in the contribution of measures of psychological distress to disability between native English- and Spanish-speaking patients. A total of 122 patients (61 native English speakers and 61 Spanish speakers) presenting to an orthopaedic hand clinic completed sociodemographic information and three Patient-Reported Outcomes Measurement Information System (PROMIS)-based computerized adaptive testing questionnaires: PROMIS Pain Interference, PROMIS Depression, and PROMIS Upper-Extremity Physical Function. Bivariate and multivariable linear regression modeling were performed. Spanish-speaking patients reported greater upper-extremity disability, pain interference, and symptoms of depression than English-speaking patients. After adjusting for sociodemographic covariates and measures of psychological distress using multivariable regression modeling, the patient's primary language was not retained as an independent predictor of disability. PROMIS Depression showed a medium correlation (r = -0.35; p … Spanish-speaking patients. PROMIS Pain Interference had a large correlation with disability in both patient cohorts (Spanish-speaking: r = -0.66; p … immigration to the USA did not correlate with disability among Spanish speakers. Primary language has less influence on symptom intensity and magnitude of disability than psychological distress and ineffective coping strategies. Interventions to optimize mood and to reduce pain interference should be considered in

  15. Law in Transition Biblioessay: Globalization, Human Rights, Environment, Technology

    Directory of Open Access Journals (Sweden)

    Michael Marien

    2012-04-01

    As globalization continues, many transformations in international and domestic laws are underway or called for. There are too many laws and too few, too much law that is inadequate or obsolete, and too much law-breaking. This biblioessay covers some 100 recent books, nearly all recently published, arranged in four categories. (1) International Law includes six overviews/textbooks on comparative law, laws related to warfare and security, pushback against demands of globalization, and gender perspectives; (2) Human Rights encompasses general overviews and normative visions, several books on how some states violate human rights, five items on how good laws can end poverty and promote prosperity, and laws regulating working conditions and health rights; (3) Environment/Resources covers growth of international environmental law, visions of law for a better environmental future, laws to govern genetic resources and increasingly stressed water resources, two books on prospects for climate change liability, and items on toxic hazards and problems of compliance; (4) Technology, Etc. identifies eight books on global crime and the failed war on drugs, books on the response to terrorism and guarding privacy and mobility in our high-tech age, seven books on how infotech is changing law and legal processes while raising intellectual property questions, biomedical technologies and the law, and general views on the need for updated laws and constitutions. In sum, this essay suggests the need for deeper and timely analysis of the many books on changes in law.

  16. 26 CFR 301.6231(a)(3)-1 - Partnership items.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 18 2010-04-01 2010-04-01 false Partnership items. 301.6231(a)(3)-1 Section 301... Partnership items. (a) In general. For purposes of subtitle F of the Internal Revenue Code of 1954, the following items which are required to be taken into account for the taxable year of a partnership under...

  17. Item Response Theory Models for Performance Decline during Testing

    Science.gov (United States)

    Jin, Kuan-Yu; Wang, Wen-Chung

    2014-01-01

    Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

  18. Identifying predictors of physics item difficulty: A linear regression approach

    Science.gov (United States)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge

  19. Identifying predictors of physics item difficulty: A linear regression approach

    Directory of Open Access Journals (Sweden)

    Hasnija Muratovic

    2011-06-01

    Full Text Available Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal

  20. Item response theory analysis of the Pain Self-Efficacy Questionnaire.

    Science.gov (United States)

    Costa, Daniel S J; Asghari, Ali; Nicholas, Michael K

    2017-01-01

    The Pain Self-Efficacy Questionnaire (PSEQ) is a 10-item instrument designed to assess the extent to which a person in pain believes s/he is able to accomplish various activities despite their pain. There is strong evidence for the validity and reliability of both the full-length PSEQ and a 2-item version. The purpose of this study is to further examine the properties of the PSEQ using an item response theory (IRT) approach. We used the two-parameter graded response model to examine the category probability curves, and location and discrimination parameters of the 10 PSEQ items. In item response theory, responses to a set of items are assumed to be probabilistically determined by a latent (unobserved) variable. In the graded-response model specifically, item response threshold (the value of the latent variable for which adjacent response categories are equally likely) and discrimination parameters are estimated for each item. Participants were 1511 mixed, chronic pain patients attending for initial assessment at a tertiary pain management centre. All items except item 7 ('I can cope with my pain without medication') performed well in IRT analysis, and the category probability curves suggested that participants used the 7-point response scale consistently. Items 6 ('I can still do many of the things I enjoy doing, such as hobbies or leisure activity, despite pain'), 8 ('I can still accomplish most of my goals in life, despite the pain') and 9 ('I can live a normal lifestyle, despite the pain') captured higher levels of the latent variable with greater precision. The results from this IRT analysis add to the body of evidence based on classical test theory illustrating the strong psychometric properties of the PSEQ. Despite the relatively poor performance of Item 7, its clinical utility warrants its retention in the questionnaire. The strong psychometric properties of the PSEQ support its use as an effective tool for assessing self-efficacy in people with pain
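
    The graded response model described in this record can be illustrated with a minimal sketch. It assumes the usual cumulative-logit formulation, in which each threshold is the latent value at which the probability of responding in or above a category is 0.5. The function name and all parameter values are hypothetical and are not taken from the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta: latent trait value
    a: item discrimination (hypothetical value supplied by the caller)
    thresholds: ordered list of K-1 thresholds for K response categories
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1 (logistic in theta)
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0
    bounds = [1.0] + cum + [0.0]
    # Each category probability is the difference of adjacent cumulatives
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Illustrative 7-point item: six made-up thresholds, made-up discrimination
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-2, -1, 0, 1, 2, 3])
```

    With a 7-point response scale there are six thresholds per item; plotting these probabilities against theta gives the category probability curves mentioned in the abstract.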

  1. The CERN Global Network opens its doors to companies

    CERN Multimedia

    Francesco Poppi

    2010-01-01

    Six months after its launch, the CERN Global Network already has almost one thousand members. Today, it is opening its doors to companies from CERN's Member States. This will open up a variety of new professional and career opportunities to all the members and will enhance the networking capabilities of all parties involved.   Screenshot of the CERN Global Network website. A new item has recently appeared on the top menu of the Network's website: “Organisations”. This is the entry point for companies and, later, research institutes, wishing to join. “The CERN Global Network brings together hundreds of people who have worked at or with CERN and who have a wealth of skills and expertise. Thanks to the Network, the job opportunities made available by the companies will become visible to the wider community,” says Linda Orr-Easo, a member of the Knowledge and Technology Transfer Group and the CERN Global Network Manager. In addition to creating new career opp...

  2. Movement of global warming issues

    International Nuclear Information System (INIS)

    Sugiyama, Taishi

    2015-01-01

    This paper summarizes the report of the IPCC (Intergovernmental Panel on Climate Change) and the movement of global warming issues as seen from the Conference of the Parties (COP) to the United Nations Framework Convention on Climate Change and from policy discussions in Japan. From the Fifth Assessment Report published by the IPCC, it presents the following items: (1) increasing trends in greenhouse gas emissions between 1970 and 2010, (2) trends in the world's greenhouse gas emissions by income segment, and (3) a factor analysis of changes in greenhouse gas emissions. Next, it takes up the IPCC greenhouse gas emission scenarios, shows the scenarios by temperature rise pattern, and introduces the assumption of emission reduction due to BECCS. Regarding the 2 °C scenario that has become a hot topic in international negotiations, it describes the reasons why implementation is difficult. In addition, as international trends in global warming, it describes the agreement on numerical emission targets at COP3 (Kyoto Conference) and the subsequent movements. Finally, it introduces Japan's measures against global warming, as well as future movements. (A.O.)

  3. Suspect/Counterfeit Items Information Guide for Subcontractors/Suppliers

    Energy Technology Data Exchange (ETDEWEB)

    Tessmar, Nancy D. [Los Alamos National Laboratory; Salazar, Michael J. [Los Alamos National Laboratory

    2012-09-18

    Counterfeiting of industrial and commercial grade items is an international problem that places worker safety, program objectives, expensive equipment, and security at risk. In order to prevent the introduction of Suspect/Counterfeit Items (S/CI), this information sheet is being made available as a guide to assist in the implementation of S/CI awareness and controls, in conjunction with subcontractor's/supplier's quality assurance programs. When it comes to counterfeit goods, including industrial materials, items, and equipment, no market is immune. Some manufacturers have been known to misrepresent their products and intentionally use inferior materials and processes to manufacture substandard items, whose properties can significantly depart from established standards and specifications. These substandard items, termed S/CI by the Department of Energy (DOE), pose immediate and potential threats to the safety of DOE and contractor workers, the public, and the environment. Failure of certain systems and processes caused by an S/CI could also have national security implications at Los Alamos National Laboratory (LANL). Nuclear Safety Rules (federal laws), DOE Orders, and other regulations set forth requirements for DOE contractors to implement effective controls to assure that items and services meet specified requirements. This includes implementing techniques that minimize the potential threat of entry of S/CI to LANL. As a qualified supplier of goods or services to LANL, your company will be required to establish and maintain effective controls to prevent the introduction of S/CI to LANL. This will require that your company warrant that all items (including their subassemblies, components, and parts) sold to LANL are genuine (i.e. not counterfeit), new, and unused, and conform to the requirements of the LANL purchase orders/contracts unless otherwise approved in writing to the Los Alamos National Security (LANS) contract administrator

  4. Combining item and bulk material loss-detection uncertainties

    International Nuclear Information System (INIS)

    Eggers, R.F.

    1982-01-01

    Loss detection requirements, such as five formula kilograms with 99% probability of detection, which apply to the sum of losses from material in both item and bulk form, constitute a special problem for the nuclear material statistician. Requirements of this type are included in the Material Control and Accounting Reform Amendments described in the Advance Notice of Proposed Rule Making (Federal Register, 46(175):45144-46151). Attribute test sampling of items is the method used to detect gross defects in the inventory of items in a given control unit. Attribute sampling plans are designed to detect a loss of a specified goal quantity of material with a given probability. In contrast to the methods and statistical models used for item loss detection, bulk material loss detection requires that all the material entering and leaving a control unit be measured and that a loss estimator be calculated and tested against an appropriate alarm threshold. The alarm threshold is determined from an estimate of the error inherent in the components of the loss estimator. In this paper a simple graphical method of evaluating the combined capabilities of bulk material loss detection methods and item attribute testing procedures will be described. Quantitative results will be given for several cases, indicating how a decrease in the precision of the item loss detection method tends to force an increase in the precision of the bulk loss detection procedure in order to meet the overall detection requirement. 4 figures
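
    The idea of an attribute sampling plan that detects a goal quantity with a given probability can be sketched as follows. This is a textbook hypergeometric calculation offered as an assumption, not the paper's own method: it supposes a control unit of N items of which D have gross defects, and that inspecting a defective item always reveals the defect. Function names and the numbers in the example are hypothetical.

```python
import math

def detection_probability(n_items, n_defects, sample_size):
    """P(at least one defective item appears in a random sample drawn
    without replacement), from the hypergeometric distribution."""
    # math.comb returns 0 when the sample cannot avoid every defect
    miss = math.comb(n_items - n_defects, sample_size) / math.comb(n_items, sample_size)
    return 1.0 - miss

def min_sample_size(n_items, n_defects, target):
    """Smallest sample size whose detection probability reaches target."""
    for n in range(n_items + 1):
        if detection_probability(n_items, n_defects, n) >= target:
            return n
    return n_items

# Illustrative numbers only: 200 items, 10 of them defective, 99% target
n_needed = min_sample_size(200, 10, 0.99)
```

    In this simplified picture, the number of defects D would be the number of items that must be falsified to make up the goal quantity; fewer defects require a larger sample for the same detection probability.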

  5. Detection of person misfit in computerized adaptive tests with polytomous items

    NARCIS (Netherlands)

    van Krimpen-Stoop, Edith; Meijer, R.R.

    2000-01-01

    Item scores that do not fit an assumed item response theory model may cause the latent trait value to be estimated inaccurately. For computerized adaptive tests (CAT) with dichotomous items, several person-fit statistics for detecting nonfitting item score patterns have been proposed. Both for

  6. Binary classification of items of interest in a repeatable process

    Science.gov (United States)

    Abell, Jeffrey A.; Spicer, John Patrick; Wincek, Michael Anthony; Wang, Hui; Chakraborty, Debejyo

    2014-06-24

    A system includes host and learning machines in electrical communication with sensors positioned with respect to an item of interest, e.g., a weld, and memory. The host executes instructions from memory to predict a binary quality status of the item. The learning machine receives signals from the sensor(s), identifies candidate features, and extracts features from the candidates that are more predictive of the binary quality status relative to other candidate features. The learning machine maps the extracted features to a dimensional space that includes most of the items from a passing binary class and excludes all or most of the items from a failing binary class. The host also compares the received signals for a subsequent item of interest to the dimensional space to thereby predict, in real time, the binary quality status of the subsequent item of interest.

  7. The Dif Identification in Constructed Response Items Using Partial Credit Model

    Directory of Open Access Journals (Sweden)

    Heri Retnawati

    2017-10-01

    Full Text Available The study aimed to identify the presence, type, and significance of differential item functioning (DIF) in constructed response items using the partial credit model (PCM). The data were the instruments and the students’ responses to PISA-like test items completed by 386 ninth-grade students and 460 tenth-grade students, all about 15 years old, in the Province of Yogyakarta Special Region in Indonesia. Item characteristics were analyzed under the PCM using CONQUEST software, with students grouped by class. Using these item characteristics, the researcher then drew the category response function (CRF) graphs to identify whether the DIF was uniform or non-uniform. The significance of DIF was identified by comparing the discrepancy between the difficulty parameters with the error reported in the CONQUEST output. The results showed that, of the 18 items analyzed, 4 items contained no DIF, 5 items contained DIF that was not statistically significant, and 9 items contained statistically significant DIF. The causes of items containing DIF were discussed.

  8. Grouping of Items in Mobile Web Questionnaires

    Science.gov (United States)

    Mavletova, Aigul; Couper, Mick P.

    2016-01-01

    There is some evidence that a scrolling design may reduce breakoffs in mobile web surveys compared to a paging design, but there is little empirical evidence to guide the choice of the optimal number of items per page. We investigate the effect of the number of items presented on a page on data quality in two types of questionnaires: with or…

  9. Polytomous latent scales for the investigation of the ordering of items

    NARCIS (Netherlands)

    Ligtvoet, R.; van der Ark, L.A.; Bergsma, W. P.; Sijtsma, K.

    2011-01-01

    We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering

  10. Computerized adaptive testing item selection in computerized adaptive learning systems

    NARCIS (Netherlands)

    Eggen, Theodorus Johannes Hendrikus Maria; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item selection methods traditionally developed for computerized adaptive testing (CAT) are explored for their usefulness in item-based computerized adaptive learning (CAL) systems. While in CAT Fisher information-based selection is optimal, for recovering learning populations in CAL systems item

  11. Optimal item discrimination and maximum information for logistic IRT models

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn P.F.; Berger, Martijn

    1999-01-01

    Items with the highest discrimination parameter values in a logistic item response theory model do not necessarily give maximum information. This paper derives discrimination parameter values, as functions of the guessing parameter and distances between person parameters and item difficulty, that
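
    The claim that the most discriminating item need not be the most informative can be checked numerically. The sketch below assumes the standard three-parameter logistic item information function with scaling constant 1; the function name and the parameter values in the example are hypothetical and are not taken from the paper.

```python
import math

def info_3pl(theta, a, b, c=0.0):
    """Fisher information of a 3PL item at ability theta.

    a: discrimination, b: difficulty, c: guessing parameter.
    """
    p_star = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # 2PL part
    p = c + (1.0 - c) * p_star                          # 3PL success probability
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# Person located two units above the item difficulty, no guessing:
low_a = info_3pl(theta=2.0, a=1.0, b=0.0)   # moderately discriminating item
high_a = info_3pl(theta=2.0, a=3.0, b=0.0)  # highly discriminating item
```

    In this example the item with a = 1 carries more information than the item with a = 3, because the steeper item's response probability is already close to 1 at that distance from its difficulty.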

  12. Data Visualization of Item-Total Correlation by Median Smoothing

    Directory of Open Access Journals (Sweden)

    Chong Ho Yu

    2016-02-01

    Full Text Available This paper aims to illustrate how data visualization could be utilized to identify errors prior to modeling, using an example with multi-dimensional item response theory (MIRT). MIRT combines item response theory and factor analysis to identify a psychometric model that investigates two or more latent traits. While it may seem convenient to accomplish two tasks by employing one procedure, users should be cautious of problematic items that affect both factor analysis and IRT. When sample sizes are extremely large, reliability analyses can misidentify even random numbers as meaningful patterns. Data visualization, such as median smoothing, can be used to identify problematic items in preliminary data cleaning.

  13. Deliberate ambiguity in a finite environment: The urban ecology of artificial items

    Directory of Open Access Journals (Sweden)

    Abraham Akkerman

    2000-01-01

    Full Text Available A distinction is made between visual declaration and virtual usage of artificial items within a physical environment, such as a street. Visual declaration is a formal pictorial designation, or a function, e.g. “decoration,” of an item, such as a “planter.” Virtual usage refers to the item when it is used in lieu of another item. The formal designation, “sitting,” customarily designated to an item such as “bench,” could also be a virtual usage of the item “planter.” The question asked is, “What is the relationship between items, given their formal, visual declaration and their informal, virtual, usage?” An artificial item, according to its visual declaration, is referred to as a ‘visual’ or ‘real item’. Each visual item has the property of being used as another item by virtue of its undeclared usage. Depending on the item's design and configuration, a visual item can be then substituted for another visual item. An artificial item, thus, attains deliberate ambiguity between its formal designation and its virtual usage. This ambiguity between visual declaration and virtual usage can be quantified. Within the full domain of n possible usages, this relationship can be conveniently presented in a nonnegative matrix. It is shown that the inverse of this matrix belongs to a class of well-known matrices. This being the case, the relationship between visual and virtual properties of items within the environment can be formalized. The formalization throws further light on the emerging opportunities in streetscape design.

  14. Three sides of the same coin: measuring global cognitive impairment with the MMSE, ADAS-cog and CAMCOG.

    Science.gov (United States)

    Wouters, Hans; van Gool, Willem A; Schmand, Ben; Zwinderman, Aeilko H; Lindeboom, Robert

    2010-08-01

    The total scores of the ADAS-cog, MMSE and CAMCOG, comprising various cognitive tasks, are widely used to measure a dimension of global cognitive impairment. It is unknown, however, whether this dimension is common to these instruments. This hampers comparisons when either of these instruments is used. The extent to which these instruments share a common dimension of global cognitive impairment and how their scores relate was examined. Rasch analysis was applied to CAMCOG and MMSE data of participants from a population based study and two memory clinics, pooled with ADAS-cog and MMSE data of participants from three RCTs (overall N = 1566), to estimate a common dimension of global cognitive impairment and to examine the goodness of fit of the individual items to this dimension. Using the estimated common dimension of global cognitive impairment, the total scores of the instruments could be related, e.g. a mean level of global cognitive impairment corresponded to a predicted score of 11.4 (ADAS-cog), 72.6 (CAMCOG) and 22.2 (MMSE). When revised according to the Rasch validity analyses, every individual item could be fitted to the dimension. The MMSE, ADAS-cog and CAMCOG reflect a valid common dimension of global cognitive impairment, which enables comparisons of RCTs that use the ADAS-cog and observational studies that use the CAMCOG and MMSE.

  15. Comparing Two Versions of the MEOCS Using Differential Item Functioning

    National Research Council Canada - National Science Library

    Truhon, Stephen

    2003-01-01

    ...) from item response theory (IRT). DIF was found for the majority of the 40 items examined, although in many cases the DIF indicated improvements in the revised items. Implications for these scales and for the use of IRT with the MEOCS are discussed.

  16. A simple and fast item selection procedure for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn; Berger, Martijn P.F.

    1994-01-01

    Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item

  17. Comparison of Alternate and Original Items on the Montreal Cognitive Assessment.

    Science.gov (United States)

    Lebedeva, Elena; Huang, Mei; Koski, Lisa

    2016-03-01

    The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. None of the five items from the alternate versions matched the difficulty level of their corresponding original items. This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time.

  18. Bayes factor covariance testing in item response models

    NARCIS (Netherlands)

    Fox, J.P.; Mulder, J.; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  19. Bayes Factor Covariance Testing in Item Response Models

    NARCIS (Netherlands)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  20. 26 CFR 301.6222(a)-1 - Consistent treatment of partnership items.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 18 2010-04-01 2010-04-01 false Consistent treatment of partnership items. 301... Consistent treatment of partnership items. (a) In general. The treatment of a partnership item on the partner's return must be consistent with the treatment of that item by the partnership on the partnership...

  1. Analyzing force concept inventory with item response theory

    Science.gov (United States)

    Wang, Jing; Bao, Lei

    2010-10-01

    Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.

  2. Criteria for eliminating items of a Test of Figural Analogies

    Directory of Open Access Journals (Sweden)

    Diego Blum

    2013-12-01

    Full Text Available This paper describes the steps taken to eliminate two of the items in a Test of Figural Analogies (TFA). The main guidelines of psychometric analysis concerning Classical Test Theory (CTT) and Item Response Theory (IRT) are explained. The item elimination process was based on both the study of the CTT difficulty and discrimination index, and the unidimensionality analysis. The a, b, and c parameters of the Three Parameter Logistic Model of IRT were also considered for this purpose, as well as the assessment of each item fitting this model. The unfavourable characteristics of a group of TFA items are detailed, and decisions leading to their possible elimination are discussed.

  3. Combined Versus Detailed Evaluation Components in Medical Student Global Rating Indexes

    Directory of Open Access Journals (Sweden)

    Kim L. Askew

    2015-11-01

    Full Text Available Introduction: To determine if there is any correlation between any of the 10 individual components of a global rating index on an emergency medicine (EM) student clerkship evaluation form. If there is correlation, to determine if a weighted average of highly correlated components loses predictive value for the final clerkship grade. Methods: This study reviewed medical student evaluations collected over two years of a required fourth-year rotation in EM. Evaluation cards, comprised of a detailed 10-part evaluation, were completed after each shift. We used a correlation matrix between evaluation category average scores, using Spearman’s rho, to determine if there was any correlation of the grades between any of the 10 items on the evaluation form. Results: A total of 233 students completed the rotation over the two-year period of the study. There were strong correlations (>0.80) between assessment components of medical knowledge, history taking, physical exam, and differential diagnosis. There were also strong correlations between assessment components of team rapport, patient rapport, and motivation. When these highly correlated components were combined to produce a four-component model, linear regression demonstrated similar predictive power in terms of final clerkship grade (R² = 0.71, 95% CI 0.65–0.77 and R² = 0.69, 95% CI 0.63–0.76 for the full and reduced models, respectively). Conclusion: This study revealed that several components of the evaluation card had a high degree of correlation. Combining the correlated items, a reduced model containing four items (clinical skills, interpersonal skills, procedural skills, and documentation) was as predictive of the student’s clinical grade as the full 10-item evaluation. Clerkship directors should be aware of the performance of their individual global rating scales when assessing medical student performance, especially if attempting to measure more than four components.

  4. Validating and Determining the Weight of Items Used for Evaluating Clinical Governance Implementation Based on Analytic Hierarchy Process Model

    Directory of Open Access Journals (Sweden)

    Elaheh Hooshmand

    2015-10-01

    Full Text Available Background: The purpose of implementing a system such as Clinical Governance (CG) is to integrate, establish and globalize distinct policies in order to improve quality through increasing professional knowledge and the accountability of healthcare professionals toward providing clinical excellence. Since CG is related to change, and change requires money and time, CG implementation has to be focused on priority areas that are in more dire need of change. The purpose of the present study was to validate and determine the significance of items used for evaluating CG implementation. Methods: The present study was descriptive-quantitative in method and design. Items used for evaluating CG implementation were first validated by the Delphi method and then compared with one another and ranked based on the Analytical Hierarchy Process (AHP) model. Results: The items that were validated for evaluating CG implementation in Iran include performance evaluation, training and development, personnel motivation, clinical audit, clinical effectiveness, risk management, resource allocation, policies and strategies, external audit, information system management, research and development, CG structure, implementation prerequisites, the management of patients’ non-medical needs, complaints and patients’ participation in the treatment process. The most important items based on their degree of significance were training and development, performance evaluation, and risk management. The least important items included the management of patients’ non-medical needs, patients’ participation in the treatment process and research and development. Conclusion: The fundamental requirements of CG implementation included having an effective policy at national level, avoiding perfectionism, using the expertise and potentials of the entire country and the coordination of this model with other models of quality improvement such as accreditation and patient safety.
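
    How an AHP ranking turns pairwise comparisons into item weights can be sketched briefly. The abstract does not report the comparison matrices, so the judgments below are invented for illustration, and the row geometric-mean method used here is a common approximation to the principal-eigenvector weights rather than the procedure confirmed by the authors. The function name is hypothetical.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix using the row geometric-mean method."""
    n = len(matrix)
    # Geometric mean of each row, then normalise so the weights sum to 1
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical judgments for three of the validated items:
# training and development, performance evaluation, risk management.
# Entry [i][j] states how many times more important item i is than item j.
comparisons = [
    [1.0, 2.0, 3.0],
    [1 / 2.0, 1.0, 2.0],
    [1 / 3.0, 1 / 2.0, 1.0],
]
weights = ahp_weights(comparisons)
```

    Sorting items by these weights gives the kind of importance ranking reported in the results; a full analysis would also check the consistency of the judgments.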

  5. Validating and determining the weight of items used for evaluating clinical governance implementation based on analytic hierarchy process model.

    Science.gov (United States)

    Hooshmand, Elaheh; Tourani, Sogand; Ravaghi, Hamid; Vafaee Najar, Ali; Meraji, Marziye; Ebrahimipour, Hossein

    2015-04-08

    The purpose of implementing a system such as Clinical Governance (CG) is to integrate, establish and globalize distinct policies in order to improve quality through increasing professional knowledge and the accountability of healthcare professionals toward providing clinical excellence. Since CG is related to change, and change requires money and time, CG implementation has to be focused on priority areas that are in more dire need of change. The purpose of the present study was to validate and determine the significance of items used for evaluating CG implementation. The present study was descriptive-quantitative in method and design. Items used for evaluating CG implementation were first validated by the Delphi method and then compared with one another and ranked based on the Analytical Hierarchy Process (AHP) model. The items that were validated for evaluating CG implementation in Iran include performance evaluation, training and development, personnel motivation, clinical audit, clinical effectiveness, risk management, resource allocation, policies and strategies, external audit, information system management, research and development, CG structure, implementation prerequisites, the management of patients' non-medical needs, complaints and patients' participation in the treatment process. The most important items based on their degree of significance were training and development, performance evaluation, and risk management. The least important items included the management of patients' non-medical needs, patients' participation in the treatment process and research and development. The fundamental requirements of CG implementation included having an effective policy at national level, avoiding perfectionism, using the expertise and potentials of the entire country and the coordination of this model with other models of quality improvement such as accreditation and patient safety. © 2015 by Kerman University of Medical Sciences.
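
    The AHP ranking step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the common principal-eigenvector method for deriving weights; the abstract does not state which variant the authors used. The function name `ahp_weights`, the 3x3 comparison matrix and the judgments in it are hypothetical and invented for illustration, not taken from the study.

```python
import numpy as np

def ahp_weights(pairwise):
    """Derive priority weights from a reciprocal pairwise-comparison matrix.

    Returns (weights, consistency_index). Weights are the normalized
    principal eigenvector; the consistency index is (lambda_max - n) / (n - 1).
    """
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = int(np.argmax(eigvals.real))          # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)            # principal eigenvector, sign removed
    w = w / w.sum()                           # normalize so the weights sum to 1
    lambda_max = eigvals[k].real
    consistency_index = (lambda_max - n) / (n - 1)
    return w, consistency_index

# Hypothetical judgments: entry [i][j] says how much more important
# item i is than item j (values invented for illustration only).
items = ["training and development", "performance evaluation", "risk management"]
comparisons = [
    [1.0,  2.0, 4.0],
    [0.5,  1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights, ci = ahp_weights(comparisons)
ranking = [items[i] for i in np.argsort(-weights)]
```

    Because these invented judgments are perfectly consistent, the weights come out to 4/7, 2/7 and 1/7 and the consistency index is zero; real expert judgments would typically give a positive index.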

  6. Detection of Uniform and Nonuniform Differential Item Functioning by Item-Focused Trees

    Science.gov (United States)

    Berger, Moritz; Tutz, Gerhard

    2016-01-01

    Detection of differential item functioning (DIF) by use of the logistic modeling approach has a long tradition. One big advantage of the approach is that it can be used to investigate nonuniform (NUDIF) as well as uniform DIF (UDIF). The classical approach allows one to detect DIF by distinguishing between multiple groups. We propose an…

  7. An improved non-Markovian degradation model with long-term dependency and item-to-item uncertainty

    Science.gov (United States)

    Xi, Xiaopeng; Chen, Maoyin; Zhang, Hanwen; Zhou, Donghua

    2018-05-01

    It is widely noted in the literature that degradation should be simplified into a memoryless Markovian process for the purpose of predicting the remaining useful life (RUL). However, long-term dependency does exist in the degradation processes of some industrial systems, including electromechanical equipment, oil tankers, and large blast furnaces. This implies that the new degradation state depends not only on the current state but also on the historical states. Such dynamic systems cannot be accurately described by traditional Markovian models. Here we present an improved non-Markovian degradation model with both the long-term dependency and the item-to-item uncertainty. As a typical non-stationary process with dependent increments, fractional Brownian motion (FBM) is utilized to simulate the fractal diffusion of practical degradations. The uncertainty among multiple items can be represented by a random variable of the drift. Based on this model, the unknown parameters are estimated through the maximum likelihood (ML) algorithm, while a closed-form solution to the RUL distribution is further derived using a weak convergence theorem. The practicability of the proposed model is fully verified by two real-world examples. The results demonstrate that the proposed method can effectively reduce the prediction error.

  8. Mathematical-programming approaches to test item pool design

    NARCIS (Netherlands)

    Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

    2002-01-01

    This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming

  9. 17 CFR 229.1010 - (Item 1010) Financial statements.

    Science.gov (United States)

    2010-04-01

    ....1010 (Item 1010) Financial statements. (a) Financial information. Furnish the following financial information: (1) Audited financial statements for the two fiscal years required to be filed with the company's... 17 Commodity and Securities Exchanges 2 2010-04-01 2010-04-01 false (Item 1010) Financial...

  10. Procedures for Selecting Items for Computerized Adaptive Tests.

    Science.gov (United States)

    Kingsbury, G. Gage; Zara, Anthony R.

    1989-01-01

    Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)

  11. Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

    Science.gov (United States)

    Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

    2018-02-01

    Objectives: (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design: IRT evaluation of prospective cohort data. Setting: Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods: Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results: The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items; these were removed in stages, creating 8- and 3-item Inner EAR scales for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion: The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.

  12. QA in the procurement of items and services

    International Nuclear Information System (INIS)

    Wilhelm, H.

    1980-01-01

    Procurement of items and services is one of the important elements during the design and construction of Nuclear Power Plants. The purchaser has to establish and implement controls over the procurement process to ensure that the quality criteria, quality level and other quality requirements specified for the particular item or service are taken into account. The effect on safety of an error in service or the malfunction of an item is the most important factor to be considered in determining the extent of quality assurance efforts. A typical example of a procurement process will be demonstrated for safety related mechanical components. (orig./RW)

  13. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  14. DRD4 long allele carriers show heightened attention to high-priority items relative to low-priority items.

    Science.gov (United States)

    Gorlick, Marissa A; Worthy, Darrell A; Knopik, Valerie S; McGeary, John E; Beevers, Christopher G; Maddox, W Todd

    2015-03-01

    Humans with seven or more repeats in exon III of the DRD4 gene (long DRD4 carriers) sometimes demonstrate impaired attention, as seen in attention-deficit hyperactivity disorder, and at other times demonstrate heightened attention, as seen in addictive behavior. Although the clinical effects of DRD4 are the focus of much work, this gene may not necessarily serve as a "risk" gene for attentional deficits, but as a plasticity gene where attention is heightened for priority items in the environment and impaired for minor items. Here we examine the role of DRD4 in two tasks that benefit from selective attention to high-priority information. We examine a category learning task where performance is supported by focusing on features and updating verbal rules. Here, selective attention to the most salient features is associated with good performance. In addition, we examine the Operation Span (OSPAN) task, a working memory capacity task that relies on selective attention to update and maintain items in memory while also performing a secondary task. Long DRD4 carriers show superior performance relative to short DRD4 homozygotes (six or fewer tandem repeats) in both the category learning and OSPAN tasks. These results suggest that DRD4 may serve as a "plasticity" gene where individuals with the long allele show heightened selective attention to high-priority items in the environment, which can be beneficial in the appropriate context.

  15. Australian Chemistry Test Item Bank: Years 11 and 12. Volume 2.

    Science.gov (United States)

    Commons, C., Ed.; Martin, P., Ed.

    The second volume of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the…

  16. The Body Appreciation Scale-2: item refinement and psychometric evaluation.

    Science.gov (United States)

    Tylka, Tracy L; Wood-Barcalow, Nichole L

    2015-01-01

    Considered a positive body image measure, the 13-item Body Appreciation Scale (BAS; Avalos, Tylka, & Wood-Barcalow, 2005) assesses individuals' acceptance of, favorable opinions toward, and respect for their bodies. While the BAS has accrued psychometric support, we improved it by rewording certain BAS items (to eliminate sex-specific versions and body dissatisfaction-based language) and developing additional items based on positive body image research. In three studies, we examined the reworded, newly developed, and retained items to determine their psychometric properties among college and online community (Amazon Mechanical Turk) samples of 820 women and 767 men. After exploratory factor analysis, we retained 10 items (five original BAS items). Confirmatory factor analysis upheld the BAS-2's unidimensionality and invariance across sex and sample type. Its internal consistency, test-retest reliability, and construct (convergent, incremental, and discriminant) validity were supported. The BAS-2 is a psychometrically sound positive body image measure applicable for research and clinical settings. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. PENGEMBANGAN TES BERPIKIR KRITIS DENGAN PENDEKATAN ITEM RESPONSE THEORY

    Directory of Open Access Journals (Sweden)

    Fajrianthi Fajrianthi

    2016-06-01

    This study aims to produce a valid and reliable critical thinking measurement instrument (test) for use in both educational and work settings in Indonesia. The research stages followed the test development stages of Hambleton and Jones (1993). The test blueprint and item writing were based on the concepts in the Watson-Glaser Critical Thinking Appraisal (WGCTA). In the WGCTA, critical thinking consists of five dimensions: Inference, Recognition Assumption, Deduction, Interpretation and Evaluation of arguments. The test was trialed on 1,453 participants in employee selection tests in Surabaya, Gresik, Tuban, Bojonegoro and Rembang. Dichotomous data were analyzed using an IRT model with two parameters, item discrimination and item difficulty. The analysis was carried out with the statistical program Mplus version 6.11. Before the IRT analysis, the assumptions were tested: unidimensionality, local independence and the Item Characteristic Curve (ICC). Analysis of the 68 items yielded 15 items with fairly good discrimination and item difficulty ranging from -4 to 2.448. The small number of good-quality items was attributed to weaknesses in selecting subject matter experts in critical thinking and in the choice of scoring method. Keywords: test development, critical thinking, item response theory   DEVELOPING CRITICAL THINKING TEST UTILISING ITEM RESPONSE THEORY Abstract: The present study aimed to develop a valid and reliable instrument for assessing critical thinking which can be implemented both in educational and work settings in Indonesia. Following Hambleton and Jones’s (1993) procedures on test development, the study developed the instrument by employing the concept of critical thinking from the Watson-Glaser Critical Thinking Appraisal (WGCTA). The study included five dimensions of critical thinking as adopted from the WGCTA: Inference, Recognition

  18. Item response theory at subject- and group-level

    NARCIS (Netherlands)

    Tobi, Hilde

    1990-01-01

    This paper reviews the literature about item response models for the subject level and aggregated level (group level). Group-level item response models (IRMs) are used in the United States in large-scale assessment programs such as the National Assessment of Educational Progress and the California

  19. 48 CFR 52.212-2 - Evaluation-Commercial Items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 2 2010-10-01 2010-10-01 false Evaluation-Commercial....212-2 Evaluation—Commercial Items. As prescribed in 12.301(c), the Contracting Officer may insert a provision substantially as follows: Evaluation—Commercial Items (JAN 1999) (a) The Government will award a...

  20. Loglinear multidimensional IRT models for polytomously scored items

    NARCIS (Netherlands)

    Kelderman, Henk; Rijkes, Carl P.M.; Rijkes, Carl

    1994-01-01

    A loglinear IRT model is proposed that relates polytomously scored item responses to a multidimensional latent space. The analyst may specify a response function for each response, indicating which latent abilities are necessary to arrive at that response. Each item may have a different number of

  1. Item Response Theory Models for Wording Effects in Mixed-Format Scales

    Science.gov (United States)

    Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu

    2015-01-01

    Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…

  2. 16 CFR 304.5 - Marking requirements for imitation political items.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 1 2010-01-01 2010-01-01 false Marking requirements for imitation political... imitation political items. (a) An imitation political item which is manufactured in the United States, or...) An imitation political item of incusable material shall be incused with the calendar year in sans...

  3. Evaluating an Automated Number Series Item Generator Using Linear Logistic Test Models

    Directory of Open Access Journals (Sweden)

    Bao Sheng Loe

    2018-04-01

    This study investigates the item properties of a newly developed Automatic Number Series Item Generator (ANSIG). The foundation of the ANSIG is based on five hypothesised cognitive operators. Thirteen item models were developed using the numGen R package and eleven were evaluated in this study. The 16-item ICAR (International Cognitive Ability Resource) short form ability test was used to evaluate construct validity. The Rasch Model and two Linear Logistic Test Models (LLTM) were employed to estimate and predict the item parameters. Results indicate that a single factor determines the performance on tests composed of items generated by the ANSIG. Under the LLTM approach, all the cognitive operators were significant predictors of item difficulty. Moderate to high correlations were evident between the number series items and the ICAR test scores, with high correlation found for the ICAR Letter-Numeric-Series type items, suggesting adequate nomothetic span. Extended cognitive research is, nevertheless, essential for the automatic generation of an item pool with predictable psychometric properties.

  4. Evaluating and Refining the Construct of Sexual Quality With Item Response Theory: Development of the Quality of Sex Inventory.

    Science.gov (United States)

    Shaw, Amanda M; Rogge, Ronald D

    2016-02-01

    This study took a critical look at the construct of sexual quality. The 65 items of four well-validated self-report measures of sexual satisfaction (the Index of Sexual Satisfaction [ISS], Hudson, Harrison, & Crosscup, 1981; the Global Measure of Sexual Satisfaction [GMSEX], Lawrance & Byers, 1995; the Pinney Sexual Satisfaction Inventory [PSSI], Pinney, Gerrard, & Denney, 1987; the Young Sexual Satisfaction Scale [YSSS], Young, Denny, Luquis, & Young, 1998) and an additional 74 potential sexual quality items were given to 3060 online participants. Using Item Response Theory (IRT), we demonstrated that the ISS, YSSS, and PSSI scales provided suboptimal levels of precision in assessing sexual quality, particularly given the length of those scales. Exploratory factor analyses, IRT, differential item functioning analyses, and longitudinal responsiveness analyses were used to develop and evaluate the Quality of Sex Inventory. Results suggested that, in comparison to existing scales, the QSI (1) offers investigators and clinicians more theoretically focused scales, (2) distinguishes sexual satisfaction from sexual dissatisfaction, and (3) offers greater precision and power for detecting differences with (4) comparably high levels of responsiveness for detecting change over time despite being notably shorter than most of the existing scales. The QSI-satisfaction subscales demonstrated strong convergent validity with other measures of sexual satisfaction and excellent construct validity with anchor scales from the nomological net surrounding that construct, suggesting that they continue to assess the same theoretical construct as prior scales. Implications for research are discussed.

  5. Improving measurement of injection drug risk behavior using item response theory.

    Science.gov (United States)

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. This study aimed to explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior, and to examine potential gender-based differential item functioning. Data were drawn from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior, and logistic regression was used to test for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and that these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged, as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest that direct comparisons of composite scores between males and females may be misleading and that future work should account for differential item functioning before comparing levels of injection risk behavior.
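
    The two-parameter model mentioned in this abstract can be written down in a few lines. This is a minimal sketch of the standard two-parameter logistic (2PL) form for a dichotomous item, not the authors' estimation code; the function names and the example parameter values are hypothetical.

```python
import math

def two_pl_probability(theta, a, b):
    """Probability of endorsing an item under the 2PL model.

    theta: respondent's latent trait level
    a:     item discrimination (slope)
    b:     item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = two_pl_probability(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item: discrimination 2.0, difficulty 1.0.
p_at_difficulty = two_pl_probability(1.0, a=2.0, b=1.0)     # trait level equal to difficulty
info_at_difficulty = item_information(1.0, a=2.0, b=1.0)    # information peaks at theta = b
```

    At a trait level equal to the item difficulty the endorsement probability is 0.5 and the information reaches its maximum of a²/4, which is why more discriminating items give more sensitive estimates near their location.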

  6. Medial temporal lobe contributions to cued retrieval of items and contexts.

    Science.gov (United States)

    Hannula, Deborah E; Libby, Laura A; Yonelinas, Andrew P; Ranganath, Charan

    2013-10-01

    Several models have proposed that different regions of the medial temporal lobes contribute to different aspects of episodic memory. For instance, according to one view, the perirhinal cortex represents specific items, parahippocampal cortex represents information regarding the context in which these items were encountered, and the hippocampus represents item-context bindings. Here, we used event-related functional magnetic resonance imaging (fMRI) to test a specific prediction of this model, namely, that successful retrieval of items from context cues will elicit perirhinal recruitment and that successful retrieval of contexts from item cues will elicit parahippocampal cortex recruitment. Retrieval of the bound representation in either case was expected to elicit hippocampal engagement. To test these predictions, we had participants study several item-context pairs (i.e., pictures of objects and scenes, respectively), and then had them attempt to recall items from associated context cues and contexts from associated item cues during a scanned retrieval session. Results based on both univariate and multivariate analyses confirmed a role for hippocampus in content-general relational memory retrieval, and a role for parahippocampal cortex in successful retrieval of contexts from item cues. However, we also found that activity differences in perirhinal cortex were correlated with successful cued recall for both items and contexts. These findings provide partial support for the above predictions and are discussed with respect to several models of medial temporal lobe function. Copyright © 2013 Elsevier Ltd. All rights reserved.

  7. Medial Temporal Lobe Contributions to Cued Retrieval of Items and Contexts

    Science.gov (United States)

    Hannula, Deborah E.; Libby, Laura A.; Yonelinas, Andrew P.; Ranganath, Charan

    2013-01-01

    Several models have proposed that different regions of the medial temporal lobes contribute to different aspects of episodic memory. For instance, according to one view, the perirhinal cortex represents specific items, parahippocampal cortex represents information regarding the context in which these items were encountered, and the hippocampus represents item-context bindings. Here, we used event-related functional magnetic resonance imaging (fMRI) to test a specific prediction of this model – namely, that successful retrieval of items from context cues will elicit perirhinal recruitment and that successful retrieval of contexts from item cues will elicit parahippocampal cortex recruitment. Retrieval of the bound representation in either case was expected to elicit hippocampal engagement. To test these predictions, we had participants study several item-context pairs (i.e., pictures of objects and scenes, respectively), and then had them attempt to recall items from associated context cues and contexts from associated item cues during a scanned retrieval session. Results based on both univariate and multivariate analyses confirmed a role for hippocampus in content-general relational memory retrieval, and a role for parahippocampal cortex in successful retrieval of contexts from item cues. However, we also found that activity differences in perirhinal cortex were correlated with successful cued recall for both items and contexts. These findings provide partial support for the above predictions and are discussed with respect to several models of medial temporal lobe function. PMID:23466350

  8. A scale purification procedure for evaluation of differential item functioning

    NARCIS (Netherlands)

    Khalid, Muhammad Naveed; Glas, Cornelis A.W.

    2014-01-01

    Item bias or differential item functioning (DIF) has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test

  9. ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).

    Science.gov (United States)

    Australian Council for Educational Research, Hawthorn.

    This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…

  10. 48 CFR 53.212 - Acquisition of commercial items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 2 2010-10-01 2010-10-01 false Acquisition of commercial... (CONTINUED) CLAUSES AND FORMS FORMS Prescription of Forms 53.212 Acquisition of commercial items. SF 1449 (Rev. 3/2005), Solicitation/Contract/Order for Commercial Items. SF 1449 is prescribed for use in...

  11. The utility of single-item readiness screeners in middle school.

    Science.gov (United States)

    Lewis, Crystal G; Herman, Keith C; Huang, Francis L; Stormont, Melissa; Grossman, Caroline; Eddy, Colleen; Reinke, Wendy M

    2017-10-01

    This study examined the benefit of utilizing one-item academic and one-item behavior readiness teacher-rated screeners at the beginning of the school year to predict end-of-school year outcomes for middle school students. The Middle School Academic and Behavior Readiness (M-ABR) screeners were developed to provide an efficient and effective way to assess readiness in students. Participants included 889 students in 62 middle school classrooms in an urban Missouri school district. Concurrent validity with the M-ABR items and other indicators of readiness in the fall were evaluated using Pearson product-moment correlation coefficients, with the academic readiness item having medium to strong correlations with other baseline academic indicators (r=±0.56 to 0.91) and the behavior readiness item having low to strong correlations with baseline behavior items (r=±0.20 to 0.79). Next, the predictive validity of the M-ABR items was analyzed with hierarchical linear regressions using end-of-year outcomes as the dependent variable. The academic and behavior readiness items demonstrated adequate validity for all outcomes with moderate effects (β=±0.31 to 0.73 for academic outcomes and β=±0.24 to 0.59 for behavioral outcomes) after controlling for baseline demographics. Even after controlling for baseline scores, the M-ABR items predicted unique variance in almost all outcome variables. Four conditional probability indices were calculated to obtain an optimal cut score, to determine ready vs. not ready, for both single-item M-ABR scales. The cut point of "fair" yielded the most acceptable values for the indices. The odds ratios (OR) of experiencing negative outcomes given a "fair" or lower readiness rating (2 or below on the M-ABR screeners) at the beginning of the year were significant and strong for all outcomes (OR=2.29 to OR=14.46), except for internalizing problems. These findings suggest promise for using single readiness items to screen for varying negative end

  12. Method of locating related items in a geometric space for data mining

    Science.gov (United States)

    Hendrickson, Bruce A.

    1999-01-01

    A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method is especially beneficial for communicating databases with many items, and with non-regular relationship patterns. Examples of such databases include databases containing items such as scientific papers or patents, related by citations or keywords. A computer system adapted for practice of the present invention can include a processor, a storage subsystem, a display device, and computer software to direct the location and display of the entities. The method comprises assigning numeric values as a measure of similarity between each pairing of items. A matrix is constructed, based on the numeric values. The eigenvectors and eigenvalues of the matrix are determined. Each item is located in the geometric space at coordinates determined from the eigenvectors and eigenvalues. Proper construction of the matrix and proper determination of coordinates from eigenvectors can ensure that distance between items in the geometric space is representative of the numeric value measure of the items' similarity.
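
    The sequence of steps in this abstract (similarity values, a matrix, eigenvectors, coordinates) can be illustrated with a short sketch. The abstract does not say how the matrix is constructed, so this example assumes a graph Laplacian built from the similarity values, a common choice in spectral embedding and not necessarily the construction the patent describes. The function name `spectral_coordinates` and the six-item similarity matrix are hypothetical.

```python
import numpy as np

def spectral_coordinates(similarity, dim=2):
    """Place items in a dim-dimensional space from pairwise similarities.

    Assumed construction: graph Laplacian L = D - W, with coordinates taken
    from the eigenvectors belonging to the smallest nonzero eigenvalues.
    """
    W = np.asarray(similarity, dtype=float)
    W = (W + W.T) / 2.0                    # enforce symmetry (creates a new array)
    np.fill_diagonal(W, 0.0)               # ignore self-similarity
    L = np.diag(W.sum(axis=1)) - W         # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1:dim + 1]           # skip the constant eigenvector

# Hypothetical data: items 0-2 cite each other heavily, as do items 3-5,
# with only weak links between the two groups.
strong, weak = 1.0, 0.05
W = np.full((6, 6), weak)
W[:3, :3] = strong
W[3:, 3:] = strong

coords = spectral_coordinates(W, dim=1)[:, 0]
```

    With this input the single coordinate separates the two groups by sign, so strongly related items land close together and weakly related items land far apart.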

  13. Optimal Cycle Time and Preservation Technology Investment for Deteriorating Items with Price-sensitive Stock-dependent Demand Under Inflation

    Science.gov (United States)

    Shah, Nita H.; Shah, Arpan D.

    2014-04-01

    The article analyzes the economic order quantity for a retailer who must handle imperfect product quality and whose units deteriorate at a constant rate. To control deterioration of the units in inventory, the retailer has to deploy advanced preservation technology. Another challenge for the retailer is to supply a product of perfect quality, which requires mandatory inspection during the production process. The model is developed under the condition of a random fraction of defective items. It is assumed that, after inspection, the screened defective items are sold instantly at a discounted rate. Demand is considered to be price-sensitive and stock-dependent. The model incorporates the effect of inflation, which is a critical factor globally. The objective is to maximize the retailer's profit with respect to preservation technology investment, order quantity and cycle time. A numerical example is given to validate the proposed model, and a sensitivity analysis is carried out to address managerial issues.

  14. An Efficient Way to Detect Poststroke Depression by Subsequent Administration of a 9-Item and a 2-Item Patient Health Questionnaire

    NARCIS (Netherlands)

    de Man-van Ginkel, Janneke M.; Hafsteinsdottir, Thora; Lindeman, Eline; Burger, Huibert; Grobbee, Diederick; Schuurmans, Marieke

    Background and Purpose: The early detection of poststroke depression is essential for optimizing recovery after stroke. A prospective study was conducted to investigate the diagnostic value of the 9-item and the 2-item Patient Health Questionnaire (PHQ-9, PHQ-2). Methods: One hundred seventy-one

  15. Cross-National Prevalence of Traditional Bullying, Traditional Victimization, Cyberbullying and Cyber-Victimization: Comparing Single-Item and Multiple-Item Approaches of Measurement

    Science.gov (United States)

    Yanagida, Takuya; Gradinger, Petra; Strohmeier, Dagmar; Solomontos-Kountouri, Olga; Trip, Simona; Bora, Carmen

    2016-01-01

    Many large-scale cross-national studies rely on a single-item measurement when comparing prevalence rates of traditional bullying, traditional victimization, cyberbullying, and cyber-victimization between countries. However, the reliability and validity of single-item measurement approaches are highly problematic and might be biased. Data from…

  16. Storage options for Long Length Contaminated Equipment (LLCE) items

    International Nuclear Information System (INIS)

    Hodgson, R.D.

    1994-11-01

    A review of the Washington state requirements for the storage of long equipment items removed from tanks indicates that if the contaminated materials on the long equipment items are analyzed and determined to be DW, and not EHW, the containers can be stored on an uncovered, RCRA-approved storage pad. Long equipment items contaminated with reportable levels of EHW, or suspected of being contaminated with EHW, must be protected from the elements by means of a building or other protective covering that otherwise allows adequate inspection of the containers. Storage of the long equipment item containers on an uncovered storage pad is recommended and will reduce construction costs for new storage by an estimated 60 percent when compared to construction costs for enclosed storage.

  17. Gender Invariance of the Gambling Behavior Scale for Adolescents (GBS-A): An Analysis of Differential Item Functioning Using Item Response Theory.

    Science.gov (United States)

    Donati, Maria Anna; Chiesi, Francesca; Izzo, Viola A; Primi, Caterina

    2017-01-01

    As there is a lack of evidence attesting to equivalent item functioning across genders for the instruments most often used to measure pathological gambling in adolescence, the present study aimed to test the gender invariance of the Gambling Behavior Scale for Adolescents (GBS-A), a new measurement tool to assess the severity of Gambling Disorder (GD) in adolescents. The equivalence of the items across genders was assessed by analyzing Differential Item Functioning within an Item Response Theory framework. The GBS-A was administered to 1,723 adolescents, and the graded response model was employed. The results attested to the measurement equivalence of the GBS-A when administered to male and female adolescent gamblers. Overall, findings provided evidence that the GBS-A is an effective measurement tool of the severity of GD in male and female adolescents and that the scale was unbiased and able to reveal true gender differences. As such, the GBS-A can be profitably used in educational interventions and clinical treatments with young people.

  18. Safety Evaluation for Packaging (onsite) T Plant Canyon Items

    International Nuclear Information System (INIS)

    OBRIEN, J.H.

    2000-01-01

    This safety evaluation for packaging (SEP) evaluates and documents the ability to safely ship mostly unique inventories of miscellaneous T Plant canyon waste items (T-P Items) encountered during the canyon deck clean off campaign. In addition, this SEP addresses contaminated items and material that may be shipped in a strong tight package (STP). The shipments meet the criteria for onsite shipments as specified by Fluor Hanford in HNF-PRO-154, Responsibilities and Procedures for all Hazardous Material Shipments.

  19. Safety Evaluation for Packaging (onsite) T Plant Canyon Items

    Energy Technology Data Exchange (ETDEWEB)

    OBRIEN, J.H.

    2000-07-14

    This safety evaluation for packaging (SEP) evaluates and documents the ability to safely ship mostly unique inventories of miscellaneous T Plant canyon waste items (T-P Items) encountered during the canyon deck clean off campaign. In addition, this SEP addresses contaminated items and material that may be shipped in a strong tight package (STP). The shipments meet the criteria for onsite shipments as specified by Fluor Hanford in HNF-PRO-154, Responsibilities and Procedures for all Hazardous Material Shipments.

  20. Development and Psychometric Assessment of the Measure of Globalization Influence on Health Risk (MGIHR) Among Mexican Women with Breast Cancer.

    Science.gov (United States)

    Nodora, Jesse N; Carvajal, Scott C; Robles-Garcia, Rebeca; Agraz, Francisco Páez; Daneri-Navarro, Adrian; Meza-Montenegro, Maria Mercedes; Gutierrez-Millan, Luis Enrique; Martinez, Maria Elena

    2015-08-01

    Lacking in the literature are data addressing the extent to which changes in reproductive and lifestyle factors predispose women in developing nations to higher breast cancer rates, and the degree to which these are due to globalization influences. This article describes the development and psychometric assessment of an instrument intended to measure global, predominantly U.S., influences on breast cancer risk profile among women residing in Mexico. Using investigator consensus and a focus group methodology, the Measure of Globalization Influence on Health Risk (MGIHR) was developed and completed by 341 women. Psychometric analyses support the use of an 11-item Consumerism and Modernity scale and a 7-item Reproductive Control and Gender Role scale. The MGIHR is a valid and reliable instrument for understanding changing lifestyle and reproductive factors for breast cancer risk and may provide a more complete understanding of breast cancer development and needed interventions.

  1. Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults

    Directory of Open Access Journals (Sweden)

    García-Campayo Javier

    2011-08-01

    Background: The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Findings: Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. Conclusions: The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.
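
    The internal-consistency figure reported above is Cronbach's α, which for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. A minimal Python sketch of that formula follows; the function name `cronbach_alpha` and the response matrix are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent (sum down each column of the item matrix).
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical responses: 3 items (rows) answered by 5 respondents (columns).
responses = [
    [2, 3, 4, 4, 1],
    [1, 3, 4, 3, 1],
    [2, 2, 4, 4, 0],
]
alpha = cronbach_alpha(responses)
```

    Sample variances are used for both the items and the total; because α depends only on their ratio, population variances would give the same value.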

  2. The Influence of Item Properties on Association-Memory

    Science.gov (United States)

    Madan, Christopher R.; Glaholt, Mackenzie G.; Caplan, Jeremy B.

    2010-01-01

    Word properties like imageability and word frequency improve cued recall of verbal paired-associates. We asked whether these enhancements follow simply from prior effects on item-memory, or also strengthen associations between items. Participants studied word pairs varying in imageability or frequency: pairs were "pure" (high-high, low-low) or…

  3. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    Science.gov (United States)

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  4. Defining a Leader Role curriculum for radiation oncology: A global Delphi consensus study.

    Science.gov (United States)

    Turner, Sandra; Seel, Matthew; Trotter, Theresa; Giuliani, Meredith; Benstead, Kim; Eriksen, Jesper G; Poortmans, Philip; Verfaillie, Christine; Westerveld, Henrike; Cross, Shamira; Chan, Ming-Ka; Shaw, Timothy

    2017-05-01

    The need for radiation oncologists and other radiation oncology (RO) professionals to lead quality improvement activities and contribute to shaping the future of our specialty is self-evident. Leadership knowledge, skills and behaviours, like other competencies, can be learned (Blumenthal et al., 2012). The objective of this study was to define a globally applicable competency set specific to radiation oncology for the CanMEDS Leader Role (Frank et al., 2015). A modified Delphi consensus process delivering two rounds of on-line surveys was used. Participants included trainees, radiation/clinical oncologists and other RO team members (radiation therapists, physicists, and nurses), professional educators and patients. 72 of 95 (76%) invitees from nine countries completed the Round 1 (R1) survey. Of the 72 respondents to R1, 70 completed Round 2 (R2) (97%). In R1, 35 items were deemed for 'inclusion' and 21 for 'exclusion', leaving 41 'undetermined'. After review of items, informed by participant comments, 14 competencies from the 'inclusion' group went into the final curriculum; 12 from the 'undetermined' group went to R2. In R2, 6 items reached consensus for inclusion. This process resulted in 20 RO Leader Role competencies with apparent global applicability. This is the first step towards developing learning, teaching and assessment tools for this important area of training. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. 41 CFR 101-30.101-1a - Item of production.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Item of production. 101-30.101-1a Section 101-30.101-1a Public Contracts and Property Management Federal Property Management....1-General § 101-30.101-1a Item of production. Item-of-production means those articles, equipment...

  6. Selecting Lower Priced Items.

    Science.gov (United States)

    Kleinert, Harold L.; And Others

    1988-01-01

    A program used to teach moderately to severely mentally handicapped students to select the lower priced items in actual shopping activities is described. Through a five-phase process, students are taught to compare prices themselves as well as take into consideration variations in the sizes of containers and varying product weights. (VW)

  7. E pluribus unum: Harmonization of physical functioning across intervention studies of middle-aged and older adults.

    Directory of Open Access Journals (Sweden)

    Nicole M Armstrong

    Common scales for physical functioning are not directly comparable without harmonization techniques, complicating attempts to pool data across studies. Our aim was to provide a standardized metric for physical functioning in adults based on basic and instrumental activities of daily living scaled to NIH PROMIS norms. We provide an item bank to compare the difficulty of various physical functioning activities. We used item response theory methods to place 232 basic and instrumental activities of daily living questions, administered across eight intervention studies of middle-aged and older adults (N = 2,556), on a common metric. We compared the scale's precision to an average z-score of items and evaluated criterion validity based on objective measures of physical functioning and Fried's frailty criteria. Model-estimated item thresholds were widely distributed across the range of physical functioning. From test information plots, the lowest precision in each dataset was 0.80. Using power calculations, the sample size needed to detect 25% physical functional decline with 80% power based on the physical functioning factor was less than half of what would be needed using an average z-score. The physical functioning factor correlated in expected directions with objective measurements from the Timed Up and Go task, tandem balance, gait speed, chair stands, grip strength, and frailty status. Item-level harmonization enables direct comparison of physical functioning measures across existing and potentially future studies and across levels of function using a nationally representative metric. We identified key thresholds of physical functioning items in an item bank to facilitate clinical and epidemiologic decision-making.

  8. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    Science.gov (United States)

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.

  9. Test Score Equating Using Discrete Anchor Items versus Passage-Based Anchor Items: A Case Study Using "SAT"® Data. Research Report. ETS RR-14-14

    Science.gov (United States)

    Liu, Jinghua; Zu, Jiyun; Curley, Edward; Carey, Jill

    2014-01-01

    The purpose of this study is to investigate the impact of discrete anchor items versus passage-based anchor items on observed score equating using empirical data. This study compares an "SAT"® critical reading anchor that contains more discrete items proportionally, compared to the total tests to be equated, to another anchor that…

  10. Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

    Science.gov (United States)

    Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

    2017-06-15

    Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.

  11. A note on monotonicity of item response functions for ordered polytomous item response theory models.

    Science.gov (United States)

    Kang, Hyeon-Ah; Su, Ya-Hui; Chang, Hua-Hua

    2018-03-08

    A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales. © 2018 The British Psychological Society.
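
    The monotonicity claim can be checked numerically for one of the listed models. The Python sketch below is an illustration, not the note's proof: it evaluates the generalized partial credit model for a single item with hypothetical parameters (`a`, `steps`) and computes the expected item score over a grid of θ values. It relies on the standard identity that, for this model, the derivative of the expected score with respect to θ equals the discrimination times the conditional score variance, which is positive whenever the discrimination is positive.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities for scores 0..len(steps) under the GPCM."""
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))
    top = max(z)  # subtract the maximum for numerical stability
    w = [math.exp(x - top) for x in z]
    total = sum(w)
    return [x / total for x in w]


def expected_score(theta, a, steps):
    """Item response function: expected item score at a given theta."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, steps)))


def score_variance(theta, a, steps):
    """Conditional variance of the item score at a given theta."""
    p = gpcm_probs(theta, a, steps)
    mean = sum(k * pk for k, pk in enumerate(p))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(p))


a, steps = 1.3, [-1.0, 0.2, 0.9]  # hypothetical item parameters
grid = [x / 10 for x in range(-40, 41)]
scores = [expected_score(t, a, steps) for t in grid]
is_increasing = all(s2 > s1 for s1, s2 in zip(scores, scores[1:]))
```

    A finite-difference slope of `expected_score` at any θ can be compared with `a * score_variance(...)` to confirm the identity on a given item.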

  12. Validation of the Spanish versions of the long (26 items) and short (12 items) forms of the Self-Compassion Scale (SCS).

    Science.gov (United States)

    Garcia-Campayo, Javier; Navarro-Gil, Mayte; Andrés, Eva; Montero-Marin, Jesús; López-Artal, Lorena; Demarzo, Marcelo Marcos Piva

    2014-01-10

    Self-compassion is a key psychological construct for assessing clinical outcomes in mindfulness-based interventions. The aim of this study was to validate the Spanish versions of the long (26 item) and short (12 item) forms of the Self-Compassion Scale (SCS). The translated Spanish versions of both subscales were administered to two independent samples: Sample 1 was comprised of university students (n = 268) who were recruited to validate the long form, and Sample 2 was comprised of Aragon Health Service workers (n = 271) who were recruited to validate the short form. In addition to SCS, the Mindful Attention Awareness Scale (MAAS), the State-Trait Anxiety Inventory-Trait (STAI-T), the Beck Depression Inventory (BDI) and the Perceived Stress Questionnaire (PSQ) were administered. Construct validity, internal consistency, test-retest reliability and convergent validity were tested. The Confirmatory Factor Analysis (CFA) of the long and short forms of the SCS confirmed the original six-factor model in both scales, showing goodness of fit. Cronbach's α for the 26 item SCS was 0.87 (95% CI = 0.85-0.90) and ranged between 0.72 and 0.79 for the 6 subscales. Cronbach's α for the 12-item SCS was 0.85 (95% CI = 0.81-0.88) and ranged between 0.71 and 0.77 for the 6 subscales. The long (26-item) form of the SCS showed a test-retest coefficient of 0.92 (95% CI = 0.89-0.94). The Intraclass Correlation (ICC) for the 6 subscales ranged from 0.84 to 0.93. The short (12-item) form of the SCS showed a test-retest coefficient of 0.89 (95% CI: 0.87-0.93). The ICC for the 6 subscales ranged from 0.79 to 0.91. The long and short forms of the SCS exhibited a significant negative correlation with the BDI, the STAI and the PSQ, and a significant positive correlation with the MAAS. The correlation between the total score of the long and short SCS form was r = 0.92. The Spanish versions of the long (26-item) and short (12-item) forms of the SCS are valid and

  13. In Praise of Canadian Contradictions: Making Our Way in a Globalized World

    Science.gov (United States)

    Rao, Govind

    2004-01-01

    Many of the cultural items that are associated with globalization started out as American cultural products, for example, McDonald's hamburgers, jeans, Coca-Cola, and rock and roll. Canada, next-door neighbour to the United States, was the first country to be subjected to this onslaught early in the 20th century, as American cultural and economic…

  14. Exploring differential item functioning (DIF) with the Rasch model: a comparison of gender differences on eighth grade science items in the United States and Spain.

    Science.gov (United States)

    Babiar, Tasha Calvert

    2011-01-01

    Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth item-level analysis across two countries: Spain and the United States. This study investigated eighth-grade gender differences on science items across the two countries. A secondary purpose of the study was to explore the nature of gender differences using the many-faceted Rasch Model as a way to estimate gender DIF. A secondary analysis of data from the Third International Mathematics and Science Study (TIMSS) was used to address three questions: 1) Does gender DIF in science achievement exist? 2) Is there a relationship between gender DIF and characteristics of the science items? 3) Do the relationships between item characteristics and gender DIF in science items replicate across countries? Participants included 7,087 eighth-grade students from the United States and 3,855 students from Spain who participated in TIMSS. The Facets program (Linacre and Wright, 1992) was used to estimate gender DIF. The results of the analysis indicate that the content of the item seemed to be related to gender DIF. The analysis also suggests that there is a relationship between gender DIF and item format. No pattern of gender DIF related to cognitive demand was found. The general pattern of gender DIF was similar across the two countries used in the analysis. The strength of item-level analysis as opposed to group mean difference analysis is that gender differences can be detected at the item level, even when no mean differences can be detected at the group level.

  15. An item-oriented recommendation algorithm on cold-start problem

    Science.gov (United States)

    Qiu, Tian; Chen, Guang; Zhang, Zi-Ke; Zhou, Tao

    2011-09-01

    Based on a hybrid algorithm incorporating the heat conduction and probability spreading processes (Proc. Natl. Acad. Sci. U.S.A., 107 (2010) 4511), in this letter, we propose an improved method by introducing an item-oriented function, focusing on solving the dilemma of the recommendation accuracy between the cold and popular items. Differently from previous works, the present algorithm does not require any additional information (e.g., tags). Further experimental results obtained in three real datasets, RYM, Netflix and MovieLens, show that, compared with the original hybrid method, the proposed algorithm significantly enhances the recommendation accuracy of the cold items, while it keeps the recommendation accuracy of the overall and the popular items. This work might shed some light on both understanding and designing effective methods for long-tailed online applications of recommender systems.
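
    The abstract does not give the item-oriented function, so it is not reproduced here. As a hedged sketch of the baseline only, the Python below implements the hybrid of heat conduction and probability spreading from the cited PNAS paper as it is commonly written: the weight from item β to item α is the degree-normalized co-collection count divided by k_α^(1-λ) k_β^λ, so λ = 0 gives pure heat conduction and λ = 1 gives pure probability spreading. The function names, the toy adjacency matrix, and the assumption that every item and user has at least one link are illustrative.

```python
def hybrid_matrix(adj, lam):
    """adj[item][user] is 1 if the user collected the item, else 0."""
    n_items, n_users = len(adj), len(adj[0])
    k_item = [sum(row) for row in adj]
    k_user = [sum(adj[i][u] for i in range(n_items)) for u in range(n_users)]
    w = [[0.0] * n_items for _ in range(n_items)]
    for a in range(n_items):
        for b in range(n_items):
            # Co-collections of items a and b, each discounted by user degree.
            shared = sum(adj[a][u] * adj[b][u] / k_user[u] for u in range(n_users))
            w[a][b] = shared / (k_item[a] ** (1 - lam) * k_item[b] ** lam)
    return w


def recommend(adj, user, lam=0.5):
    """Rank the items the user has not collected, highest score first."""
    w = hybrid_matrix(adj, lam)
    n_items = len(adj)
    f = [adj[i][user] for i in range(n_items)]  # initial resource on collected items
    scores = [sum(w[a][b] * f[b] for b in range(n_items)) for a in range(n_items)]
    candidates = [i for i in range(n_items) if not adj[i][user]]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


# Hypothetical toy data: 4 items (rows) by 3 users (columns).
adj = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
ranking = recommend(adj, user=0)
```

    With λ = 1 each column of the weight matrix sums to one (resource is conserved), and with λ = 0 each row sums to one (scores are local averages), which is a quick sanity check on any implementation.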

  16. Building an Evaluation Scale using Item Response Theory.

    Science.gov (United States)

    Lalor, John P; Wu, Hao; Yu, Hong

    2016-11-01

    Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
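
    The last point, that accuracy and an IRT score can disagree, can be illustrated with a small example. The Python sketch below assumes a two-parameter logistic model with made-up item parameters and a grid-search maximum-likelihood ability estimate; the abstract does not state which IRT model or estimator the authors used, and the names `items`, `system_a`, and `system_b` are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if x else math.log(1.0 - p)
    return total


def estimate_ability(responses, items):
    """Grid-search maximum-likelihood estimate of theta on [-4, 4]."""
    grid = [g / 100 for g in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


# Hypothetical test set: (discrimination, difficulty) for five items.
items = [(0.4, 0.0), (0.4, 0.0), (0.4, 0.0), (2.0, 0.0), (2.0, 0.0)]
system_a = [1, 1, 1, 0, 0]  # correct only on the weakly discriminating items
system_b = [0, 0, 0, 1, 1]  # correct only on the highly discriminating items

accuracy_a = sum(system_a) / len(system_a)
accuracy_b = sum(system_b) / len(system_b)
theta_a = estimate_ability(system_a, items)
theta_b = estimate_ability(system_b, items)
```

    In this toy case the system with the higher accuracy receives the lower ability estimate, because under the two-parameter model responses are weighted by item discrimination.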

  17. Procurement Engineering Process for Commercial Grade Item Dedication

    International Nuclear Information System (INIS)

    Park, Jong-Hyuck; Park, Jong-Eun; Kwak, Tack-Hun; Yoo, Keun-Bae; Lee, Sang-Guk; Hong, Sung-Yull

    2006-01-01

    The Procurement Engineering Process for commercial grade item dedication plays an increasingly important role in operation management of Korea Nuclear Power Plants. The purpose of the Procurement Engineering Process is the provision and assurance of a high quality and quantity of spare, replacement, retrofit and new parts and equipment while maximizing plant availability, minimizing downtime due to parts unavailability and providing reasonable overall program and inventory cost. In this paper, we review the overview requirements, responsibilities and the process for demonstrating with reasonable assurance that a procured item for potential nuclear safety related services or other essential plant service is adequate for its application. This paper does not cover the details of technical evaluation, selecting critical characteristics, selecting acceptance methods, performing failure modes and effects analysis, performing source surveillance, performing quality surveys, performing special tests and inspections, and the other aspects of effective Procurement Engineering and Commercial Grade Item Dedication. The main contribution of this paper is to provide an overview of Procurement Engineering Process for commercial grade item

  18. Intentional forgetting reduces color-naming interference: evidence from item-method directed forgetting.

    Science.gov (United States)

    Lee, Yuh-Shiow; Lee, Huang-Mou; Fawcett, Jonathan M

    2013-01-01

    In an item-method-directed forgetting task, Chinese words were presented individually, each followed by an instruction to remember or forget. Colored probe items were presented following each memory instruction requiring a speeded color-naming response. Half of the probe items were novel and unrelated to the preceding study item, whereas the remaining half of the probe items were a repetition of the preceding study item. Repeated probe items were either identical to the preceding study item (E1, E2), a phonetic reproduction of the preceding study item (E3), or perceptually matched to the preceding study item (E4). Color-naming interference was calculated by subtracting color-naming reaction times made in response to a string of meaningless symbols from that of the novel and repeated conditions. Across all experiments, participants recalled more to-be-remembered (TBR) than to-be-forgotten (TBF) study words. More importantly, Experiments 1 and 2 found that color-naming interference was reduced for repeated TBF words relative to repeated TBR words. Experiments 3 and 4 further found that this effect occurred at the perceptual rather than semantic level. These findings suggest that participants may bias processing resources away from the perceptual representation of to-be-forgotten information.

  19. Analysis Test of Understanding of Vectors with the Three-Parameter Logistic Model of Item Response Theory and Item Response Curves Technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-01-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming…
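
    For reference, the three-parameter logistic model named in this record gives the probability of a correct answer as the guessing parameter plus the remaining probability mass times a logistic function of ability. The Python sketch below uses the plain logistic form without the optional 1.7 scaling constant; the parameter values are hypothetical and are not estimates from the TUV.

```python
import math


def three_pl(theta, a, b, c):
    """3PL item response function.

    a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderately discriminating, average difficulty,
# five answer options (so a guessing floor of roughly 0.2).
a, b, c = 1.2, 0.5, 0.2
curve = [(theta / 2, three_pl(theta / 2, a, b, c)) for theta in range(-8, 9)]
```

    At θ equal to the difficulty the probability is halfway between the guessing floor and one, and for very low ability it approaches the guessing parameter.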

  20. Cash Impact of the Consumable Item Transfer, Phase II

    National Research Council Canada - National Science Library

    1998-01-01

    ...). This report is the third in a series of reports regarding the consumable item transfer (CIT), phase II. The Deputy Secretary of Defense directed the transfer of the management of consumable items to Defense Logistics Agency...

  1. A confirmative clinimetric analysis of the 36-item Family Assessment Device.

    Science.gov (United States)

    Timmerby, Nina; Cosci, Fiammetta; Watson, Maggie; Csillag, Claudio; Schmitt, Florence; Steck, Barbara; Bech, Per; Thastum, Mikael

    2018-02-07

    The Family Assessment Device (FAD) is a 60-item questionnaire widely used to evaluate self-reported family functioning. However, the factor structure as well as the number of items has been questioned. A shorter and more user-friendly version of the original FAD-scale, the 36-item FAD, has therefore previously been proposed, based on findings in a nonclinical population of adults. We aimed in this study to evaluate the brief 36-item version of the FAD in a clinical population. Data from a European multinational study, examining factors associated with levels of family functioning in adult cancer patients' families, were used. Both healthy and ill parents completed the 60-item version FAD. The psychometric analyses conducted were Principal Component Analysis and Mokken-analysis. A total of 564 participants were included. Based on the psychometric analysis we confirmed that the 36-item version of the FAD has robust psychometric properties and can be used in clinical populations. The present analysis confirmed that the 36-item version of the FAD (18 items assessing 'well-being' and 18 items assessing 'dysfunctional' family function) is a brief scale where the summed total score is a valid measure of the dimensions of family functioning. This shorter version of the FAD is, in accordance with the concept of 'measurement-based care', an easy to use scale that could be considered when the aim is to evaluate self-reported family functioning.

  2. [Instrument to measure adherence in hypertensive patients: contribution of Item Response Theory].

    Science.gov (United States)

    Rodrigues, Malvina Thaís Pacheco; Moreira, Thereza Maria Magalhaes; Vasconcelos, Alexandre Meira de; Andrade, Dalton Francisco de; Silva, Daniele Braz da; Barbetta, Pedro Alberto

    2013-06-01

    To analyze, by means of "Item Response Theory", an instrument to measure adherence to treatment for hypertension. Analytical study with 406 hypertensive patients with associated complications seen in primary care in Fortaleza, CE, Northeastern Brazil, 2011 using "Item Response Theory". The stages were: dimensionality test, calibrating the items, processing data and creating a scale, analyzed using the graded response model. A study of the dimensionality of the instrument was conducted by analyzing the polychoric correlation matrix and full-information factor analysis. Multilog software was used to calibrate items and estimate the scores. Items relating to drug therapy are the most directly related to adherence while those relating to drug-free therapy need to be reworked because they have less psychometric information and low discrimination. The independence of items, the small number of levels in the scale and low explained variance in the adjustment of the models show the main weaknesses of the instrument analyzed. The "Item Response Theory" proved to be a relevant analysis technique because it evaluated respondents for adherence to treatment for hypertension, the level of difficulty of the items and their ability to discriminate between individuals with different levels of adherence, which generates a greater amount of information. The instrument analyzed is limited in measuring adherence to hypertension treatment, as shown by the "Item Response Theory" analysis of its items, and needs adjustment. The proper formulation of the items is important in order to accurately measure the desired latent trait.

  3. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index.

    Science.gov (United States)

    Roelen, Corné A M; van Rhenen, Willem; Groothoff, Johan W; van der Klink, Jac J L; Twisk, Jos W R; Heymans, Martijn W

    2014-07-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. This prospective cohort study comprised 11 537 male construction workers, who completed the WAI at baseline and reported DP after a mean 2.3 years of follow-up. WAS and WAI were calibrated for DP risk predictions with the Hosmer-Lemeshow (H-L) test and their ability to discriminate between high- and low-risk construction workers was investigated with the area under the receiver operating characteristic curve (AUC). At follow-up, 336 (3%) construction workers reported DP. Both WAS [odds ratio (OR) 0.72, 95% confidence interval (95% CI) 0.66-0.78] and WAI (OR 0.57, 95% CI 0.52-0.63) scores were associated with DP at follow-up. The WAS showed miscalibration (H-L model χ²=10.60; df=3; P=0.01) and poorly discriminated between high- and low-risk construction workers (AUC 0.67, 95% CI 0.64-0.70). In contrast, calibration (H-L model χ²=8.20; df=8; P=0.41) and discrimination (AUC 0.78, 95% CI 0.75-0.80) were both adequate for the WAI. Although associated with the risk of future DP, the single-item WAS poorly identified male construction workers at risk of DP. We recommend using the multi-item WAI to screen for risk of DP in occupational health practice.
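
    The discrimination statistic reported here, the AUC, can be read as the probability that a randomly chosen worker who later reports DP receives a higher predicted risk than a randomly chosen worker who does not. The sketch below computes it by direct pairwise comparison. It is an illustration only: the risk scores are hypothetical, and it is not the study's analysis code.

```python
def auc(scores_with_event, scores_without_event):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (event, non-event) pairs in which the event
    case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for s_event in scores_with_event:
        for s_none in scores_without_event:
            if s_event > s_none:
                wins += 1.0
            elif s_event == s_none:
                wins += 0.5
    return wins / (len(scores_with_event) * len(scores_without_event))

# Hypothetical predicted risks (not data from the study).
risk_dp = [0.30, 0.12, 0.08]
risk_no_dp = [0.05, 0.10, 0.02, 0.15]
print(auc(risk_dp, risk_no_dp))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for comparing the reported 0.67 (WAS) and 0.78 (WAI).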

  4. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate and massive objects require a longer procedure and will therefore take longer.

  5. Conjunctive and Disjunctive Item Response Functions.

    Science.gov (United States)

    1984-10-01

    fed set ofvaluesof a, b, AI , B1 A2 2 . 2 A3 , and 13 , the f ’. g ’a. nd h’a in (7) are fied. Equation (7) must still hold for S - e19029e3,..* . Thus...for Item I Is -- b ?(a:1 , b1 ,O) (1 + ’)(I + e4 (22 where a and pi are arbitrary constants. These constants mst be the sam for all Items In a given...NETHERLIS I E3I1 Focility-Acquisitions 4133 Rugby Avnue 1 Lee Cronbach Bethesda, NO 20014 16 Laburnue Road Atherton, CA 94205 1 Dr. Benjamin A. Fairbank

  6. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…
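
    The Mantel-Haenszel benchmark mentioned in this abstract is commonly computed as a common odds ratio pooled over ability strata, then re-expressed on the ETS delta scale. The sketch below shows that standard calculation. It does not reproduce the NCDIF effect size the study develops, and the 2x2 counts are hypothetical.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across strata of matched examinees.

    Each stratum is a tuple (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def mh_d_dif(alpha):
    """ETS delta-scale transformation of the common odds ratio."""
    return -2.35 * math.log(alpha)

# Hypothetical counts for two score strata (not data from the study).
strata = [(30, 10, 20, 20), (40, 10, 30, 20)]
alpha = mantel_haenszel_odds_ratio(strata)
print(alpha, mh_d_dif(alpha))
```

    An odds ratio above 1 (negative delta) indicates that the item favors the reference group after matching on ability.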

  7. An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

    Science.gov (United States)

    Ames, Allison J.; Penfield, Randall D.

    2015-01-01

    Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…

  8. INVESTIGATION OF MIS ITEM 011589A AND 3013 CONTAINERS HAVING SIMILAR CHARACTERISTICS

    Energy Technology Data Exchange (ETDEWEB)

    Friday, G

    2006-08-23

    Recent testing has identified the presence of hydrogen and oxygen in MIS Item 011589A. This isolated observation has effectuated concern regarding the potential for flammable gas mixtures in containers in the storage inventory. This study examines the known physicochemical characteristics of MIS Item 011589A and queries the ISP Database for items that are most similar or potentially similar. Items identified as most similar are believed to have the highest probability of being chemically and structurally identical to MIS Item 011589A. Items identified as potentially like MIS Item 011589A have some attributes in common, have the potential to generate gases, but have a lower probability of having similar gas generating characteristics. MIS Item 011589A is an oxide that was generated prior to 1990 at Rocky Flats in Building 707. It was associated with foundry processing and had an actinide assay of approximately 77%. Prompt gamma analysis of MIS Item 011589A indicated the presence of chloride, fluorine, magnesium, sodium, and aluminum. Queries based on MIS representation classification and process of origin were applied to the ISP Database. Evaluation criteria included binning classification (i.e., innocuous, pressure, or pressure and corrosion), availability of prompt gamma analyses, presence of chlorine and magnesium, percentage of chlorine by weight, peak ratios (i.e., Na:Cl and Mg:Na), moisture, and percent assay. These queries identified 15 items that were most similar and 106 items that were potentially like MIS Item 011589A. Although these queries identified containers that could potentially generate flammable gases, verification and confirmation can only be accomplished by destructive evaluation and testing of containers from the storage inventory.

  9. Quantum partial search for uneven distribution of multiple target items

    Science.gov (United States)

    Zhang, Kun; Korepin, Vladimir

    2018-06-01

    Quantum partial search algorithm is an approximate search. It aims to find a target block (which has the target items). It runs a little faster than full Grover search. In this paper, we consider quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different number of target items). The algorithm we describe can locate one of the target blocks. Efficiency of the algorithm is measured by number of queries to the oracle. We optimize the algorithm in order to improve efficiency. By perturbation method, we find that the algorithm runs the fastest when target items are evenly distributed in database.
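
    As background for the comparison with full Grover search, the sketch below evaluates the textbook success probability of full Grover search with several target items, where cost is counted in oracle queries (one per iteration). It is not the partial search algorithm of the paper, which locates only a target block; the database sizes used are hypothetical.

```python
import math

def grover_success_probability(n_items, n_targets, iterations):
    """Probability of measuring a target item after the given number
    of Grover iterations (each iteration uses one oracle query)."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

def grover_optimal_iterations(n_items, n_targets):
    """Number of iterations that brings the state closest to the
    target subspace without overshooting: floor(pi / (4 * theta))."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.floor(math.pi / (4 * theta))

# Hypothetical database: 1024 items, 4 of them targets.
k = grover_optimal_iterations(1024, 4)
print(k, grover_success_probability(1024, 4, k))
```

    For small target fractions the iteration count grows roughly as (pi/4) * sqrt(N/M), which is the baseline that partial search improves on slightly by giving up knowledge of the exact item.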

  10. Core outcome measurement instruments for clinical trials in nonspecific low back pain

    Science.gov (United States)

    Chiarotto, Alessandro; Boers, Maarten; Deyo, Richard A.; Buchbinder, Rachelle; Corbin, Terry P.; Costa, Leonardo O.P.; Foster, Nadine E.; Grotle, Margreth; Koes, Bart W.; Kovacs, Francisco M.; Lin, C.-W. Christine; Maher, Chris G.; Pearson, Adam M.; Peul, Wilco C.; Schoene, Mark L.; Turk, Dennis C.; van Tulder, Maurits W.; Terwee, Caroline B.; Ostelo, Raymond W.

    2018-01-01

    To standardize outcome reporting in clinical trials of patients with nonspecific low back pain, an international multidisciplinary panel recommended physical functioning, pain intensity, and health-related quality of life (HRQoL) as core outcome domains. Given the lack of a consensus on measurement instruments for these 3 domains in patients with low back pain, this study aimed to generate such consensus. The measurement properties of 17 patient-reported outcome measures for physical functioning, 3 for pain intensity, and 5 for HRQoL were appraised in 3 systematic reviews following the COSMIN methodology. Researchers, clinicians, and patients (n = 207) were invited in a 2-round Delphi survey to generate consensus (≥67% agreement among participants) on which instruments to endorse. Response rates were 44% and 41%, respectively. In round 1, consensus was achieved on the Oswestry Disability Index version 2.1a for physical functioning (78% agreement) and the Numeric Rating Scale (NRS) for pain intensity (75% agreement). No consensus was achieved on any HRQoL instrument, although the Short Form 12 (SF12) approached the consensus threshold (64% agreement). In round 2, a consensus was reached on an NRS version with a 1-week recall period (96% agreement). Various participants requested 1 free-to-use instrument per domain. Considering all issues together, recommendations on core instruments were formulated: Oswestry Disability Index version 2.1a or 24-item Roland-Morris Disability Questionnaire for physical functioning, NRS for pain intensity, and SF12 or 10-item PROMIS Global Health form for HRQoL. Further studies need to fill the evidence gaps on the measurement properties of these and other instruments. PMID:29194127
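
    The consensus rule in this abstract is a simple threshold test. The sketch below applies the stated cut-off (at least 67% agreement) to the agreement percentages reported in the abstract; the dictionary labels and function name are illustrative choices, not terms from the study.

```python
CONSENSUS_THRESHOLD = 67  # percent agreement required, as stated in the abstract

# Agreement percentages as reported in the abstract.
agreement = {
    "Oswestry Disability Index 2.1a (round 1)": 78,
    "Numeric Rating Scale (round 1)": 75,
    "Short Form 12 (round 1)": 64,
    "NRS with 1-week recall (round 2)": 96,
}

def reached_consensus(percent_agreement, threshold=CONSENSUS_THRESHOLD):
    """True when agreement meets or exceeds the consensus threshold."""
    return percent_agreement >= threshold

results = {name: reached_consensus(pct) for name, pct in agreement.items()}
for name, ok in results.items():
    print(f"{name}: {'consensus' if ok else 'no consensus'}")
```

    Only the SF12 falls short, by 3 percentage points, which matches the abstract's statement that it approached but did not reach the threshold.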

  11. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect

    DEFF Research Database (Denmark)

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-01-01

    AIMS: To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). METHODS: We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a ...

  12. Validation of a mobility item bank for older patients in primary care.

    Science.gov (United States)

    Cabrero-García, Julio; Ramos-Pichardo, Juan Diego; Muñoz-Mendoza, Carmen Luz; Cabañero-Martínez, María José; González-Llopis, Lorena; Reig-Ferrer, Abilio

    2012-12-05

    To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

  13. Development of a questionnaire to assess patient satisfaction with allergen-specific immunotherapy in adults: item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Justícia JL

    2011-05-01

    Jose Luis Justícia (1), Eva Baró (2), Victoria Cardona (3), Pedro Guardia (4), Pedro Ojeda (5), José Maria Olaguíbel (6), José Maria Vega (7), Carmen Vidal (8). (1) Medical Department, Stallergenes Ibérica, Barcelona, Spain; (2) Health Outcomes Research Department, 3D Health Research, Barcelona, Spain; (3) Hospital Vall d'Hebron, Barcelona, Spain; (4) Hospital Virgen Macarena, Sevilla, Spain; (5) Clínica de Asma y Alergia Dres. Ojeda, Madrid, Spain; (6) Complejo Hospitalario de Navarra, Pamplona, Spain; (7) Hospital Regional Universitario Carlos Haya, Málaga, Spain; (8) Complejo Hospitalario Universitario de Santiago, Santiago de Compostela, Spain. Background: Allergen-specific immunotherapy (SIT) is a treatment capable of modifying the natural course of allergy, so ensuring good adherence to SIT is fundamental. Up until now there has not existed an instrument specifically developed to measure patient satisfaction with SIT, although its assessment could help us to comprehend better and improve treatment adherence and effectiveness. The aim of this study was to develop an instrument to measure adult patient satisfaction with SIT. Methods: Items were generated from a literature review, focus groups with allergic adult patients undergoing SIT, and a meeting with experts. Potential items were administered to allergic patients undergoing SIT in an observational, cross-sectional, multicenter study. Item reduction was based on quantitative and qualitative criteria. A preliminary assessment of feasibility, reliability, and validity of the retained items was performed. Results: An initial pool of 70 items was administered to 257 patients undergoing SIT. Fifty-four items were eliminated resulting in a provisional instrument with 16 items. Factor analysis yielded four factors that were identified as perceived efficacy, activities and environment, cost-benefit balance, and overall satisfaction, explaining 74.8% of variance. Ceiling and floor effects were negligible for overall score. Overall score was

  14. Development of a lack of appetite item bank for computer-adaptive testing (CAT)

    DEFF Research Database (Denmark)

    Thamsborg, Lise Laurberg Holst; Petersen, Morten Aa; Aaronson, Neil K

    2015-01-01

    to 12 lack of appetite items. CONCLUSIONS: Phases 1-3 resulted in 12 lack of appetite candidate items. Based on a field testing (phase 4), the psychometric characteristics of the items will be assessed and the final item bank will be generated. This CAT item bank is expected to provide precise...

  15. Omani Students' Views about Global Warming: Beliefs about Actions and Willingness to Act

    Science.gov (United States)

    Ambusaidi, Abdullah; Boyes, Edward; Stanisstreet, Martin; Taylor, Neil

    2012-01-01

    A 44-item questionnaire was designed to determine students' views about how useful various "specific" actions might be in helping to reduce global warming, their willingness to undertake these various actions and the extent to which these two might be related. The instrument was administered to students in Grades 6 to 12 (N = 1532) from…

  16. Graphical modeling for item difficulty in medical faculty exams

    African Journals Online (AJOL)

    . Conclusion: The ... difficulty criteria. Key words: Item difficulty, quality control, statistical process control, variable control charts ..... assumed that 68% of the values fall in the interval ± 1.S; .... The balance of the construction of items of exam has ...

  17. Exploring differential item functioning (DIF) with the Rasch model: A comparison of gender differences on eighth-grade science items in the United States and Spain

    Science.gov (United States)

    Calvert, Tasha

    Despite the attention that has been given to gender and science, boys continue to outperform girls in science achievement, particularly by the end of secondary school. Because it is unclear whether gender differences have narrowed over time (Leder, 1992; Willingham & Cole, 1997), it is important to continue a line of inquiry into the nature of gender differences, specifically at the international level. The purpose of this study was to investigate gender differences in science achievement across two countries: United States and Spain. A secondary purpose was to demonstrate an alternative method for exploring gender differences based on the many-faceted Rasch model (1980). A secondary analysis of the data from the Third International Mathematics and Science Study (TIMSS) was used to examine the relationship between gender DIF (differential item functioning) and item characteristics (item type, content, and performance expectation) across both countries. Nationally representative samples of eighth grade students in the United States and Spain who participated in TIMSS were analyzed to answer the research questions in this study. In both countries, girls showed an advantage over boys on life science items and most extended response items, whereas boys, by and large, had an advantage on earth science, physics, and chemistry items. However, even within areas that favored boys, such as physics, there were items that were differentially easier for girls. In general, patterns in gender differences were similar across both countries although there were a few differences between the countries on individual items. It was concluded that simply looking at mean differences does not provide an adequate understanding of the nature of gender differences in science achievement.

  18. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats such as Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all
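
    The first design step described above, setting level boundaries at the mean of the item thresholds, can be sketched as follows. This is one plausible reading of the abstract, not the dissertation's procedure: the threshold values, the assumption that every item has the same number of ordered thresholds, and the function names are all hypothetical.

```python
from bisect import bisect_right
from statistics import mean

def level_boundaries(item_thresholds):
    """Average the k-th threshold across items to get the k-th boundary.

    item_thresholds: one list per item, each holding that item's
    ordered thresholds on the IRT (logit) scale.
    """
    return [mean(kth) for kth in zip(*item_thresholds)]

def classify(theta, boundaries):
    """Return the level (1-based) for an ability estimate theta."""
    return bisect_right(boundaries, theta) + 1

# Hypothetical thresholds for three items with four levels each.
thresholds = [
    [-1.0, 0.0, 1.0],
    [-0.5, 0.5, 1.5],
    [-1.5, -0.5, 0.5],
]
boundaries = level_boundaries(thresholds)
print(boundaries, classify(0.3, boundaries))
```

    Averaging across items smooths out item-specific noise in where each level transition falls, at the cost of assuming the items share a common level structure.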

  19. Intentional Forgetting Reduces Color-Naming Interference: Evidence from Item-Method Directed Forgetting

    Science.gov (United States)

    Lee, Yuh-shiow; Lee, Huang-mou; Fawcett, Jonathan M.

    2013-01-01

    In an item-method-directed forgetting task, Chinese words were presented individually, each followed by an instruction to remember or forget. Colored probe items were presented following each memory instruction requiring a speeded color-naming response. Half of the probe items were novel and unrelated to the preceding study item, whereas the…

  20. The Long-Term Conditions Questionnaire: conceptual framework and item development.

    Science.gov (United States)

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A'Court, Christine; Fitzpatrick, Ray

    2016-01-01

    To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey.

  1. Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

    Science.gov (United States)

    He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei

    2013-01-01

    Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

  2. Development and initial validation of a brief self-report measure of cognitive dysfunction in fibromyalgia.

    Science.gov (United States)

    Kratz, Anna L; Schilling, Stephen G; Goesling, Jenna; Williams, David A

    2015-06-01

    Pain is often the focus of research and clinical care in fibromyalgia (FM); however, cognitive dysfunction is also a common, distressing, and disabling symptom in FM. Current efforts to address this problem are limited by the lack of a comprehensive, valid measure of subjective cognitive dysfunction in FM that is easily interpretable, accessible, and brief. The purpose of this study was to leverage cognitive functioning item banks that were developed as part of the Patient Reported Outcomes Measurement Information System (PROMIS) to devise a 10-item short form measure of cognitive functioning for use in FM. In study 1, a nationwide (U.S.) sample of 1,035 adults with FM (age range = 18-82, 95.2% female) completed 2 cognitive item pools. Factor analyses and item response theory analyses were used to identify dimensionality and optimally performing items. A recommended 10-item measure, called the Multidimensional Inventory of Subjective Cognitive Impairment (MISCI) was created. In study 2, 232 adults with FM completed the MISCI and a legacy measure of cognitive functioning that is used in FM clinical trials, the Multiple Ability Self-Report Questionnaire (MASQ). The MISCI showed excellent internal reliability, low ceiling/floor effects, and good convergent validity with the MASQ (r = -.82). This paper presents the MISCI, a 10-item measure of cognitive dysfunction in FM, developed through classical test theory and item response theory. This brief but comprehensive measure shows evidence of excellent construct validity through large correlations with a lengthy legacy measure of cognitive functioning. Copyright © 2015 American Pain Society. Published by Elsevier Inc. All rights reserved.

  3. The impact of item order on ratings of cancer risk perception.

    Science.gov (United States)

    Taylor, Kathryn L; Shelby, Rebecca A; Schwartz, Marc D; Ackerman, Josh; LaSalle, V Holland; Gelmann, Edward P; McGuire, Colleen

    2002-07-01

    Although perceived risk is central to most theories of health behavior, there is little consensus on its measurement with regard to item wording, response set, or the number of items to include. In a methodological assessment of perceived risk, we assessed the impact of changing the order of three commonly used perceived risk items: quantitative personal risk, quantitative population risk, and comparative risk. Participants were 432 men and women enrolled in an ancillary study of the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. Three groups of consecutively enrolled participants responded to the three items in one of three question orders. Results indicated that item order was related to the perceived risk ratings of both ovarian (P Perceptions of risk were significantly lower when the comparative rating was made first. The findings suggest that compelling participants to consider their own risk relative to the risk of others results in lower ratings of perceived risk. Although the use of multiple items may provide more information than when only a single method is used, different conclusions may be reached depending on the context in which an item is assessed.

  4. Representation of Item Position in Immediate Serial Recall: Evidence from Intrusion Errors

    Science.gov (United States)

    Fischer-Baum, Simon; McCloskey, Michael

    2015-01-01

    In immediate serial recall, participants are asked to recall novel sequences of items in the correct order. Theories of the representations and processes required for this task differ in how order information is maintained; some have argued that order is represented through item-to-item associations, while others have argued that each item is…

  5. The Dif Identification in Constructed Response Items Using Partial Credit Model

    OpenAIRE

    Heri Retnawati

    2017-01-01

    The study was to identify the load, the type and the significance of differential item functioning (DIF) in constructed response item using the partial credit model (PCM). The data in the study were the students’ instruments and the students’ responses toward the PISA-like test items that had been completed by 386 ninth grade students and 460 tenth grade students who had been about 15 years old in the Province of Yogyakarta Special Region in Indonesia. The analysis toward the item characteris...

  6. Global scene layout modulates contextual learning in change detection

    Directory of Open Access Journals (Sweden)

    Markus Conci

    2014-02-01

    Change in the visual scene often goes unnoticed – a phenomenon referred to as ‘change blindness’. This study examined whether the hierarchical structure, i.e., the global-local layout of a scene can influence performance in a one-shot change detection paradigm. To this end, natural scenes of a laid breakfast table were presented, and observers were asked to locate the onset of a new local object. Importantly, the global structure of the scene was manipulated by varying the relations among objects in the scene layouts. The very same items were either presented as global-congruent (typical) layouts or as global-incongruent (random) arrangements. Change blindness was less severe for congruent than for incongruent displays, and this congruency benefit increased with the duration of the experiment. These findings show that global layouts are learned, supporting detection of local changes with enhanced efficiency. However, performance was not affected by scene congruency in a subsequent control experiment that required observers to localize a static discontinuity (i.e., an object that was missing from the repeated layouts). Our results thus show that learning of the global layout is particularly linked to the local objects. Taken together, our results reveal an effect of global precedence in natural scenes. We suggest that relational properties within the hierarchy of a natural scene are governed, in particular, by global image analysis, reducing change blindness for local objects through scene learning.

  7. Global scene layout modulates contextual learning in change detection.

    Science.gov (United States)

    Conci, Markus; Müller, Hermann J

    2014-01-01

    Change in the visual scene often goes unnoticed - a phenomenon referred to as "change blindness." This study examined whether the hierarchical structure, i.e., the global-local layout of a scene can influence performance in a one-shot change detection paradigm. To this end, natural scenes of a laid breakfast table were presented, and observers were asked to locate the onset of a new local object. Importantly, the global structure of the scene was manipulated by varying the relations among objects in the scene layouts. The very same items were either presented as global-congruent (typical) layouts or as global-incongruent (random) arrangements. Change blindness was less severe for congruent than for incongruent displays, and this congruency benefit increased with the duration of the experiment. These findings show that global layouts are learned, supporting detection of local changes with enhanced efficiency. However, performance was not affected by scene congruency in a subsequent control experiment that required observers to localize a static discontinuity (i.e., an object that was missing from the repeated layouts). Our results thus show that learning of the global layout is particularly linked to the local objects. Taken together, our results reveal an effect of "global precedence" in natural scenes. We suggest that relational properties within the hierarchy of a natural scene are governed, in particular, by global image analysis, reducing change blindness for local objects through scene learning.

  8. Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

    Science.gov (United States)

    Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

    2015-12-01

    The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity. © The Author(s) 2014.
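
    The internal-consistency figures quoted above (α = .82 to α = .96) are Cronbach's alpha values. As a minimal sketch of how such a coefficient is computed, assuming complete item scores and the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), the function below may help. The function name and the example data are hypothetical and are not taken from the DOCS study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists.

    Each inner list holds one item's scores, one entry per respondent,
    and all inner lists are assumed to have the same length.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical data: 3 items answered by 5 respondents.
example_items = [
    [0, 1, 2, 3, 4],
    [1, 1, 2, 4, 4],
    [0, 2, 2, 3, 3],
]
alpha = cronbach_alpha(example_items)
```

    Items that rise and fall together across respondents push the variance of the total score well above the sum of the item variances, which is what drives alpha toward 1.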

  9. Optimal lot sizing in screening processes with returnable defective items

    Science.gov (United States)

    Vishkaei, Behzad Maleki; Niaki, S. T. A.; Farhangi, Milad; Rashti, Mehdi Ebrahimnezhad Moghadam

    2014-07-01

    This paper is an extension of Hsu and Hsu (Int J Ind Eng Comput 3(5):939-948, 2012) aiming to determine the optimal order quantity of product batches that contain defective items with percentage nonconforming following a known probability density function. The orders are subject to a 100 % screening process at a rate higher than the demand rate. Shortage is backordered, and defective items in each ordering cycle are stored in a warehouse to be returned to the supplier when a new order is received. Although the retailer does not sell defective items at a lower price and only trades perfect items (to avoid loss), a higher holding cost is incurred to store defective items. Using the renewal-reward theorem, the optimal order and shortage quantities are determined. Some numerical examples are solved at the end to clarify the applicability of the proposed model and to compare the new policy to an existing one. The results show that the new policy provides better expected profit per time.

  10. Optimizing incomplete sample designs for item response model parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.

    Several models for optimizing incomplete sample designs with respect to information on the item parameters are presented. The following cases are considered: (1) known ability parameters; (2) unknown ability parameters; (3) item sets with multiple ability scales; and (4) response models with

  11. Using Reversed MFCC and IT-EM for Automatic Speaker Verification

    Directory of Open Access Journals (Sweden)

    Sheeraz Memon

    2012-01-01

    This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Coefficients) and IT-EM (Information Theoretic Expectation Maximization). To perform speaker verification, feature extraction using the Mel scale has been widely applied and has established better results. The IMFCC is based on the inverse Mel scale. The IMFCC effectively captures information available at the high-frequency formants, which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at the input level is proposed. GMMs (Gaussian Mixture Models) based on EM (Expectation Maximization) have been widely used for classification in text-independent verification. However, EM suffers from convergence issues. In this paper we use our proposed IT-EM, which has faster convergence, to train speaker models. IT-EM uses information theory principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the weights, means and covariances, like EM. However, the IT-EM process is not performed on feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM process simultaneously reduces the divergence measure between PDE estimates of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and IT-EM are tested for the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC has better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector size is also much smaller than the delta MFCC, thus reducing the computational burden as well. The IT-EM method also showed faster convergence than the conventional EM method, and thus it leads to higher speaker recognition scores.

  12. Designing a Virtual Item Bank Based on the Techniques of Image Processing

    Science.gov (United States)

    Liao, Wen-Wei; Ho, Rong-Guey

    2011-01-01

    One of the major weaknesses of the item exposure rates of figural items in Intelligence Quotient (IQ) tests lies in their inaccuracies. In this study, a new approach is proposed and a useful test tool known as the Virtual Item Bank (VIB) is introduced. The VIB combines Automatic Item Generation theory and image processing theory with the concepts of…

  13. Developing offshore outsourcing practices in a global selective outsourcing environment – the IT supplier’s viewpoint

    Directory of Open Access Journals (Sweden)

    Anne-Maarit University of Turku

    2017-01-01

    Currently, internal IT organizations use outsourcing and offshore arrangements to achieve cost savings and gain access to new capabilities. It was found that suppliers’ personnel at the operational level can face challenges with internalizing their operations based on the agreed outsourcing practices and transferred responsibilities. This study gives voice to the supplier and studies the impact of offshore outsourcing operation development activities. The internal IT unit from Nokia Devices selectively outsourced global IT service activities and responsibilities to the IT supplier. The outsourced activities were implemented by offshore centers in India and China. It was found that the global selective outsourcing environment (GSOE) did not provide a solution to all of their expectations, and new unexpected challenges occurred. Several practices, communication and information sharing, and behavior-related lessons learned items were identified. It was found that the GSOE operation needs to be developed and implemented in an agile and incremental manner, instead of a singular implementation approach. Also, the globally distributed teams’ group dynamics critically affected the teams’ ability to work. The lessons learned items and recommendations can be utilized by other companies during their mode-of-operation development.

  14. Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation.

    Science.gov (United States)

    Liegl, Gregor; Wahl, Inka; Berghöfer, Anne; Nolte, Sandra; Pieh, Christoph; Rose, Matthias; Fischer, Felix

    2016-03-01

    To investigate the validity of a common depression metric in independent samples. We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple, for example, using a Web application (http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures. Copyright © 2016 Elsevier Inc. All rights reserved.
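
    The agreement check mentioned above relies on Bland-Altman plots. As a rough sketch of the numbers behind such a plot, the mean difference (bias) and the 95% limits of agreement can be computed as below. The function name, the example scores, and the fixed multiplier 1.96 are illustrative assumptions, not values or code from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(scores_a, scores_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired scores.

    scores_a and scores_b are assumed to be equally long lists holding
    two estimates of the same quantity for the same persons.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    # Half-width of the limits of agreement: z times the SD of the differences.
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical latent depression scores from two estimation approaches.
common_metric = [0.10, -0.40, 1.20, 0.75, -1.05]
reestimated = [0.15, -0.35, 1.10, 0.80, -1.00]
bias, lower, upper = bland_altman_limits(common_metric, reestimated)
```

    A bias near zero with narrow limits would correspond to the kind of agreement the abstract describes as clinically irrelevant differences.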

  15. ASIC design in the KM3NeT detector

    International Nuclear Information System (INIS)

    Gajanana, D; Gromov, V; Timmer, P

    2013-01-01

    In the KM3NeT project [1], Cherenkov light from muon interactions with the transparent matter around the detector is used to detect neutrinos. Photomultiplier tubes (PMTs), used as photon sensors, are housed in a glass sphere (the Optical Module) to detect single photons from the Cherenkov light. The PMT needs a high operational voltage (∼ 1.5 kV), which is generated by a Cockcroft-Walton (CW) multiplier circuit. The electronics required to control the PMTs and collect the signals are integrated in two ASICs, namely: 1) a front-end mixed-signal ASIC (PROMiS) for the readout of the PMT and 2) an analog ASIC (CoCo) to generate pulses for charging the CW circuit and to control the feedback of the CW circuit. In this article, we discuss the two integrated circuits and test results of the complete setup. PROMiS amplifies the input charge, converts it to a pulse width and delivers the information via LVDS signals. These LVDS signals carry accurate information on the Time of arrival ( 2 C bus. This unique combination of the ASICs results in a very cost- and power-efficient PMT base design.

  16. 48 CFR 46.202-1 - Contracts for commercial items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 1 2010-10-01 2010-10-01 false Contracts for commercial... CONTRACT MANAGEMENT QUALITY ASSURANCE Contract Quality Requirements 46.202-1 Contracts for commercial items. When acquiring commercial items (see part 12), the Government shall rely on contractors' existing...

  17. Algorithms for computerized test construction using classical item parameters

    NARCIS (Netherlands)

    Adema, Jos J.; van der Linden, Willem J.

    1989-01-01

    Recently, linear programming models for test construction were developed. These models were based on the information function from item response theory. In this paper another approach is followed. Two 0-1 linear programming models for the construction of tests using classical item and test

  18. Reduced-Item Food Audits Based on the Nutrition Environment Measures Surveys.

    Science.gov (United States)

    Partington, Susan N; Menzies, Tim J; Colburn, Trina A; Saelens, Brian E; Glanz, Karen

    2015-10-01

    The community food environment may contribute to obesity by influencing food choice. Store and restaurant audits are increasingly common methods for assessing food environments, but are time consuming and costly. A valid, reliable brief measurement tool is needed. The purpose of this study was to develop and validate reduced-item food environment audit tools for stores and restaurants. Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed in 820 stores and 1,795 restaurants in West Virginia, San Diego, and Seattle. Data mining techniques (correlation-based feature selection and linear regression) were used to identify survey items highly correlated to total survey scores and produce reduced-item audit tools that were subsequently validated against full NEMS surveys. Regression coefficients were used as weights that were applied to reduced-item tool items to generate comparable scores to full NEMS surveys. Data were collected and analyzed in 2008-2013. The reduced-item tools included eight items for grocery, ten for convenience, seven for variety, and five for other stores; and 16 items for sit-down, 14 for fast casual, 19 for fast food, and 13 for specialty restaurants, representing 10% of the full NEMS-S and 25% of the full NEMS-R. There were no significant differences in median scores for varying types of retail food outlets when compared to the full survey scores. Median in-store audit time was reduced 25%-50%. Reduced-item audit tools can reduce the burden and complexity of large-scale or repeated assessments of the retail food environment without compromising measurement quality. Copyright © 2015 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  19. Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification.

    Directory of Open Access Journals (Sweden)

    Alexander J Millner

    Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single-items. We found that 11.3% of participants that endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide.
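
    To give a feel for why misclassification of this size costs statistical power, the sketch below works through a simplified back-of-the-envelope calculation. It is not the authors' simulation. It assumes that misclassified endorsers resemble non-cases on the outcome of interest and that group variances are unchanged, so a true group difference shrinks by the factor (1 − m), and the sample size needed to detect it grows by roughly 1/(1 − m)². Both function names are hypothetical.

```python
def attenuated_effect(true_effect, misclassified_share):
    """Observed group difference when a share of the 'case' group
    are actually non-cases (simplifying assumption: they look exactly
    like the comparison group on the outcome)."""
    if not 0 <= misclassified_share < 1:
        raise ValueError("misclassified_share must be in [0, 1)")
    return (1 - misclassified_share) * true_effect


def sample_size_inflation(misclassified_share):
    """Approximate factor by which the required sample size grows,
    using the rule of thumb that required n scales with 1 / effect**2."""
    if not 0 <= misclassified_share < 1:
        raise ValueError("misclassified_share must be in [0, 1)")
    return 1 / (1 - misclassified_share) ** 2


# 11.3% is the attempt-item misclassification share reported in the abstract.
inflation = sample_size_inflation(0.113)
```

    Under these assumptions, an 11.3% misclassified share would call for a sample roughly 27% larger to keep the same power; the simulations in the paper may well model the problem differently.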

  20. Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot

    Science.gov (United States)

    Magis, David; Facon, Bruno

    2013-01-01

    Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…

  1. Translation of Culture-Specific Items in Self-Help Literature: A Study on Domestication and Foreignization Strategies

    Directory of Open Access Journals (Sweden)

    Volga Yılmaz-Gümüş

    2012-05-01

    The last two decades have witnessed a boom in self-help materials in both global and local markets. This self-help trend, growing rapidly in our modern day, should be an area of interest for Translation Studies as an increasing number of self-help materials have been translated particularly from English every year. Self-help books involve a great deal of references to the material and social culture of the original country. One of the key issues in the translation of self-help books is the choice between foreignizing and domesticating these culture-specific items. This paper aims to discuss the procedures used for the translation of culture-specific items with regard to the particular function that these books assume in the target society. The analysis on the example of Outliers, a self-help book of sorts written by Malcolm Gladwell, has shown that the translator mostly adopted foreignizing strategies in translating the text into Turkish. The study also discusses whether these foreignizing strategies contribute to the fulfillment of target-text function, which is to provide quick-fix remedies to people struggling with modern-day challenges and demands.

  2. Statistical Indexes for Monitoring Item Behavior under Computer Adaptive Testing Environment.

    Science.gov (United States)

    Zhu, Renbang; Yu, Feng; Liu, Su

    A computerized adaptive test (CAT) administration usually requires a large supply of items with accurately estimated psychometric properties, such as item response theory (IRT) parameter estimates, to ensure the precision of examinee ability estimation. However, an estimated IRT model of a given item in any given pool does not always correctly…

  3. 41 CFR 101-26.103-2 - Restriction on personal convenience items.

    Science.gov (United States)

    2010-07-01

    ... convenience items. 101-26.103-2 Section 101-26.103-2 Public Contracts and Property Management Federal Property... SOURCES AND PROGRAM 26.1-General § 101-26.103-2 Restriction on personal convenience items. Government... type items intended solely for the personal convenience or to satisfy the personal desire of an...

  4. Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)

    Science.gov (United States)

    Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn

    2018-01-01

    The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…

  5. Developing and testing items for the South African Personality Inventory (SAPI)

    Directory of Open Access Journals (Sweden)

    Carin Hill

    2013-11-01

    Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for the study: The study intended to develop an indigenous and psychometrically sound personality instrument that adheres to the requirements of South African legislation and excludes cultural bias. Research design, approach and method: The authors used a cross-sectional design. They measured the nine SAPI clusters identified in the qualitative stage of the SAPI project in 11 separate quantitative studies. Convenience sampling yielded 6735 participants. Statistical analysis focused on the construct validity and reliability of items. The authors eliminated items that showed poor performance, based on common psychometric criteria, and selected the best performing items to form part of the final version of the SAPI. Main findings: The authors developed 2573 items from the nine SAPI clusters. Of these, 2268 items were valid and reliable representations of the SAPI facets. Practical/managerial implications: The authors developed a large item pool. It measures personality in South Africa. Researchers can refine it for the SAPI. Furthermore, the project illustrates an approach that researchers can use in projects that aim to develop culturally-informed psychological measures. Contribution/value-add: Personality assessment is important for recruiting, selecting and developing employees. This study contributes to the current knowledge about the early processes researchers follow when they develop a personality instrument that measures personality fairly in different cultural groups, as the SAPI does.

  6. Validity and Reliability of the 8-Item Work Limitations Questionnaire.

    Science.gov (United States)

    Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

    2017-12-01

    Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.

  7. Application of Item Response Theory to Tests of Substance-related Associative Memory

    Science.gov (United States)

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14- and 15-items in the alcohol- and marijuana-related WAT, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
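
    The one- and two-parameter logistic models named above share a simple item response function: the probability of endorsing an item is a logistic function of the distance between a person's trait level θ and the item's location b, scaled by the discrimination a. A minimal sketch follows. The function names are hypothetical, and fixing the common 1PL discrimination at 1 is an assumption made here for illustration, not a detail from the study.

```python
import math


def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function.

    theta: person trait level, a: item discrimination, b: item location.
    Returns the probability of a keyed (e.g. substance-related) response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """One-parameter logistic model: all items share one discrimination,
    assumed here to be fixed at 1."""
    return irf_2pl(theta, 1.0, b)


# Hypothetical item with discrimination 1.5 and location 0.2,
# evaluated for a person half a unit above the item location.
p = irf_2pl(theta=0.7, a=1.5, b=0.2)
```

    When θ equals b the probability is exactly 0.5, and a larger a makes the curve steeper around that point, which is the sense in which 2PL items can differ in how sharply they separate respondents.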

  8. Development and validation of a ten-item questionnaire with explanatory illustrations to assess upper extremity disorders: favorable effect of illustrations in the item reduction process.

    Science.gov (United States)

    Kurimoto, Shigeru; Suzuki, Mikako; Yamamoto, Michiro; Okui, Nobuyuki; Imaeda, Toshihiko; Hirata, Hitoshi

    2011-11-01

    The purpose of this study is to develop a short and valid measure for upper extremity disorders and to assess the effect of attached illustrations in item reduction of a self-administered disability questionnaire while retaining psychometric properties. A validated questionnaire used to assess upper extremity disorders, the Hand20, was reduced to ten items using two item-reduction techniques. The psychometric properties of the abbreviated form, the Hand10, were evaluated on an independent sample that was used for the shortening process. Validity, reliability, and responsiveness of the Hand10 were retained in the item reduction process. It was possible that the use of explanatory illustrations attached to the Hand10 helped with its reproducibility. The illustrations for the Hand10 promoted text comprehension and motivation to answer the items. These changes resulted in high acceptability; more than 99.3% of patients, including 98.5% of elderly patients, could complete the Hand10 properly. The illustrations had favorable effects on the item reduction process and made it possible to retain precision of the instrument. The Hand10 is a reliable and valid instrument for individual-level applications with the advantage of being compact and broadly applicable, even in elderly individuals.

  9. Effects of Reducing the Cognitive Load of Mathematics Test Items on Student Performance

    Directory of Open Access Journals (Sweden)

    Susan C. Gillmor

    2015-01-01

    This study explores a new item-writing framework for improving the validity of math assessment items. The authors transfer insights from Cognitive Load Theory (CLT), traditionally used in instructional design, to educational measurement. Fifteen multiple-choice math assessment items were modified using research-based strategies for reducing extraneous cognitive load. An experimental design with 222 middle-school students tested the effects of the reduced cognitive load items on student performance and anxiety. Significant findings confirm the main research hypothesis that reducing the cognitive load of math assessment items improves student performance. Three load-reducing item modifications are identified as particularly effective for reducing item difficulty: signalling important information, aesthetic item organization, and removing extraneous content. Load reduction was not shown to impact student anxiety. Implications for classroom assessment and future research are discussed.

  10. Response Mixture Modeling: Accounting for Heterogeneity in Item Characteristics across Response Times.

    Science.gov (United States)

    Molenaar, Dylan; de Boeck, Paul

    2018-06-01

    In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.

  11. Prevalence of item level negative symptoms in first episode psychosis diagnoses.

    LENUS (Irish Health Repository)

    Lyne, John

    2012-03-01

    The relevance of negative symptoms across the diagnostic spectrum of the psychoses remains uncertain. The purpose of this study was to report on prevalence of item and subscale level negative symptoms across the first episode psychosis (FEP) diagnostic spectrum in an epidemiological sample, and to ascertain whether items and subscales were more prevalent in a schizophrenia spectrum diagnoses group compared to an 'all other psychotic diagnoses' group. We measured negative symptoms in 330 patients presenting with FEP using the Scale for Assessment of Negative Symptoms (SANS), and ascertained diagnosis using the Structured Clinical Interview for DSM IV. Prevalence of SANS items and subscales was tabulated across all psychotic diagnoses, and logistic regression analysis determined which items and subscales were predictive of schizophrenia spectrum diagnoses. SANS items were most prevalent in schizophrenia spectrum conditions but frequently presented in other FEP diagnoses, particularly substance induced psychotic disorder and Major Depressive Disorder. Brief psychotic disorder and bipolar disorders had low levels of negative symptoms. SANS items and subscales which significantly predicted schizophrenia spectrum diagnoses were also frequently present in some of the other psychotic diagnoses. Conclusions: SANS items have high prevalence in FEP, and while commonest in schizophrenia spectrum conditions are not restricted to this diagnostic subgroup.

  12. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    Science.gov (United States)

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.

  13. ABORTION ATTITUDES, 1984-1987-1988 - EFFECTS OF ITEM ORDER AND DIMENSIONALITY

    NARCIS (Netherlands)

    TENVERGERT, E; GILLESPIE, MW; KINGMA, J; KLASEN, H

    The comparability of surveys is often hampered by differences in the item order of presentation. The major focus of the present study was to investigate whether a general item or a specific item at the beginning of the questionnaire would affect the endorsement as well as the scalability of a set of

  14. The Effects of Goal Relevance and Perceptual Features on Emotional Items and Associative Memory.

    Science.gov (United States)

    Mao, Wei B; An, Shu; Yang, Xiao F

    2017-01-01

    Showing an emotional item in a neutral background scene often leads to enhanced memory for the emotional item and impaired associative memory for background details. Meanwhile, both top-down goal relevance and bottom-up perceptual features play important roles in memory binding. We conducted two experiments to further examine the effects of goal relevance and perceptual features on emotional items and associative memory. By manipulating goal relevance (asking participants to categorize only each item image as living or non-living, or to categorize each whole composite picture consisting of item image and background scene as natural scene or manufactured scene) and perceptual features (controlling visual contrast and visual familiarity) in two experiments, we found that both high goal relevance and salient perceptual features (high salience of items vs. high familiarity of items) could promote emotional item memory, but they had different effects on associative memory for emotional items and neutral backgrounds. Specifically, high goal relevance and high perceptual salience of items could jointly impair the associative memory for emotional items and neutral backgrounds, while the effect of item familiarity on associative memory for emotional items would be modulated by goal relevance. High familiarity of items could increase associative memory for negative items and neutral backgrounds only in the low goal relevance condition. These findings suggest that the effect of emotion on associative memory is not only related to attentional capture elicited by emotion, but can also be affected by goal relevance and perceptual features of the stimulus.

  15. Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. A common-person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% of the variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.
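    The internal consistency statistic reported in this abstract, Cronbach's α, can be computed from item-level scores as sketched below. This is a generic illustration of the textbook formula, assuming complete data and sample variances; the function name and the toy scores are hypothetical and unrelated to the FIM-MDS data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores: one list per item, each holding the scores of the
    same respondents in the same order (no missing values).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent, summed across items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical toy data: three items answered by five respondents.
toy_items = [
    [2, 3, 4, 5, 1],
    [2, 4, 4, 5, 2],
    [3, 3, 5, 4, 1],
]
alpha = cronbach_alpha(toy_items)
```

    Alpha approaches 1 when the items rise and fall together across respondents, because the variance of the total score then greatly exceeds the sum of the individual item variances.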

  16. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  17. Item Response Theory analysis of Fagerström Test for Cigarette Dependence.

    Science.gov (United States)

    Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl

    2018-02-01

    The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI using Item Response Theory. The study was a secondary analysis of data collected in 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.
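    As background for the model applied in this abstract, the sketch below shows how category probabilities are obtained under Samejima's Graded Response Model for one polytomous item. The discrimination and threshold values are hypothetical and are not estimates from the FTCD or HSI data.

```python
import math


def grm_category_probabilities(theta, a, thresholds):
    """Category probabilities for one item under the Graded Response Model.

    theta: respondent's latent trait level
    a: item discrimination
    thresholds: ordered category thresholds b_1 < b_2 < ... < b_{K-1}
    Returns a list of K probabilities, one per response category.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (above the top).
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical four-category item evaluated at an average trait level.
probs = grm_category_probabilities(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

    The probabilities sum to 1 by construction, and with ordered thresholds each one is non-negative.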

  18. Overcoming the effects of differential skewness of test items in scale construction

    Directory of Open Access Journals (Sweden)

    Johann M. Schepers

    2004-10-01

    The principal objective of the study was to develop a procedure for overcoming the effects of differential skewness of test items in scale construction. It was shown that the degree of skewness of test items places an upper limit on the correlations between the items, regardless of the contents of the items. If the items are ordered in terms of skewness, the resulting intercorrelation matrix forms a simplex or a pseudo-simplex. Factoring such a matrix results in a multiplicity of factors, most of which are artifacts. A procedure for overcoming this problem was demonstrated with items from the Locus of Control Inventory (Schepers, 1995). The analysis was based on a sample of 1662 first year university students.
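    The ceiling that skewness places on inter-item correlations can be illustrated in the simplest case of two dichotomous items, where the phi coefficient cannot exceed a bound set by the two endorsement proportions. The sketch below covers only this well-known special case; it is not the procedure developed in the study, which the abstract does not describe, and the function name is hypothetical.

```python
import math


def max_phi(p1, p2):
    """Largest attainable phi coefficient between two dichotomous items.

    p1, p2: proportions of respondents endorsing each item (0 < p < 1).
    The bound is reached when everyone who endorses the less popular
    item also endorses the more popular one.
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("proportions must lie strictly between 0 and 1")
    low, high = sorted((p1, p2))
    return math.sqrt((low * (1.0 - high)) / (high * (1.0 - low)))


# Equally skewed items can correlate perfectly ...
bound_equal = max_phi(0.5, 0.5)
# ... but items skewed in opposite directions cannot, whatever their content.
bound_opposite = max_phi(0.1, 0.9)
```

    Here `bound_equal` is 1.0 while `bound_opposite` is roughly 0.11, which shows how differences in skewness alone can depress correlations and produce spurious factors.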

  19. Converging evidence for control of color-word Stroop interference at the item level.

    Science.gov (United States)

    Bugg, Julie M; Hutchison, Keith A

    2013-04-01

    Prior studies have shown that cognitive control is implemented at the list and context levels in the color-word Stroop task. At first blush, the finding that Stroop interference is reduced for mostly incongruent items as compared with mostly congruent items (i.e., the item-specific proportion congruence [ISPC] effect) appears to provide evidence for yet a third level of control, which modulates word reading at the item level. However, evidence to date favors the view that ISPC effects reflect the rapid prediction of high-contingency responses and not item-specific control. In Experiment 1, we first show that an ISPC effect is obtained when the relevant dimension (i.e., color) signals proportion congruency, a problematic pattern for theories based on differential response contingencies. In Experiment 2, we replicate and extend this pattern by showing that item-specific control settings transfer to new stimuli, ruling out alternative frequency-based accounts. In Experiment 3, we revert to the traditional design in which the irrelevant dimension (i.e., word) signals proportion congruency. Evidence for item-specific control, including transfer of the ISPC effect to new stimuli, is apparent when 4-item sets are employed but not when 2-item sets are employed. We attribute this pattern to the absence of high-contingency responses on incongruent trials in the 4-item set. These novel findings provide converging evidence for reactive control of color-word Stroop interference at the item level, reveal theoretically important factors that modulate reliance on item-specific control versus contingency learning, and suggest an update to the item-specific control account (Bugg, Jacoby, & Chanani, 2011).

  20. Concreteness effects in short-term memory: a test of the item-order hypothesis.

    Science.gov (United States)

    Roche, Jaclynn; Tolan, G Anne; Tehan, Gerald

    2011-12-01

    The following experiments explore word length and concreteness effects in short-term memory within an item-order processing framework. This framework asserts that order memory is better for those items that are relatively easy to process at the item level. However, words that are difficult to process benefit at the item level from the increased attention and resources applied to them. The prediction of the model is that differential item and order processing can be detected in episodic tasks that differ in the degree to which item or order memory is required by the task. The item-order account has been applied to the word length effect such that there is a short word advantage in serial recall but a long word advantage in item recognition. The current study considered the possibility that concreteness effects might be explained within the same framework. In two experiments, word length (Experiment 1) and concreteness (Experiment 2) are examined using forward serial recall, backward serial recall, and item recognition. The results for word length replicate previous studies showing the dissociation in item and order tasks. The same was not true for the concreteness effect. In all three tasks concrete words were better remembered than abstract words. The concreteness effect cannot be explained in terms of an item-order trade-off.