measurable items related: Topics by WorldWideScience.org

Sample records for measurable items related

Assessing errors related to characteristics of the items measured

International Nuclear Information System (INIS)

Liggett, W.

1980-01-01

Errors that are related to some intrinsic property of the items measured are often encountered in nuclear material accounting. An example is the error in nondestructive assay measurements caused by uncorrected matrix effects. Nuclear material accounting requires for each materials type one measurement method for which bounds on these errors can be determined. If such a method is available, a second method might be used to reduce costs or to improve precision. If the measurement error for the first method is longer-tailed than Gaussian, then precision might be improved by measuring all items by both methods. 8 refs
Differential item functioning magnitude and impact measures from item response theory models.

Science.gov (United States)

Kleinman, Marjorie; Teresi, Jeanne A

2016-01-01

Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages are described that provide magnitude and impact measures, and new software is presented that computes all of the available statistics conveniently in one program with explanations of their relationships to one another.
Constructing the 32-item Fitness-to-Drive Screening Measure.

Science.gov (United States)

Medhizadah, Shabnam; Classen, Sherrilene; Johnson, Andrew M

2018-04-01

The Fitness-to-Drive Screening Measure © (FTDS) enables proxies to identify at-risk older drivers via 54 driving-related items, but may be too lengthy for widespread uptake. We reduced the number of items in the FTDS and validated the shorter measure, using 200 caregiver responses. Exploratory factor analysis and classical test theory techniques were used to determine the most interpretable factor model and the minimum number of items to be used for predicting fitness to drive. The extent to which the shorter FTDS predicted the results of the 54-item FTDS was evaluated through correlational analysis. A three-factor model best represented the empirical data. Classical test theory techniques led to the development of the 32-item FTDS. The 32-item FTDS was highly correlated (r = .99, p = .05) with the FTDS. The 32-item FTDS may provide raters with a faster and more efficient way to identify at-risk older drivers.
[Instrument to measure adherence in hypertensive patients: contribution of Item Response Theory].

Science.gov (United States)

Rodrigues, Malvina Thaís Pacheco; Moreira, Thereza Maria Magalhaes; Vasconcelos, Alexandre Meira de; Andrade, Dalton Francisco de; Silva, Daniele Braz da; Barbetta, Pedro Alberto

2013-06-01

To analyze, by means of Item Response Theory, an instrument to measure adherence to treatment for hypertension. Analytical study with 406 hypertensive patients with associated complications seen in primary care in Fortaleza, CE, Northeastern Brazil, in 2011, using Item Response Theory. The stages were: dimensionality test, item calibration, data processing and scale creation, analyzed using the graded response model. The dimensionality of the instrument was studied by analyzing the polychoric correlation matrix and by full-information factor analysis. Multilog software was used to calibrate items and estimate the scores. Items relating to drug therapy are the most directly related to adherence, while those relating to drug-free therapy need to be reworked because they have less psychometric information and low discrimination. The independence of items, the small number of levels in the scale and low explained variance in the adjustment of the models show the main weaknesses of the instrument analyzed. Item Response Theory proved to be a relevant analysis technique because it evaluated respondents for adherence to treatment for hypertension, the level of difficulty of the items and their ability to discriminate between individuals with different levels of adherence, which generates a greater amount of information. The Item Response Theory analysis of the items shows that the instrument is limited in measuring adherence to hypertension treatment and needs adjustment. The proper formulation of the items is important in order to accurately measure the desired latent trait.
Improving measurement of injection drug risk behavior using item response theory.

Science.gov (United States)

Janulis, Patrick

2014-03-01

Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. This study explores item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior, and examines potential gender-based differential item functioning. Data are drawn from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior, and logistic regression was used to test for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.
Method of locating related items in a geometric space for data mining

Science.gov (United States)

Hendrickson, Bruce A.

1999-01-01

A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method is especially beneficial for communicating databases with many items, and with non-regular relationship patterns. Examples of such databases include databases containing items such as scientific papers or patents, related by citations or keywords. A computer system adapted for practice of the present invention can include a processor, a storage subsystem, a display device, and computer software to direct the location and display of the entities. The method comprises assigning numeric values as a measure of similarity between each pairing of items. A matrix is constructed, based on the numeric values. The eigenvectors and eigenvalues of the matrix are determined. Each item is located in the geometric space at coordinates determined from the eigenvectors and eigenvalues. Proper construction of the matrix and proper determination of coordinates from eigenvectors can ensure that distance between items in the geometric space is representative of the numeric value measure of the items' similarity.
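The eigenvector step described in this record can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented procedure: it assumes the matrix is the graph Laplacian of the pairwise similarity values and takes the eigenvectors with the smallest nonzero eigenvalues as coordinates, a common spectral-layout choice that the abstract does not confirm. The function names and the toy similarity matrix are hypothetical.

```python
import math

def laplacian(sim):
    """Graph Laplacian L = D - W built from a symmetric similarity matrix W."""
    n = len(sim)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                lap[i][j] = -sim[i][j]
                lap[i][i] += sim[i][j]
    return lap

def jacobi_eigen(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues (ascending) and eigenvectors of a symmetric matrix,
    computed with the cyclic Jacobi rotation method."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda i: a[i][i])
    values = [a[i][i] for i in order]
    vectors = [[v[k][i] for k in range(n)] for i in order]
    return values, vectors

def spectral_coordinates(sim, dims=2):
    """Place each item at coordinates read off the low-order eigenvectors."""
    _, vectors = jacobi_eigen(laplacian(sim))
    chosen = vectors[1:1 + dims]  # skip the constant eigenvector (eigenvalue ~ 0)
    return [tuple(vec[i] for vec in chosen) for i in range(len(sim))]

# Hypothetical data: items 0-2 and items 3-5 form two tightly related
# groups joined by one weak link between items 0 and 3.
SIMILARITY = [
    [0.0, 1.0, 1.0, 0.1, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]

if __name__ == "__main__":
    for item, xy in enumerate(spectral_coordinates(SIMILARITY)):
        print(item, xy)
```

In this toy case the first coordinate separates the two groups by sign, so strongly related items land near each other. Other matrix constructions or eigenvalue-based scalings of the coordinates would be equally consistent with the abstract.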
The measurement of cyberbullying: dimensional structure and relative item severity and discrimination.

Science.gov (United States)

Menesini, Ersilia; Nocentini, Annalaura; Calussi, Pamela

2011-05-01

In relation to a sample of 1,092 Italian adolescents (50.9% females), the present study aims to: (a) analyze the most parsimonious structure of the cyberbullying and cybervictimization construct in male and female Italian adolescents through confirmatory factor analysis; and (b) analyze the severity and the discrimination parameters of each act using the item response theory. Results showed that the structure of the cyberbullying scale for perpetrated and received behaviors in both genders could best be represented by a monodimensional model where each item lies on a continuum of severity of aggressive acts. For both genders, the less severe acts are silent/prank calls and insults on instant messaging, and the most severe acts are unpleasant pictures/photos on Web sites, phone pictures/photos/videos of intimate scenes, and phone pictures/photos/videos of violent scenes. The items nasty text messages, nasty or rude e-mails, insults on Web sites, insults in chatrooms, and insults on blogs range from moderate to high levels of severity. Regarding the discrimination level of the acts, several items emerged as good indicators at various levels of cyberbullying and cybervictimization severity, with the exception of silent/prank calls. Furthermore, gender specificities underlined that the visual items can be considered good indicators of severe cyberbullies and cybervictims only in males. This information can help in understanding better the nature of the phenomenon, its severity in a given population, and to plan more specific prevention and intervention strategies.
Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

Directory of Open Access Journals (Sweden)

JOSEPH P. EIMICKE

2009-06-01

The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.
The medial temporal lobes distinguish between within-item and item-context relations during autobiographical memory retrieval.

Science.gov (United States)

Sheldon, Signy; Levine, Brian

2015-12-01

During autobiographical memory retrieval, the medial temporal lobes (MTL) relate together multiple event elements, including object (within-item relations) and context (item-context relations) information, to create a cohesive memory. There is consistent support for a functional specialization within the MTL according to these relational processes, much of which comes from recognition memory experiments. In this study, we compared brain activation patterns associated with retrieving within-item relations (i.e., associating conceptual and sensory-perceptual object features) and item-context relations (i.e., spatial relations among objects) with respect to naturalistic autobiographical retrieval. We developed a novel paradigm that cued participants to retrieve information about past autobiographical events, non-episodic within-item relations, and non-episodic item-context relations with the perceptuomotor aspects of retrieval equated across these conditions. We used multivariate analysis techniques to extract common and distinct patterns of activity among these conditions within the MTL and across the whole brain, both in terms of spatial and temporal patterns of activity. The anterior MTL (perirhinal cortex and anterior hippocampus) was preferentially recruited for generating within-item relations later in retrieval whereas the posterior MTL (posterior parahippocampal cortex and posterior hippocampus) was preferentially recruited for generating item-context relations across the retrieval phase. These findings provide novel evidence for functional specialization within the MTL with respect to naturalistic memory retrieval. © 2015 Wiley Periodicals, Inc.
Developing an item bank to measure the coping strategies of people with hereditary retinal diseases.

Science.gov (United States)

Prem Senthil, Mallika; Khadka, Jyoti; De Roach, John; Lamey, Tina; McLaren, Terri; Campbell, Isabella; Fenwick, Eva K; Lamoureux, Ecosse L; Pesudovs, Konrad

2018-05-05

Our understanding of the coping strategies used by people with visual impairment to manage stress related to visual loss is limited. This study aims to develop a sophisticated coping instrument in the form of an item bank implemented via computerised adaptive testing (CAT) for hereditary retinal diseases. Items on coping were extracted from qualitative interviews with patients, supplemented by items from a literature review. A systematic multi-stage process of item refinement was carried out, followed by expert panel discussion and cognitive interviews. The final coping item bank had 30 items. Rasch analysis was used to assess the psychometric properties. A CAT simulation was carried out to estimate the average number of items required to gain precise measurement of hereditary retinal disease-related coping. One hundred eighty-nine participants answered the coping item bank (median age = 58 years). The coping scale demonstrated good precision and targeting. The standardised residual loadings for items revealed six items grouped together. Removal of the six items reduced the precision of the main coping scale and worsened the variance explained by the measure. Therefore, the six items were retained within the main scale. Our CAT simulation indicated that, on average, fewer than 10 items are required to gain a precise measurement of coping. This is the first study to develop a psychometrically robust coping instrument for hereditary retinal diseases. CAT simulation indicated that, on average, only four and nine items were required to gain measurement at moderate and high precision, respectively.
Method using a density field for locating related items for data mining

Science.gov (United States)

Wylie, Brian N.

2002-01-01

A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method makes use of numeric values as a measure of similarity between each pairing of items. The items are given initial coordinates in the space. An energy is then determined for each item from the item's distance and similarity to other items, and from the density of items assigned coordinates near the item. The distance and similarity component can act to draw items with high similarities close together, while the density component can act to force all items apart. If a terminal condition is not yet reached, then new coordinates can be determined for one or more items, and the energy determination repeated. The iteration can terminate, for example, when the total energy reaches a threshold, when each item's energy is below a threshold, after a certain amount of time or iterations.
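The iterative energy scheme in this record can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the patented method: the density term is approximated with a pairwise Gaussian kernel rather than an explicit density field, new coordinates are proposed by random jitter and kept only when the item's energy falls, and all function names, weights and data are hypothetical.

```python
import math
import random

def item_energy(i, coords, sim, density_weight=1.0, radius=1.0):
    """Energy of item i: a distance-and-similarity term plus a density term."""
    xi, yi = coords[i]
    energy = 0.0
    for j, (xj, yj) in enumerate(coords):
        if j == i:
            continue
        d2 = (xi - xj) ** 2 + (yi - yj) ** 2
        energy += sim[i][j] * d2  # draws similar items together
        energy += density_weight * math.exp(-d2 / radius ** 2)  # pushes crowded items apart
    return energy

def total_energy(coords, sim):
    """Sum of the per-item energies."""
    return sum(item_energy(i, coords, sim) for i in range(len(coords)))

def random_start(n, seed=0):
    """Initial coordinates drawn uniformly from the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]

def layout(sim, start, iterations=2000, step=0.5, energy_threshold=0.0, seed=1):
    """Repeatedly move one item; stop on an iteration cap or an energy threshold."""
    rng = random.Random(seed)
    coords = list(start)
    for _ in range(iterations):
        if total_energy(coords, sim) <= energy_threshold:
            break  # terminal condition: total energy reached the threshold
        i = rng.randrange(len(coords))
        old_position = coords[i]
        old_energy = item_energy(i, coords, sim)
        coords[i] = (old_position[0] + rng.uniform(-step, step),
                     old_position[1] + rng.uniform(-step, step))
        if item_energy(i, coords, sim) >= old_energy:
            coords[i] = old_position  # reject moves that do not lower the energy
    return coords

# Hypothetical similarities: items 0 and 1 are related, as are items 2 and 3.
SIMILARITY = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    start = random_start(len(SIMILARITY))
    final = layout(SIMILARITY, start)
    print("energy before:", total_energy(start, SIMILARITY))
    print("energy after: ", total_energy(final, SIMILARITY))
```

Because the pairwise terms are symmetric, lowering one item's energy also lowers the total, so the total energy never increases from one iteration to the next. A time limit or a per-item energy threshold, both mentioned in the abstract, could replace the stopping rule used here.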
Assessing Differential Item Functioning on the Test of Relational Reasoning

Directory of Open Access Journals (Sweden)

Denis Dumas

2018-03-01

The test of relational reasoning (TORR) is designed to assess the ability to identify complex patterns within visuospatial stimuli. The TORR is designed for use in school and university settings, and therefore, its measurement invariance across diverse groups is critical. In this investigation, a large sample, representative of a major university on key demographic variables, was collected, and the resulting data were analyzed using a multi-group, multidimensional item-response theory model-comparison procedure. No significant differential item functioning was found on any of the TORR items across any of the demographic groups of interest. This finding is interpreted as evidence of the cultural fairness of the TORR, and potential test-development choices that may have contributed to that cultural fairness are discussed.
Memory for Items and Relationships among Items Embedded in Realistic Scenes: Disproportionate Relational Memory Impairments in Amnesia

Science.gov (United States)

Hannula, Deborah E.; Tranel, Daniel; Allen, John S.; Kirchhoff, Brenda A.; Nickel, Allison E.; Cohen, Neal J.

2014-01-01

Objective: The objective of this study was to examine the dependence of item memory and relational memory on medial temporal lobe (MTL) structures. Patients with amnesia, who either had extensive MTL damage or damage that was relatively restricted to the hippocampus, were tested, as was a matched comparison group. Disproportionate relational memory impairments were predicted for both patient groups, and those with extensive MTL damage were also expected to have impaired item memory. Method: Participants studied scenes, and were tested with interleaved two-alternative forced-choice probe trials. Probe trials were either presented immediately after the corresponding study trial (lag 1), five trials later (lag 5), or nine trials later (lag 9) and consisted of the studied scene along with a manipulated version of that scene in which one item was replaced with a different exemplar (item memory test) or was moved to a new location (relational memory test). Participants were to identify the exact match of the studied scene. Results: As predicted, patients were disproportionately impaired on the test of relational memory. Item memory performance was marginally poorer among patients with extensive MTL damage, but both groups were impaired relative to matched comparison participants. Impaired performance was evident at all lags, including the shortest possible lag (lag 1). Conclusions: The results are consistent with the proposed role of the hippocampus in relational memory binding and representation, even at short delays, and suggest that the hippocampus may also contribute to successful item memory when items are embedded in complex scenes. PMID:25068665
An Item Bank to Measure Systems, Services, and Policies: Environmental Factors Affecting People With Disabilities.

Science.gov (United States)

Lai, Jin-Shei; Hammel, Joy; Jerousek, Sara; Goldsmith, Arielle; Miskovic, Ana; Baum, Carolyn; Wong, Alex W; Dashner, Jessica; Heinemann, Allen W

2016-12-01

To develop a measure of perceived systems, services, and policies facilitators (see Chapter 5 of the International Classification of Functioning, Disability and Health) for people with neurologic disabilities and to evaluate the effect of perceived systems, services, and policies facilitators on health-related quality of life. Qualitative approaches to develop and refine items. Confirmatory factor analysis including 1-factor confirmatory factor analysis and bifactor analysis to evaluate unidimensionality of items. Rasch analysis to identify misfitting items. Correlational and analysis of variance methods to evaluate construct validity. Community-dwelling individuals participated in telephone interviews or traveled to the academic medical centers where this research took place. Participants (N=571) had a diagnosis of spinal cord injury, stroke, or traumatic brain injury. They were 18 years or older and English speaking. Not applicable. An item bank to evaluate environmental access and support levels of services, systems, and policies for people with disabilities. We identified a general factor defined as "access and support levels of the services, systems, and policies at the level of community living" and 3 local factors defined as "health services," "community living," and "community resources." The systems, services, and policies measure correlated moderately with participation measures: Community Participation Indicators (CPI) - Involvement, CPI - Control over Participation, Quality of Life in Neurological Disorders - Ability to Participate, Quality of Life in Neurological Disorders - Satisfaction with Role Participation, Patient-Reported Outcomes Measurement Information System (PROMIS) Ability to Participate, PROMIS Satisfaction with Role Participation, and PROMIS Isolation. The measure of systems, services, and policies facilitators contains items pertaining to health services, community living, and community resources. Investigators and clinicians can measure
Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning

DEFF Research Database (Denmark)

Watt, Torquil; Grønvold, Mogens; Hegedüs, Laszlo

2014-01-01

To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis.
Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

Science.gov (United States)

Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

2014-01-01

Introduction: The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods: We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion: Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753
NHRIC (National Health Related Items Code)

Data.gov (United States)

U.S. Department of Health & Human Services — The National Health Related Items Code (NHRIC) is a system for identification and numbering of marketed device packages that is compatible with other numbering...
Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

Science.gov (United States)

Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

2016-03-12

Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.
Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification.

Directory of Open Access Journals (Sweden)

Alexander J Millner

Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single items. We found that 11.3% of participants who endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide.
An Effect Size Measure for Raju's Differential Functioning for Items and Tests

Science.gov (United States)

Wright, Keith D.; Oshima, T. C.

2015-01-01

This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

A Factor Analysis of Need-Fulfillment Items Designed to Measure Maslow Need Categories

Science.gov (United States)

Waters, L. K.; Roach, Darrell

1973-01-01

The purpose of the present study was to factor analyze a set of items frequently used to measure Maslow need categories to obtain further information on their structure in relation to the Maslow system. (Author)
Application of Item Response Theory to Tests of Substance-related Associative Memory

Science.gov (United States)

Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

2015-01-01

A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
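For readers unfamiliar with the 1PL and 2PL models named in this record, the standard logistic item response functions can be sketched as below. The item parameters, the grid-search ability estimate and the function names are illustrative assumptions and are not taken from the study, which would have used dedicated IRT software.

```python
import math

def irf_2pl(theta, a, b):
    """2PL model: probability of endorsing an item with discrimination a
    and difficulty b, for a respondent with latent trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def irf_1pl(theta, b):
    """1PL model: the 2PL curve with discrimination fixed (here at 1.0)."""
    return irf_2pl(theta, 1.0, b)

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern given (a, b) item parameters."""
    total = 0.0
    for y, (a, b) in zip(responses, items):
        p = irf_2pl(theta, a, b)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def estimate_theta(responses, items):
    """Crude maximum-likelihood trait estimate on a grid from -4 to 4."""
    grid = [-4.0 + 0.1 * k for k in range(81)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))

# Hypothetical (discrimination, difficulty) pairs for three items.
ITEMS = [(1.2, -1.0), (0.8, 0.0), (1.5, 1.0)]

if __name__ == "__main__":
    print(irf_2pl(0.0, 1.2, -1.0))          # endorsement probability for one item
    print(estimate_theta([1, 1, 0], ITEMS))  # trait estimate for one response pattern
```

Under either model the endorsement probability is exactly 0.5 when the trait level equals the item difficulty. The 2PL model additionally lets each item's curve have its own slope, which is what allows items to be compared on discrimination.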
Factorial Structure and Age-Related Psychometrics of the MIDUS Personality Adjective Items across the Lifespan

Science.gov (United States)

Zimprich, Daniel; Allemand, Mathias; Lachman, Margie E.

2014-01-01

The present study addresses issues of measurement invariance and comparability of factor parameters of Big Five personality adjective items across age. Data from the Midlife in the United States (MIDUS) survey were used to investigate age-related developmental psychometrics of the MIDUS personality adjective items in two large cross-sectional samples (exploratory sample: N = 862; analysis sample: N = 3,000). After having established and replicated a comprehensive five-factor structure of the measure, increasing levels of measurement invariance were tested across ten age groups. Results indicate that the measure demonstrates strict measurement invariance in terms of number of factors and factor loadings. Also, we found that factor variances and covariances were equal across age groups. By contrast, a number of age-related factor mean differences emerged. The practical implications of these results are discussed and future research is suggested. PMID:21910548
Development of an item bank for computerized adaptive test (CAT) measurement of pain

DEFF Research Database (Denmark)

Petersen, Morten Aa.; Aaronson, Neil K; Chie, Wei-Chu

2016-01-01

PURPOSE: Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured...... were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements with 15-25 % compared to using the QLQ-C30 pain scale....... CONCLUSIONS: We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ...
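
The adaptive step of a CAT is often implemented by administering the unused item that is most informative at the current score estimate. The sketch below shows that selection rule for dichotomous 2PL items. This is a simplifying assumption, since the record does not say which IRT model or selection rule was used, and the item ids and parameters are made up.

```python
import math


def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def next_item(theta, bank, administered):
    """Return the id of the not-yet-administered item with the
    highest information at the current estimate theta."""
    candidates = {k: v for k, v in bank.items() if k not in administered}
    return max(candidates, key=lambda k: info_2pl(theta, *candidates[k]))


# Hypothetical item bank: id -> (discrimination a, difficulty b)
bank = {"p1": (1.2, -1.0), "p2": (1.8, 0.0), "p3": (1.0, 1.5)}
```

Calling `next_item(0.0, bank, set())` picks `"p2"` here, because that item is both the most discriminating and located at the current estimate.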
Measuring participation in patients with chronic back pain-the 5-Item Pain Disability Index.

Science.gov (United States)

McKillop, Ashley B; Carroll, Linda J; Dick, Bruce D; Battié, Michele C

2018-02-01

Of the three broad outcome domains of body functions and structures, activities, and participation (eg, engaging in valued social roles) outlined in the World Health Organization's (WHO) International Classification of Functioning, Disability and Health (ICF), it has been argued that participation is the most important to individuals, particularly those with chronic health problems. Yet, participation is not commonly measured in back pain research. The aim of this study was to investigate the construct validity of a modified 5-Item Pain Disability Index (PDI) score as a measure of participation in people with chronic back pain. A validation study was conducted using cross-sectional data. Participants with chronic back pain were recruited from a multidisciplinary pain center in Alberta, Canada. The outcome measure of interest is the 5-Item PDI. Each study participant was given a questionnaire package containing measures of participation, resilience, anxiety and depression, pain intensity, and pain-related disability, in addition to the PDI. The first five items of the PDI deal with social roles involving family responsibilities, recreation, social activities with friends, work, and sexual behavior, and comprised the 5-Item PDI seeking to measure participation. The last two items of the PDI deal with self-care and life support functions and were excluded. Construct validity of the 5-Item PDI as a measure of participation was examined using Pearson correlations or point-biserial correlations to test each hypothesized association. Participants were 70 people with chronic back pain and a mean age of 48.1 years. Forty-four (62.9%) were women. As hypothesized, the 5-Item PDI was associated with all measures of participation, including the Participation Assessment with Recombined Tools-Objective (r=-0.61), Late-Life Function and Disability Instrument: Disability Component (frequency: r=-0.66; limitation: r=-0.65), Work and Social Adjustment Scale (r=0.85), a global
Thorndike, Thurstone and Rasch: A Comparison of Their Approaches to Item-Invariant Measurement.

Science.gov (United States)

Engelhard, George, Jr.

The methods used by E. L. Thorndike, L. L. Thurstone, and G. Rasch to address issues related to item-invariant measurement and the scoring of individual performance are compared. The analyses highlight the close connection among the three methods, and suggest that progress in measurement theory reflects the movement from essentially ad hoc methods…
Assessing the validity of single-item life satisfaction measures: results from three large samples.

Science.gov (United States)

Cheung, Felix; Lucas, Richard E

2014-12-01

The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System and a representative German sample (N = 1,312) recruited by the Germany Socio-Economic Panel were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042). Single-item life satisfaction measures performed very similarly compared to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.
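
The abstract reports both zero-order and disattenuated correlations. A minimal sketch of the classical correction for attenuation follows, assuming the usual formula r / sqrt(rel_x * rel_y). The reliabilities in the usage note are hypothetical, because the abstract does not report the values used.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures.

    r_xy  : observed (zero-order) correlation
    rel_x : reliability of measure x (0 < rel_x <= 1)
    rel_y : reliability of measure y (0 < rel_y <= 1)
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)
```

With an observed correlation of 0.6 and hypothetical reliabilities of 0.8 for both measures, `disattenuate(0.6, 0.8, 0.8)` is about 0.75. The corrected value is never smaller in magnitude than the observed one.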
A signal detection-item response theory model for evaluating neuropsychological measures.

Science.gov (United States)

Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

2018-02-05

Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory-which permits the modeling of item difficulty and examinee ability-and from signal detection theory-which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the
Assessing the Validity of Single-item Life Satisfaction Measures: Results from Three Large Samples

Science.gov (United States)

Cheung, Felix; Lucas, Richard E.

2014-01-01

Purpose: The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Methods: Two large samples from Washington (N=13,064) and Oregon (N=2,277) recruited by the Behavioral Risk Factor Surveillance System (BRFSS) and a representative German sample (N=1,312) recruited by the Germany Socio-Economic Panel (GSOEP) were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Results: Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62 – 0.64; disattenuated r = 0.78 – 0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001 – 0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015 – 0.042). Conclusions: Single-item life satisfaction measures performed very similarly compared to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use. PMID:24890827
Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function.

Science.gov (United States)

Liegl, Gregor; Gandek, Barbara; Fischer, H Felix; Bjorner, Jakob B; Ware, John E; Rose, Matthias; Fries, James F; Nolte, Sandra

2017-03-21

Physical function (PF) is a core patient-reported outcome domain in clinical trials in rheumatic diseases. Frequently used PF measures have ceiling effects, leading to large sample size requirements and low sensitivity to change. In most of these instruments, the response category that indicates the highest PF level is the statement that one is able to perform a given physical activity without any limitations or difficulty. This study investigates whether using an item format with an extended response scale, allowing respondents to state that the performance of an activity is easy or very easy, increases the range of precise measurement of self-reported PF. Three five-item PF short forms were constructed from the Patient-Reported Outcomes Measurement Information System (PROMIS®) wave 1 data. All forms included the same physical activities but varied in item stem and response scale: format A ("Are you able to …"; "without any difficulty"/"unable to do"); format B ("Does your health now limit you …"; "not at all"/"cannot do"); format C ("How difficult is it for you to …"; "very easy"/"impossible"). Each short-form item was answered by 2217-2835 subjects. We evaluated unidimensionality and estimated a graded response model for the 15 short-form items and remaining 119 items of the PROMIS PF bank to compare item and test information for the short forms along the PF continuum. We then used simulated data for five groups with different PF levels to illustrate differences in scoring precision between the short forms using different item formats. Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side of the PF continuum of the sample, provided more item information, and was more useful in distinguishing known groups with above-average functioning. Using an item format with an extended
The measurement of tritium in Canadian food items

International Nuclear Information System (INIS)

Brown, R.M.

1995-03-01

Food items locally grown near Perth, Ontario and grocery store produce and locally grown items from the Pickering-Ajax area in the vicinity of the Pickering Nuclear Generating Station (PNGS) have been analyzed for free water tritium (HTO) and organically bound tritium (OBT). The technique of measuring ³He ingrowth in samples by mass spectrometry has been used because of its sensitivity and freedom from opportunity for contamination during processing and measurement. Concentrations observed at each site were of the order expected on the basis of known levels of tritium in the local atmosphere and precipitation. There was considerable variation between different materials and limited correlation between materials of a single type. (author). 10 refs., 8 tabs., 4 figs
A measure of satisfaction with food-related life

DEFF Research Database (Denmark)

Grunert, Klaus G.; Dean, Moira; Raats, Monique M.

2007-01-01

A measure of satisfaction with food-related life is developed and tested in three studies in eight European countries. Five items are retained from an original pool of seven; these items exhibit good reliability as measured by Cronbach's alpha, good temporal stability, convergent validity with two...... related measures, and construct validity as indicated by relationships with other indicators of quality of life, including the Satisfaction With Life and the SF-8 scales. It is concluded that this scale will be useful in studies trying to identify factors contributing to satisfaction with food...
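
Cronbach's alpha, the reliability index cited in this record, can be computed directly from a respondents-by-items score matrix. The sketch below uses the standard formula with sample variances. The function name and the data in the usage note are invented for illustration and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

For the invented scores `[[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]` the items rise and fall together perfectly, so alpha comes out at about 1. Real scale data would normally give a lower value.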
Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

Science.gov (United States)

Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

2015-12-01

To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.
Reduced-Item Food Audits Based on the Nutrition Environment Measures Surveys.

Science.gov (United States)

Partington, Susan N; Menzies, Tim J; Colburn, Trina A; Saelens, Brian E; Glanz, Karen

2015-10-01

The community food environment may contribute to obesity by influencing food choice. Store and restaurant audits are increasingly common methods for assessing food environments, but are time consuming and costly. A valid, reliable brief measurement tool is needed. The purpose of this study was to develop and validate reduced-item food environment audit tools for stores and restaurants. Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed in 820 stores and 1,795 restaurants in West Virginia, San Diego, and Seattle. Data mining techniques (correlation-based feature selection and linear regression) were used to identify survey items highly correlated to total survey scores and produce reduced-item audit tools that were subsequently validated against full NEMS surveys. Regression coefficients were used as weights that were applied to reduced-item tool items to generate comparable scores to full NEMS surveys. Data were collected and analyzed in 2008-2013. The reduced-item tools included eight items for grocery, ten for convenience, seven for variety, and five for other stores; and 16 items for sit-down, 14 for fast casual, 19 for fast food, and 13 for specialty restaurants-10% of the full NEMS-S and 25% of the full NEMS-R. There were no significant differences in median scores for varying types of retail food outlets when compared to the full survey scores. Median in-store audit time was reduced 25%-50%. Reduced-item audit tools can reduce the burden and complexity of large-scale or repeated assessments of the retail food environment without compromising measurement quality. Copyright © 2015 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

Science.gov (United States)

Suh, Youngsuk

2016-01-01

This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

Science.gov (United States)

Shou, Yiyun; Sellbom, Martin; Xu, Jing

2018-05-01

There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and the U.S. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Relational and item-specific influences on generate-recognize processes in recall.

Science.gov (United States)

Guynn, Melissa J; McDaniel, Mark A; Strosser, Garrett L; Ramirez, Juan M; Castleberry, Erica H; Arnett, Kristen H

2014-02-01

The generate-recognize model and the relational-item-specific distinction are two approaches to explaining recall. In this study, we consider the two approaches in concert. Following Jacoby and Hollingshead (Journal of Memory and Language 29:433-454, 1990), we implemented a production task and a recognition task following production (1) to evaluate whether generation and recognition components were evident in cued recall and (2) to gauge the effects of relational and item-specific processing on these components. An encoding task designed to augment item-specific processing (anagram-transposition) produced a benefit on the recognition component (Experiments 1-3) but no significant benefit on the generation component (Experiments 1-3), in the context of a significant benefit to cued recall. By contrast, an encoding task designed to augment relational processing (category-sorting) did produce a benefit on the generation component (Experiment 3). These results converge on the idea that in recall, item-specific processing impacts a recognition component, whereas relational processing impacts a generation component.
Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

Science.gov (United States)

Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

2018-03-12

Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.
Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

Science.gov (United States)

Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

2015-05-01

To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank. A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.
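
The slopes and thresholds mentioned above are the parameters of the Graded Response Model. A minimal sketch of how they map to response-category probabilities follows, assuming the standard logistic form of the model. The parameter values in the usage note are hypothetical and are not the calibrated SCI-QOL values.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one graded-response item.

    thresholds must be strictly increasing; an item with m thresholds
    has m + 1 ordered response categories.
    """
    if any(t2 <= t1 for t1, t2 in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")

    def p_at_or_above(b):
        # Probability of responding in or above the category bounded by b.
        return 1.0 / (1.0 + math.exp(-slope * (theta - b)))

    cumulative = [1.0] + [p_at_or_above(b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
```

Calling `grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])` returns four probabilities that sum to 1 and are symmetric around the middle, because the thresholds are symmetric around the trait level.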
Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

Science.gov (United States)

Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

2014-05-01

The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.

Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning.

Science.gov (United States)

Watt, Torquil; Groenvold, Mogens; Hegedüs, Laszlo; Bonnema, Steen Joop; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

2014-02-01

To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis. A total of 838 patients with benign thyroid diseases completed the ThyPRO questionnaire (84 five-point items, 13 scales). Uniform and nonuniform DIF were investigated using ordinal logistic regression, testing for both statistical significance and magnitude (∆R(2) > 0.02). Scale level was estimated by the sum score, after purification. Twenty instances of DIF in 17 of the 84 items were found. Eight according to diagnosis, where the goiter scale was the one most affected, possibly due to differing perceptions in patients with auto-immune thyroid diseases compared to patients with simple goiter. Eight DIFs according to age were found, of which 5 were in positively worded items, which younger patients were more likely to endorse; one according to gender: women were more likely to report crying, and three according to educational level. The vast majority of DIF had only minor influence on the scale scores (0.1-2.3 points on the 0-100 scales), but two DIF corresponded to a difference of 4.6 and 9.8, respectively. Ordinal logistic regression identified DIF in 17 of 84 items. The potential impact of this on the present scales was low, but items displaying DIF could be avoided when developing abbreviated scales, where the potential impact of DIF (due to fewer items) will be larger.
Cross-National Prevalence of Traditional Bullying, Traditional Victimization, Cyberbullying and Cyber-Victimization: Comparing Single-Item and Multiple-Item Approaches of Measurement

Science.gov (United States)

Yanagida, Takuya; Gradinger, Petra; Strohmeier, Dagmar; Solomontos-Kountouri, Olga; Trip, Simona; Bora, Carmen

2016-01-01

Many large-scale cross-national studies rely on a single-item measurement when comparing prevalence rates of traditional bullying, traditional victimization, cyberbullying, and cyber-victimization between countries. However, the reliability and validity of single-item measurement approaches are highly problematic and might be biased. Data from…
More is not Always Better: The Relation between Item Response and Item Response Time in Raven’s Matrices

Directory of Open Access Journals (Sweden)

Frank Goldhammer

2015-03-01

The role of response time in completing an item can have very different interpretations. Responding more slowly could be positively related to success as the item is answered more carefully. However, the association may be negative if working faster indicates higher ability. The objective of this study was to clarify the validity of each assumption for reasoning items considering the mode of processing. A total of 230 persons completed a computerized version of Raven’s Advanced Progressive Matrices test. Results revealed that response time overall had a negative effect. However, this effect was moderated by items and persons. For easy items and able persons the effect was strongly negative, for difficult items and less able persons it was less negative or even positive. The number of rules involved in a matrix problem proved to explain item difficulty significantly. Most importantly, a positive interaction effect between the number of rules and item response time indicated that the response time effect became less negative with an increasing number of rules. Moreover, exploratory analyses suggested that the error type influenced the response time effect.
Measurement equivalence and differential item functioning in family psychology.

Science.gov (United States)

Bingenheimer, Jeffrey B; Raudenbush, Stephen W; Leventhal, Tama; Brooks-Gunn, Jeanne

2005-09-01

Several hypotheses in family psychology involve comparisons of sociocultural groups. Yet the potential for cross-cultural inequivalence in widely used psychological measurement instruments threatens the validity of inferences about group differences. Methods for dealing with these issues have been developed via the framework of item response theory. These methods deal with an important type of measurement inequivalence, called differential item functioning (DIF). The authors introduce DIF analytic methods, linking them to a well-established framework for conceptualizing cross-cultural measurement equivalence in psychology (C.H. Hui and H.C. Triandis, 1985). They illustrate the use of DIF methods using data from the Project on Human Development in Chicago Neighborhoods (PHDCN). Focusing on the Caregiver Warmth and Environmental Organization scales from the PHDCN's adaptation of the Home Observation for Measurement of the Environment Inventory, the authors obtain results that exemplify the range of outcomes that may result when these methods are applied to psychological measurement instruments. (c) 2005 APA, all rights reserved
Refinement of the Brazilian Household Food Insecurity Measurement Scale: Recommendation for a 14-item EBIA

Directory of Open Access Journals (Sweden)

Ana Maria Segall-Corrêa

2014-04-01

OBJECTIVE: To review and refine Brazilian Household Food Insecurity Measurement Scale structure. METHODS: The study analyzed the impact of removing the item "adult lost weight" and one of two possibly redundant items on Brazilian Household Food Insecurity Measurement Scale psychometric behavior using the one-parameter logistic (Rasch) model. Brazilian Household Food Insecurity Measurement Scale psychometric behavior was analyzed with respect to acceptable adjustment values ranging from 0.7 to 1.3, and to severity scores of the items with theoretically expected gradients. The socioeconomic and food security indicators came from the 2004 National Household Sample Survey, which obtained complete answers to Brazilian Household Food Insecurity Measurement Scale items from 112,665 households. RESULTS: Removing the items "adult reduced amount..." followed by "adult ate less..." did not change the infit of the remaining items, except for "adult lost weight", whose infit increased from 1.21 to 1.56. The internal consistency and item severity scores did not change when "adult ate less" and one of the two redundant items were removed. CONCLUSION: Brazilian Household Food Insecurity Measurement Scale reanalysis reduced the number of scale items from 16 to 14 without changing its internal validity. Its use as a nationwide household food security measure is strongly recommended.
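
The infit statistic and the 0.7 to 1.3 acceptance range mentioned in this abstract can be sketched as follows. This assumes the usual information-weighted mean-square definition for a dichotomous Rasch item and takes person measures and item severity as already estimated. The function names and the toy data in the usage note are assumptions of the sketch, not the study's procedure.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of an affirmative response for a person/household
    with measure theta on an item with severity b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def infit_mnsq(responses, thetas, b):
    """Infit mean square: squared residuals summed over respondents,
    divided by the summed model variances p * (1 - p)."""
    num = sum((x - rasch_p(t, b)) ** 2 for x, t in zip(responses, thetas))
    den = sum(rasch_p(t, b) * (1.0 - rasch_p(t, b)) for t in thetas)
    return num / den


def within_range(infit, low=0.7, high=1.3):
    """Check an infit value against the 0.7-1.3 range quoted in the abstract."""
    return low <= infit <= high
```

Under this range, `within_range(1.21)` is true and `within_range(1.56)` is false, which matches the abstract's reading of the "adult lost weight" item before and after the other items were removed.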
Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function

DEFF Research Database (Denmark)

Liegl, Gregor; Gandek, Barbara; Fischer, H. Felix

2017-01-01

precision between the short forms using different item formats. Results: Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side...
Memory deficit in patients with schizophrenia and posttraumatic stress disorder: relational vs item-specific memory

Directory of Open Access Journals (Sweden)

Jung W

2016-05-01

Wookyoung Jung (1), Seung-Hwan Lee (1, 2). (1) Clinical Emotions and Cognition Research Laboratory, Department of Psychiatry, Inje University, Ilsan-Paik Hospital; (2) Department of Psychiatry, Inje University, Ilsan-Paik Hospital, Goyang, Korea. Abstract: It has been well established that patients with schizophrenia have impairments in cognitive functioning and also that patients who experienced traumatic events suffer from cognitive deficits. Of the cognitive deficits revealed in schizophrenia or posttraumatic stress disorder (PTSD) patients, the current article provides a brief review of deficit in episodic memory, which is highly predictive of patients’ quality of life and global functioning. In particular, we have focused on studies that compared relational and item-specific memory performance in schizophrenia and PTSD, because measures of relational and item-specific memory are considered the most promising constructs for immediate tangible development of clinical trial paradigm. The behavioral findings of schizophrenia are based on the tasks developed by the Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia (CNTRICS) initiative and the Cognitive Neuroscience Test Reliability and Clinical Applications for Schizophrenia (CNTRACS) Consortium. The findings we reviewed consistently showed that schizophrenia and PTSD are closely associated with more severe impairments in relational memory compared to item-specific memory. Candidate brain regions involved in relational memory impairment in schizophrenia and PTSD are also discussed. Keywords: schizophrenia, posttraumatic stress disorder, episodic memory deficit, relational memory, item-specific memory, prefrontal cortex, hippocampus
Radioactivity measurement in imported food and food related items

International Nuclear Information System (INIS)

Sombrito, E.Z.; Santos, F.L.; Rosa, A.M. de la; Tangonan, M.C.; Bulos, A.D.; Nuguid, Z.F.

1989-01-01

The Philippine Nuclear Research Institute (PNRI), formerly the Philippine Atomic Energy Commission (PAEC), undertook the radioactivity monitoring of imported food and food-related products after the Chernobyl plant accident in April 1986. Food samples were analyzed for 137Cs and 134Cs by gamma spectral analysis. This report deals with the measurement process and gives the results of the activity covering the period June 1986 to December 1987. (Auth.). 9 tabs., 7 figs., 4 refs
Separating relational from item load effects in paired recognition: temporoparietal and middle frontal gyral activity with increased associates, but not items during encoding and retention.

Science.gov (United States)

Phillips, Steven; Niki, Kazuhisa

2002-10-01

Working memory is affected by items stored and the relations between them. However, separating these factors has been difficult, because increased items usually accompany increased associations/relations. Hence, some have argued, relational effects are reducible to item effects. We overcome this problem by manipulating index length: the fewest number of item positions at which there is a unique item, or tuple of items (if length >1), for every instance in the relational (memory) set. Longer indexes imply greater similarity (number of shared items) between instances and higher load on encoding processes. Subjects were given lists of study pairs and asked to make a recognition judgement. The number of unique items and index length in the three list conditions were: (1) AB, CD: four/one; (2) AB, CD, EF: six/one; and (3) AB, AD, CB: four/two, respectively. Japanese letters were used in Experiments 1 (kanji-ideograms) and 2 (hiragana-phonograms); numbers in Experiment 3; and shapes generated from Fourier descriptors in Experiment 4. Across all materials, right dominant temporoparietal and middle frontal gyral activity was found with increased index length, but not items during study. In Experiment 5, a longer delay was used to isolate retention effects in the absence of visual stimuli. Increased left hemispheric activity was observed in the precuneus, middle frontal gyrus, and superior temporal gyrus with increased index length for the delay period. These results show that relational load is not reducible to item load.
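The definition of index length given in this abstract can be made concrete with a short sketch. It assumes one reading of that definition, namely the smallest number of item positions whose contents jointly identify every instance in the set; the function names are hypothetical and the code is not taken from the paper.

```python
from itertools import combinations

def index_length(instances):
    """Smallest number of positions whose items jointly identify every instance."""
    n_positions = len(instances[0])
    for k in range(1, n_positions + 1):
        for positions in combinations(range(n_positions), k):
            # Project each instance onto the chosen positions.
            keys = {tuple(inst[p] for p in positions) for inst in instances}
            if len(keys) == len(instances):
                return k
    return None  # duplicated instances cannot be told apart

def unique_items(instances):
    """Number of distinct items across all instances."""
    return len({item for inst in instances for item in inst})

# The three list conditions described in the abstract.
conditions = {
    "AB, CD": ["AB", "CD"],
    "AB, CD, EF": ["AB", "CD", "EF"],
    "AB, AD, CB": ["AB", "AD", "CB"],
}
for name, pairs in conditions.items():
    print(name, unique_items(pairs), index_length(pairs))
```

Under this reading the three conditions give four/one, six/one, and four/two for unique items and index length, which matches the values stated in the abstract.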
A single-item global job satisfaction measure is associated with quantitative blood immune indices in white-collar employees.

Science.gov (United States)

Nakata, Akinori; Irie, Masahiro; Takahashi, Masaya

2013-01-01

Although a single-item job satisfaction measure has been shown to be as reliable and inclusive as multiple-item scales in relation to health, studies including immunological data are few. The purpose of this study was to evaluate the validity of single-item job and family life satisfaction based on its association with immune indices. A total of 189 white-collar employees (70% men) underwent a blood draw for the measurement of natural killer (NK), total T, and B cell counts as well as plasma immunoglobulin (Ig) G concentrations and completed single-item job and family life satisfaction measures, respectively. The response options for satisfaction measures were 'dissatisfied' (coded 1) to 'satisfied' (coded 4). Spearman's partial correlations controlling for cofactors revealed that increased job satisfaction was positively associated with NK cells (rsp=0.201, p=0.007) and IgG (rsp=0.178, p=0.018), while family life satisfaction was unrelated to immune indices. Those who reported a combination of low job/low family life satisfaction had significantly lower NK and higher B cell counts than those with a high job/high family life satisfaction. Our study suggests that the single-item summary measure of job satisfaction, but not family life satisfaction, may be a valid tool to evaluate immune status in healthy white-collar employees.
Using Item Response Theory to Develop Measures of Acquisitive and Protective Self-Monitoring From the Original Self-Monitoring Scale.

Science.gov (United States)

Wilmot, Michael P; Kostal, Jack W; Stillwell, David; Kosinski, Michal

2017-07-01

For the past 40 years, the conventional univariate model of self-monitoring has reigned as the dominant interpretative paradigm in the literature. However, recent findings associated with an alternative bivariate model challenge the conventional paradigm. In this study, item response theory is used to develop measures of the bivariate model of acquisitive and protective self-monitoring using original Self-Monitoring Scale (SMS) items, and data from two large, nonstudent samples (Ns = 13,563 and 709). Results indicate that the new acquisitive (six-item) and protective (seven-item) self-monitoring scales are reliable, unbiased in terms of gender and age, and demonstrate theoretically consistent relations to measures of personality traits and cognitive ability. Additionally, by virtue of using original SMS items, previously collected responses can be reanalyzed in accordance with the alternative bivariate model. Recommendations for the reanalysis of archival SMS data, as well as directions for future research, are provided.
The effect of the spatial positioning of items on the reliability of questionnaires measuring affect

Directory of Open Access Journals (Sweden)

Leigh Leo

2016-08-01

Orientation: Extant research has shown that the relationship between spatial location and affect may have pervasive effects on evaluation. In particular, experimental findings on embodied cognition indicate that a person is spatially orientated to position what is positive at the top and what is negative at the bottom (vertical spatial orientation), and to a lesser extent, to position what is positive on the left and what is negative on the right (horizontal spatial orientation). It is therefore hypothesised that when there is congruence between a respondent’s spatial orientation (related to affect) and the spatial positioning (layout) of a questionnaire, the reliability will be higher than in the case of incongruence. Research purpose: The principal objective of the two studies reported here was to ascertain the extent to which congruence between a respondent’s spatial orientation (related to affect) and the layout of the questionnaire (spatial positioning of questionnaire items) may impact on the reliability of a questionnaire measuring affect. Motivation for the study: The spatial position of items on a questionnaire measuring affect may indirectly impact on the reliability of the questionnaire. Research approach, design and method: In both studies, a controlled experimental research design was conducted using a sample of university students (n = 1825). Major findings: In both experiments, evidence was found to support the hypothesis that greater congruence between a respondent’s spatial orientation (related to affect) and the spatial positioning (layout) of a questionnaire leads to higher reliability on a questionnaire measuring affect. Practical implications: These findings may serve to create awareness of the influence of the spatial positioning of items as a confounding variable in questionnaire design. Contribution/value-add: Overall, this research complements previous studies by confirming the metaphorical representation of affect and
A more general model for testing measurement invariance and differential item functioning.

Science.gov (United States)

Bauer, Daniel J

2017-09-01

The evaluation of measurement invariance is an important step in establishing the validity and comparability of measurements across individuals. Most commonly, measurement invariance has been examined using one of two primary latent variable modeling approaches: the multiple groups model or the multiple-indicator multiple-cause (MIMIC) model. Both approaches offer opportunities to detect differential item functioning within multi-item scales, and thereby to test measurement invariance, but both approaches also have significant limitations. The multiple groups model allows one to examine the invariance of all model parameters but only across levels of a single categorical individual difference variable (e.g., ethnicity). In contrast, the MIMIC model permits both categorical and continuous individual difference variables (e.g., sex and age) but permits only a subset of the model parameters to vary as a function of these characteristics. The current article argues that moderated nonlinear factor analysis (MNLFA) constitutes an alternative, more flexible model for evaluating measurement invariance and differential item functioning. We show that the MNLFA subsumes and combines the strengths of the multiple group and MIMIC models, allowing for a full and simultaneous assessment of measurement invariance and differential item functioning across multiple categorical and/or continuous individual difference variables. The relationships between the MNLFA model and the multiple groups and MIMIC models are shown mathematically and via an empirical demonstration. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

Science.gov (United States)

Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

2015-08-19

Four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implied a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. Graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30% of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms
Validation of the PROMIS Sleep Disturbance and Sleep-Related Impairment item banks in Dutch adolescents.

Science.gov (United States)

van Kooten, Jojanneke A M C; van Litsenburg, Raphaële R L; Yoder, Whitney R; Kaspers, Gertjan J L; Terwee, Caroline B

2018-04-16

Sleep problems are common in adolescents and have a negative impact on daytime functioning. However, there is a lack of well-validated adolescent sleep questionnaires. The Patient-Reported Outcomes Measurement Information System (PROMIS) Sleep Disturbance and Sleep-Related Impairment item banks are well-validated instruments developed for and tested in adults. The aim of this study was to evaluate their structural validity in adolescents. Test and retest data were collected for the Dutch-Flemish V1.0 PROMIS Sleep Disturbance (27 items) and Sleep-Related Impairment (16 items) item banks from 1046 adolescents (11-19 years). Cross-validation methods, Confirmatory (CFA), and Exploratory Factor Analyses (EFA) were used. Fit indices and factor loadings were used to improve the models. The final models were assessed for model fit using retest data. The one-factor Sleep Disturbance (CFI = 0.795, TLI = 0.778, RMSEA = 0.117) and Sleep-Related Impairment (CFI = 0.897, TLI = 0.882, RMSEA = 0.156) models could not be replicated in adolescents. Cross-validation resulted in a final Sleep Disturbance model of 23 and a Sleep-Related Impairment model of 11 items. Retest data CFA showed adequate fit for the Sleep-Related Impairment-11 (CFI = 0.981, TLI = 0.976, RMSEA = 0.116). The Sleep Disturbance-23 model fit indices stayed below the recommended values (CFI = 0.895, TLI = 0.885, RMSEA = 0.105). While the PROMIS Sleep Disturbance-23 for adolescents and PROMIS Sleep-Related Impairment-11 for adolescents provide a framework to assess adolescent sleep, additional research is needed to replicate these findings in a larger and more diverse sample.
Sharing the cost of redundant items

DEFF Research Database (Denmark)

Hougaard, Jens Leth; Moulin, Hervé

2014-01-01

We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.
An Introduction to Item Response Theory for Patient-Reported Outcome Measurement

Science.gov (United States)

Nguyen, Tam H.; Han, Hae-Ra; Kim, Miyong T.

2015-01-01

The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient-reported outcome (PRO) measures. Traditionally, the development and validation of these measures has been guided by classical test theory. However, item response theory (IRT), an alternate measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through classical methods. This paper introduces foundational concepts in IRT, as well as commonly used models and their assumptions. Existing data on a combined sample (n = 636) of Korean American and Vietnamese American adults who responded to the High Blood Pressure Health Literacy Scale and the Patient Health Questionnaire-9 are used to exemplify typical applications of IRT. These examples illustrate how IRT can be used to improve the development, refinement, and evaluation of PRO measures. Greater use of methods based on this framework can increase the accuracy and efficiency with which PROs are measured. PMID:24403095
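For orientation, two commonly used dichotomous IRT models can be written as below. These are the standard textbook forms rather than equations quoted from the paper, and the symbols are notation assumed here: \theta_j is the latent trait level of respondent j, b_i the difficulty (location) of item i, and a_i its discrimination.

```latex
% One-parameter logistic (Rasch) model
P(X_{ij} = 1 \mid \theta_j) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}

% Two-parameter logistic (2PL) model
P(X_{ij} = 1 \mid \theta_j) = \frac{\exp\bigl(a_i(\theta_j - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_j - b_i)\bigr)}
```

Setting a_i = 1 for every item in the second expression recovers the first, which is why the Rasch model is often described as a constrained special case of the 2PL.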
Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

Science.gov (United States)

Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

2017-11-01

Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.
Psychometric evaluation of Persian Nomophobia Questionnaire: Differential item functioning and measurement invariance across gender.

Science.gov (United States)

Lin, Chung-Ying; Griffiths, Mark D; Pakpour, Amir H

2018-03-01

Background and aims: Research examining problematic mobile phone use has increased markedly over the past 5 years and has been related to "no mobile phone phobia" (so-called nomophobia). The 20-item Nomophobia Questionnaire (NMP-Q) is the only instrument that assesses nomophobia with an underlying theoretical structure and robust psychometric testing. This study aimed to confirm the construct validity of the Persian NMP-Q using Rasch and confirmatory factor analysis (CFA) models. Methods: After ensuring the linguistic validity, Rasch models were used to examine the unidimensionality of each Persian NMP-Q factor among 3,216 Iranian adolescents and CFAs were used to confirm its four-factor structure. Differential item functioning (DIF) and multigroup CFA were used to examine whether males and females interpreted the NMP-Q similarly, including item content and NMP-Q structure. Results: Each factor was unidimensional according to the Rasch findings, and the four-factor structure was supported by CFA. Two items did not quite fit the Rasch models (Item 14: "I would be nervous because I could not know if someone had tried to get a hold of me;" Item 9: "If I could not check my smartphone for a while, I would feel a desire to check it"). No DIF items were found across gender and measurement invariance was supported in multigroup CFA across gender. Conclusions: Due to the satisfactory psychometric properties, it is concluded that the Persian NMP-Q can be used to assess nomophobia among adolescents. Moreover, NMP-Q users may compare its scores between genders in the knowledge that there are no score differences contributed by different understandings of NMP-Q items.
Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

Science.gov (United States)

Gierl, Mark J.; Lai, Hollis

2013-01-01

Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

Measuring organizational effectiveness in information and communication technology companies using item response theory.

Science.gov (United States)

Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

2012-01-01

The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the manager's point of view, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness in order to build the questionnaire items, which were submitted to specialists for evaluation. The approach proved viable for measuring the organizational effectiveness of ICT companies from a manager's point of view using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and to place items and respondents on a single scale, which is not possible when using other similar tools.
Language-related differential item functioning between English and German PROMIS Depression items is negligible.

Science.gov (United States)

Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

2017-12-01

To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as the criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.
Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

Directory of Open Access Journals (Sweden)

Knol Dirk L

2011-09-01

Abstract. Background: For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods: Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results: All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions: The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.
Image-Based Collection and Measurements for Construction Pay Items

Science.gov (United States)

2017-07-01

Prior to each payment to contractors and suppliers, measurements are made to document the actual amount of pay items placed at the site. This manual process has substantial risk for personnel, and could be made more efficient and less prone to human ...
Improving a measure of mobility-related fatigue (the mobility-tiredness scale) by establishing item intensity

DEFF Research Database (Denmark)

Fieo, Robert A; Mortensen, Erik L; Rantanen, Taina

2013-01-01

To improve the construct validity of self-reported fatigue by establishing a formal hierarchy of scale items and to determine whether such a hierarchy could be maintained across time (aged 75-80), sex, and nationality.
Decree of the State Office for Nuclear Safety No. 147/1997 of 17 June 1997 specifying lists of selected nuclear-related items and dual-use nuclear-related items

International Nuclear Information System (INIS)

1997-01-01

The core of the Decree consists of 2 lists, viz (a) the List of Selected Items (selected nuclear-related materials, equipment and technologies) which are subject to control regimes during imports, exports and transit; and (b) the List of Dual-Use Items (dual-use nuclear-related materials, equipment and technologies) which are subject to control regimes during imports and exports. Both Lists are based on the IAEA document INFCIRC/254/Rev.2/Part 2/Mod.1. (P.A.)
Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

Science.gov (United States)

Wan, Lei; Henly, George A.

2012-01-01

Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Why Consumers Misattribute Sponsorships to Non-Sponsor Brands: Differential Roles of Item and Relational Communications.

Science.gov (United States)

Weeks, Clinton S; Humphreys, Michael S; Cornwell, T Bettina

2018-02-01

Brands engaged in sponsorship of events commonly have objectives that depend on consumer memory for the sponsor-event relationship (e.g., sponsorship awareness). Consumers, however, often misattribute sponsorships to nonsponsor competitor brands, indicating erroneous memory for these relationships. The current research uses an item and relational memory framework to reveal that sponsor brands may inadvertently foster this misattribution when they communicate relational linkages to events. Effects can be explained via differential roles of communicating item information (information that supports processing item distinctiveness) versus relational information (information that supports processing relationships among items) in contributing to memory outcomes. Experiment 1 uses event-cued brand recall to show that correct memory retrieval is best supported by communicating relational information when sponsorship relationships are not obvious (low congruence). In contrast, correct retrieval is best supported by communicating item information when relationships are obvious (high congruence). Experiment 2 uses brand-cued event recall to show that, against conventional marketing recommendations, relational information increases misattribution, whereas item information guards against misattribution. Results suggest sponsor brands must distinguish between item and relational communications to enhance correct retrieval and limit misattribution. Methodologically, the work shows that choice of cueing direction is critical in differentially revealing patterns of correct and incorrect retrieval with pair relationships. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

Science.gov (United States)

Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

2018-02-01

Objectives: (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design: IRT evaluation of prospective cohort data. Setting: Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods: Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results: The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items and were removed in stages, creating an 8- and a 3-item Inner EAR scale for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion: The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.
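How item information can guide the removal of less useful items may be sketched as follows. This assumes a two-parameter logistic model, which the abstract does not confirm (the instrument may use a polytomous model); the item names and location values are hypothetical, and the numbers 1.63 and 4.52 are borrowed from the abstract only as illustrative discrimination-like values.

```python
import math

def p_2pl(theta, a, b):
    """Response probability under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical items given as (discrimination a, location b).
items = {
    "item_1": (1.63, 0.0),
    "item_2": (4.52, 0.0),
    "item_3": (2.50, 1.0),
}

# Rank items by the information they provide at one trait level.
theta = 0.0
ranked = sorted(
    items,
    key=lambda name: item_information(theta, *items[name]),
    reverse=True,
)
print(ranked)
```

Items at the bottom of such a ranking across the trait range of interest are natural candidates for removal when a shorter form is built, although in practice content coverage would also be weighed.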
The Single-Item Math Anxiety Scale: An Alternative Way of Measuring Mathematical Anxiety

Science.gov (United States)

Núñez-Peña, M. Isabel; Guilera, Georgina; Suárez-Pellicioni, Macarena

2014-01-01

This study examined whether the Single-Item Math Anxiety Scale (SIMA), based on the item suggested by Ashcraft, provided valid and reliable scores of mathematical anxiety. A large sample of university students (n = 279) was administered the SIMA and the 25-item Shortened Math Anxiety Rating Scale (sMARS) to evaluate the relation between the scores…
Psychometric properties of the Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis

Directory of Open Access Journals (Sweden)

DeVellis Robert

2006-09-01

Background: Measuring health-related quality of life (HRQOL) is important in arthritis and the SF-36v2 is the current state-of-the-art. It is only emerging how well the Centers for Disease Control and Prevention (CDC) HRQOL measures HRQOL for people with arthritis. This study's purpose is to assess the psychometric properties of the 9-item CDC HRQOL (4-item Healthy Days Core Module and 5-item Healthy Days Symptoms Module) in an arthritis sample using the SF-36v2 as a comparison. Methods: In Fall 2002, a cross-sectional study acquired survey data including the CDC HRQOL and SF-36v2 from 2 North Carolina populations of adult patients reporting osteoarthritis, rheumatoid arthritis, and fibromyalgia; 2182 (52%) responded. The first item of both the CDC HRQOL and the SF-36v2 was general health (GEN). All 8 other CDC HRQOL items ask for the number of days in the past 30 days that respondents experienced various aspects of HRQOL. Exploratory principal components analyses (PCA) were conducted on each sample and the combined samples of the CDC HRQOL. The multitrait-multimethod matrix (MTMM) was used to compute correlations between each trait (physical health and mental health) and between each method of measurement (CDC HRQOL and SF36v2). The relative contribution of the CDC HRQOL in predicting the physical component summary (PCS) and the mental component summary (MCS) was determined by regressing the CDC HRQOL items on the PCS and MCS scales. Results: All 9 CDC HRQOL items loaded primarily onto 1 factor (explaining 57% of the item variance), representing a reasonable solution for capturing overall HRQOL. After rotation a 2 factor interpretation for the 9 items was clear, with 4 items capturing physical health (physical, activity, pain, and energy days) and 3 items capturing mental health (mental, depression, and anxiety days). All of the loadings for these two factors were greater than 0.70. The CDC HRQOL physical health factor correlated with PCS (r = -.78, p 2
A psychometric comparison of three scales and a single-item measure to assess sexual satisfaction.

Science.gov (United States)

Mark, Kristen P; Herbenick, Debby; Fortenberry, J Dennis; Sanders, Stephanie; Reece, Michael

2014-01-01

This study was designed to systematically compare and contrast the psychometric properties of three scales developed to measure sexual satisfaction and a single-item measure of sexual satisfaction. The Index of Sexual Satisfaction (ISS), Global Measure of Sexual Satisfaction (GMSEX), and the New Sexual Satisfaction Scale-Short (NSSS-S) were compared to one another and to a single-item measure of sexual satisfaction. Conceptualization of the constructs, distribution of scores, internal consistency, convergent validity, test-retest reliability, and factor structure were compared between the measures. A total of 211 men and 214 women completed the scales and a measure of relationship satisfaction, with 33% (n = 139) of the sample reassessed two months later. All scales demonstrated appropriate distribution of scores and adequate internal consistency. The GMSEX, NSSS-S, and the single-item measure demonstrated convergent validity. Test-retest reliability was demonstrated by the ISS, GMSEX, and NSSS-S, but not the single-item measure. Taken together, the GMSEX received the strongest psychometric support in this sample for a unidimensional measure of sexual satisfaction and the NSSS-S received the strongest psychometric support in this sample for a bidimensional measure of sexual satisfaction.
The processing of inter-item relations as a moderating factor of retrieval-induced forgetting

OpenAIRE

Tempel, Tobias; Wippich, Werner

2012-01-01

We investigated influences of item generation and emotional valence on retrieval-induced forgetting. Drawing on postulates of the three-factor theory of generation effects, generation tasks differentially affecting the processing of inter-item relations were applied. Whereas retrieval-induced forgetting of freely generated items was moderated by the emotional valence as well as retrieval-induced forgetting of read items, even though in the reverse direction (Experiment 1), fragment completion...
The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

Science.gov (United States)

Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

2014-05-01

To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.
The Role of Item Models in Automatic Item Generation

Science.gov (United States)

Gierl, Mark J.; Lai, Hollis

2012-01-01

Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…
Robustness of two single-item self-esteem measures: cross-validation with a measure of stigma in a sample of psychiatric patients.

Science.gov (United States)

Bagley, Christopher

2005-08-01

Robins' Single-item Self-esteem Inventory was compared with a single item from the Coopersmith Self-esteem. Although a new scoring format was used, there was good evidence of cross-validation in 83 current and former psychiatric patients who completed Harvey's adapted measure of stigma felt and experienced by users of mental health services. Scores on the two single-item self-esteem measures correlated .76 (p self-esteem in users of mental health services.
Measuring the quality of life in hypertension according to Item Response Theory.

Science.gov (United States)

Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; Andrade, Dalton Francisco de; Barbetta, Pedro Alberto; Souza, Ana Célia Caetano de; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia

2017-05-04

To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL - Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used Samejima's Graded Response Model. The analyses were conducted using the free software R with the aid of the psych and mirt packages. The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed less psychometric information in the MINICHAL. We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new sides of this instrument that have not yet been addressed in previous studies.
Screening for HIV-related PTSD: sensitivity and specificity of the 17-item Posttraumatic Stress Diagnostic Scale (PDS) in identifying HIV-related PTSD among a South African sample.

Science.gov (United States)

Martin, L; Fincham, D; Kagee, A

2009-11-01

The identification of HIV-positive patients who exhibit criteria for Posttraumatic Stress Disorder (PTSD) and related trauma symptomatology is of clinical importance in the maintenance of their overall wellbeing. This study assessed the sensitivity and specificity of the 17-item Posttraumatic Stress Diagnostic Scale (PDS), a self-report instrument, in the detection of HIV-related PTSD. An adapted version of the PTSD module of the Composite International Diagnostic Interview (CIDI) served as the gold standard. 85 HIV-positive patients diagnosed with HIV within the year preceding data collection were recruited by means of convenience sampling from three HIV clinics within primary health care facilities in the Boland region of South Africa. A significant association was found between the 17-item PDS and the adapted PTSD module of the CIDI. A ROC curve analysis indicated that the 17-item PDS correctly discriminated between PTSD caseness and non-caseness 74.9% of the time. Moreover, a PDS cut-off point of ≥ 15 yielded adequate sensitivity (68%) and 1-specificity (65%). The 17-item PDS demonstrated a PPV of 76.0% and a NPV of 56.7%. The 17-item PDS can be used as a brief screening measure for the detection of HIV-related PTSD among HIV-positive patients in South Africa.
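The screening statistics named in this abstract (sensitivity, specificity, PPV, NPV) all derive from a 2x2 table of screening result against gold-standard diagnosis. The following is a minimal sketch of those definitions only. The function name and the counts are hypothetical illustrations and are not the study's data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return standard screening statistics from 2x2 table counts.

    tp: screen positive, gold standard positive
    fp: screen positive, gold standard negative
    fn: screen negative, gold standard positive
    tn: screen negative, gold standard negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # share of positive screens that are cases
        "npv": tn / (tn + fn),          # share of negative screens that are non-cases
    }


# Hypothetical counts chosen only to exercise the function.
metrics = screening_metrics(tp=30, fp=15, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.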
Bayesian modeling of measurement error in predictor variables using item response theory

NARCIS (Netherlands)

Fox, Gerardus J.A.; Glas, Cornelis A.W.

2000-01-01

This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved
Item validity vs. item discrimination index: a redundancy?

Science.gov (United States)

Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

2018-03-01

In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), each with a different formula. Meanwhile, other sources state that the item discrimination index can be obtained by calculating the correlation between the testee’s score on a particular item and the testee’s score on the overall test, which is actually the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts might overlap, since both reflect the quality of the test in measuring the examinees’ ability. In this paper, examples of data-processing results for item validity and the item discrimination index are compared. The paper discusses whether item validity and the item discrimination index can be represented by only one of them, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses where test analyses are included.
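The two quantities contrasted in this abstract can be computed side by side. The sketch below assumes the common upper-group minus lower-group definition of D, with 27% groups, and treats item validity as the Pearson correlation between item score and total score. The abstract does not state which formulas the authors used, so both choices are assumptions, and the response data are hypothetical.

```python
import math


def discrimination_index(item, total, fraction=0.27):
    """Upper-minus-lower D: difference in proportion correct between the
    top and bottom `fraction` of examinees ranked by total score."""
    n = len(item)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item[i] for i in upper) / k
    p_lower = sum(item[i] for i in lower) / k
    return p_upper - p_lower


def item_total_correlation(item, total):
    """Pearson correlation between item score and total test score."""
    n = len(item)
    mean_x = sum(item) / n
    mean_y = sum(total) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(item, total))
    var_x = sum((x - mean_x) ** 2 for x in item)
    var_y = sum((y - mean_y) ** 2 for y in total)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: 10 examinees, one dichotomous item, total test scores.
item_scores = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
total_scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

print("D:", discrimination_index(item_scores, total_scores))
print("r:", round(item_total_correlation(item_scores, total_scores), 3))
```

Both statistics rise when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract questions. D uses only the extreme groups, whereas the correlation uses every examinee.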

Psychometric evaluation of the pediatric and parent-proxy Patient-Reported Outcomes Measurement Information System and the Neurology and Traumatic Brain Injury Quality of Life measurement item banks in pediatric traumatic brain injury.

Science.gov (United States)

Bertisch, Hilary; Rivara, Frederick P; Kisala, Pamela A; Wang, Jin; Yeates, Keith Owen; Durbin, Dennis; Zonfrillo, Mark R; Bell, Michael J; Temkin, Nancy; Tulsky, David S

2017-07-01

The primary objective is to provide evidence of convergent and discriminant validity for the pediatric and parent-proxy versions of the Patient-Reported Outcomes Measurement Information System (PROMIS) Anxiety, Depression, Anger, Peer Relations, Mobility, Pain Interference, and Fatigue item banks, the Neurology Quality of Life measurement system (Neuro-QOL) Cognition-General Concerns and Stigma item banks, and the Traumatic Brain Injury Quality of Life (TBI-QOL) Executive Function and Headache item banks in a pediatric traumatic brain injury (TBI) sample. Participants were 134 parent-child (ages 8-18 years) dyads. Children all sustained TBI and the dyads completed outcome ratings 6 months after injury at one of six medical centers across the United States. Ratings included PROMIS, Neuro-QOL, and TBI-QOL item banks, as well as the Pediatric Quality of Life inventory (PedsQL), the Health Behavior Inventory (HBI), and the Strengths and Difficulties Questionnaire (SDQ) as legacy criterion measures against which these item banks were validated. The PROMIS, Neuro-QOL, and TBI-QOL item banks demonstrated good convergent validity, as evidenced by moderate to strong correlations with comparable scales on the legacy measures. PROMIS, Neuro-QOL, and TBI-QOL item banks showed weaker correlations with ratings of unrelated constructs on legacy measures, providing evidence of discriminant validity. Our results indicate that the constructs measured by the PROMIS, Neuro-QOL, and TBI-QOL item banks are valid in our pediatric TBI sample and that it is appropriate to use these standardized scores for our primary study analyses.
The importance of rating scale design in the measurement of patient-reported outcomes using questionnaires or item banks.

Science.gov (United States)

Khadka, Jyoti; McAlinden, Colm; Gothwal, Vijaya K; Lamoureux, Ecosse L; Pesudovs, Konrad

2012-06-26

To investigate the effect of rating scale designs (question formats and response categories) on item difficulty calibrations and assess the impact that rating scale differences have on overall vision-related activity limitation (VRAL) scores. Sixteen existing patient-reported outcome instruments (PROs) suitable for cataract assessment, with different rating scales, were self-administered by patients on a cataract surgery waiting list. A total of 226 VRAL items from these PROs in their native rating scales were included in an item bank and calibrated using Rasch analysis. Fifteen item/content areas (e.g., reading newspapers) appearing in at least three different PROs were identified. Within each content area, item calibrations were compared and their range calculated. Similarly, five PROs having at least three items in common with the Visual Function (VF-14) were compared in terms of average item measures. A total of 614 patients (mean age ± SD, 74.1 ± 9.4 years) participated. Items with the same content varied in their calibration by as much as two logits; "reading the small print" had the largest range (1.99 logits) followed by "watching TV" (1.60). Compared with the VF-14 (0.00 logits), the rating scale of the Visual Disability Assessment (1.13 logits) produced the most difficult items and the Cataract Symptom Scale (0.24 logits) produced the least difficult items. The VRAL item bank was suboptimally targeted to the ability level of the participants (2.00 logits). Rating scale designs have a significant effect on item calibrations. Therefore, constructing item banks from existing items in their native formats carries risks to face validity and transmission of problems inherent in existing instruments, such as poor targeting.
Item information and discrimination functions for trinary PCM items

NARCIS (Netherlands)

Akkermans, Wies; Muraki, Eiji

1997-01-01

For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
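The unimodal/bimodal condition stated in this abstract can be checked numerically. The sketch below assumes the standard partial credit model with unit slope, in which item information equals the conditional variance of the item score. It evaluates the information on a grid and counts local maxima. The function names, grid settings, and step-parameter values are illustrative choices, not taken from the paper.

```python
import math


def pcm_probs(theta, deltas):
    """Category probabilities for a partial credit item with step
    parameters `deltas` (categories 0..len(deltas))."""
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta - delta))
    shift = max(exponents)  # subtract the max for numerical stability
    numerators = [math.exp(e - shift) for e in exponents]
    total = sum(numerators)
    return [v / total for v in numerators]


def pcm_information(theta, deltas):
    """Item information, computed as the variance of the item score."""
    probs = pcm_probs(theta, deltas)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))


def count_modes(deltas, lo=-10.0, hi=10.0, steps=4001):
    """Count local maxima of the information function on a grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    info = [pcm_information(t, deltas) for t in grid]
    return sum(
        1 for i in range(1, steps - 1) if info[i - 1] < info[i] >= info[i + 1]
    )


threshold = 4 * math.log(2)  # about 2.77
for deltas in ([-1.0, 1.0], [-2.0, 2.0]):
    gap = deltas[1] - deltas[0]
    side = "below" if gap < threshold else "above"
    print(f"gap {gap} is {side} 4 ln 2; modes found: {count_modes(deltas)}")
```

A grid search of this kind only illustrates the result for particular parameter values; it is not a substitute for the analytical argument the paper refers to.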
Item response theory - A first approach

Science.gov (United States)

Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

2017-07-01

The Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, the Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
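The one-, two- and three-parameter logistic models mentioned in this abstract share a single functional form. The sketch below uses the common parameterization with difficulty b, discrimination a, and lower asymptote (pseudo-guessing) c, and omits the optional scaling constant that some texts include. The function name and parameter values are illustrative and do not come from the paper.

```python
import math


def item_response_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: person ability
    b: item difficulty
    a: item discrimination (a = 1 and c = 0 gives the 1PL form)
    c: lower asymptote (c = 0 gives the 2PL form)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Illustrative parameter values for the three nested models.
for theta in (-2.0, 0.0, 2.0):
    p1 = item_response_probability(theta, b=0.5)
    p2 = item_response_probability(theta, b=0.5, a=1.2)
    p3 = item_response_probability(theta, b=0.5, a=1.2, c=0.2)
    print(f"theta={theta:+.1f}  1PL={p1:.3f}  2PL={p2:.3f}  3PL={p3:.3f}")
```

When ability equals difficulty, the 1PL and 2PL curves pass through 0.5, while the 3PL curve passes through c + (1 - c)/2.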
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

Science.gov (United States)

Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

2016-01-01

Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

Science.gov (United States)

Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

2017-01-01

Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and
77 FR 75187 - Certain Food Containers, Cups, Plates, Cutlery, and Related Items and Packaging Thereof...

Science.gov (United States)

2012-12-19

... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-835] Certain Food Containers, Cups, Plates, Cutlery, and Related Items and Packaging Thereof; Commission Determination Not To Review an... containers, cups, plates, cutlery, and related items and packaging thereof by reason of infringement of U.S...
Development of an assessment tool to measure students' perceptions of respiratory care education programs: Item generation, item reduction, and preliminary validation

Directory of Open Access Journals (Sweden)

Ghazi Alotaibi

2013-01-01

Objectives: Students who perceive their learning environment positively are more likely to develop effective learning strategies and adopt a deep learning approach. Currently, there is no validated instrument for measuring the educational environment of educational programs on respiratory care (RC). The aim of this study was to develop an instrument to measure students' perception of the RC educational environment. Materials and Methods: Based on the literature review and an assessment of content validity by multiple focus groups of RC educationalists, potential items of the instrument relevant to the RC educational environment construct were generated by the research group. The initial 71-item questionnaire was then field-tested on all students from the 3 RC programs in Saudi Arabia and was subjected to multi-trait scaling analysis. Cronbach's alpha was used to assess internal consistency reliabilities. Results: Two hundred and twelve students (100%) completed the survey. The initial instrument of 71 items was reduced to 65 across 5 scales. Convergent and discriminant validity assessment demonstrated that the majority of items correlated more highly with their intended scale than a competing one. Cronbach's alpha exceeded the standard criterion of >0.70 in all scales except one. There was no floor or ceiling effect for scale or overall score. Conclusions: This instrument is the first assessment tool developed to measure the RC educational environment. There was evidence of its good feasibility, validity, and reliability. This first validation of the instrument supports its use by RC students to evaluate educational environment.
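Cronbach's alpha, used in this study to assess internal consistency, compares the sum of the item variances with the variance of the total score. The following is a minimal sketch of the standard formula using sample variances. The function name and the response matrix are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents answering 3 items.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 3, 4],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

The function assumes complete data with at least two items and a nonzero total-score variance; missing responses would need handling before the calculation.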
Gender-Based Differential Item Performance in Mathematics Achievement Items.

Science.gov (United States)

Doolittle, Allen E.; Cleary, T. Anne

1987-01-01

Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)
Memory in pregnancy and post-partum: Item specific and relational encoding processes in recall and recognition.

Science.gov (United States)

Spataro, Pietro; Saraulli, Daniele; Oriolo, Debora; Costanzi, Marco; Zanetti, Humberto; Cestari, Vincenzo; Rossi-Arnaud, Clelia

2016-08-01

It has been recently proposed that pregnant women would perform memory tasks by focusing more on item-specific processes and less on relational processing, compared to post-partum women (Mickes, Wixted, Shapiro & Scarff, ). The present cross-sectional study tested this hypothesis by directly manipulating the type of encoding employed in the study phase. Pregnant, post-partum and control women either rated the pleasantness of word meaning (which induced item-specific elaboration) or named the semantic category to which they belonged (which induced relational elaboration). Memory for the encoded words was later tested in free recall (which emphasizes relational processing) and in recognition (which emphasizes item-specific processing). In line with Mickes et al.'s () conclusions, pregnant women in the item-specific condition performed worse than post-partum women in the relational condition in free recall, but not in recognition. However, compared to the other two groups, pregnant women also exhibited lower recognition accuracy in the item-specific condition. Overall, these results confirm that pregnant women rely on relational encoding less than post-partum women, but additionally suggest that the former group might use item-specific processes less efficiently than post-partum and control women. © 2016 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
DRD4 long allele carriers show heightened attention to high-priority items relative to low-priority items.

Science.gov (United States)

Gorlick, Marissa A; Worthy, Darrell A; Knopik, Valerie S; McGeary, John E; Beevers, Christopher G; Maddox, W Todd

2015-03-01

Humans with seven or more repeats in exon III of the DRD4 gene (long DRD4 carriers) sometimes demonstrate impaired attention, as seen in attention-deficit hyperactivity disorder, and at other times demonstrate heightened attention, as seen in addictive behavior. Although the clinical effects of DRD4 are the focus of much work, this gene may not necessarily serve as a "risk" gene for attentional deficits, but as a plasticity gene where attention is heightened for priority items in the environment and impaired for minor items. Here we examine the role of DRD4 in two tasks that benefit from selective attention to high-priority information. We examine a category learning task where performance is supported by focusing on features and updating verbal rules. Here, selective attention to the most salient features is associated with good performance. In addition, we examine the Operation Span (OSPAN) task, a working memory capacity task that relies on selective attention to update and maintain items in memory while also performing a secondary task. Long DRD4 carriers show superior performance relative to short DRD4 homozygotes (six or less tandem repeats) in both the category learning and OSPAN tasks. These results suggest that DRD4 may serve as a "plasticity" gene where individuals with the long allele show heightened selective attention to high-priority items in the environment, which can be beneficial in the appropriate context.
Further Examination of Job-Related Social Skills Measures for Adolescents and Young Adults with Emotional and Behavioral Disorders.

Science.gov (United States)

Bullis, Michael; Davis, Cheryl

1996-01-01

This study conducted item reduction analyses on two measures of job-related social behavior for adolescents and young adults with emotional/behavioral disorders (Scale of Job-Related Social Skill Knowledge and Scale of Job-Related Social Skill Performance). The shortened measures contained 40 and 94 items, respectively. Reliability was…
Dutch-Flemish translation of nine pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS)®.

Science.gov (United States)

Haverman, Lotte; Grootenhuis, Martha A; Raat, Hein; van Rossum, Marion A J; van Dulmen-den Broeder, Eline; Hoppenbrouwers, Karel; Correia, Helena; Cella, David; Roorda, Leo D; Terwee, Caroline B

2016-03-01

The Patient-Reported Outcomes Measurement Information System (PROMIS®) is a new, state-of-the-art assessment system for measuring patient-reported health and well-being of adults and children. It has the potential to be more valid, reliable, and responsive than existing PROMs. The item banks are designed to be self-reported and completed by children aged 8-18 years. The PROMIS items can be administered in short forms or through computerized adaptive testing. This paper describes the translation and cultural adaptation of nine PROMIS item banks (151 items) for children in Dutch-Flemish. The translation was performed by FACITtrans using standardized PROMIS methodology and approved by the PROMIS Statistical Center. The translation included four forward translations, two back-translations, three independent reviews (at least two Dutch, one Flemish), and pretesting in 24 children from the Netherlands and Flanders. For some items, it was necessary to have separate translations for Dutch and Flemish: physical function-mobility (three items), anger (one item), pain interference (two items), and asthma impact (one item). Challenges faced in the translation process included scarcity or overabundance of possible translations, unclear item descriptions, constructs broader/smaller in the target language, difficulties in rank ordering items, differences in unit of measurement, irrelevant items, or differences in performance of activities. By addressing these challenges, acceptable translations were obtained for all items. The Dutch-Flemish PROMIS items are linguistically equivalent to the original USA version. Short forms are now available for use, and entire item banks are ready for cross-cultural validation in the Netherlands and Flanders.
A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents.

Science.gov (United States)

Erhart, M; Hagquist, C; Auquier, P; Rajmil, L; Power, M; Ravens-Sieberer, U

2010-07-01

This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.
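The two classical test theory statistics named under approach A can be sketched as follows. This is a minimal illustration in Python on made-up scores, not the KIDSCREEN analysis; the function names and the toy data are assumptions introduced here.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                    # number of items
    item_vars = [variance(col) for col in zip(*rows)]   # sample variance per item
    total_var = variance([sum(r) for r in rows])        # variance of the sum score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows, j):
    """Pearson correlation between item j and the sum of the remaining items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]                # sum score without item j
    mi, mr = mean(item), mean(rest)
    cov = sum((a - mi) * (b - mr) for a, b in zip(item, rest))
    ss_item = sum((a - mi) ** 2 for a in item)
    ss_rest = sum((b - mr) ** 2 for b in rest)
    return cov / sqrt(ss_item * ss_rest)


# Hypothetical responses: 5 respondents x 3 items.
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(toy)
r_item0 = corrected_item_total(toy, 0)
```

In an alpha-maximizing reduction, items with a low corrected item-total correlation would be the usual candidates for removal; the exact stopping rule used in the study is not given in the abstract.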
Commercial grade item (CGI) dedication of generators for nuclear safety related applications

International Nuclear Information System (INIS)

Das, R.K.; Hajos, L.G.

1993-01-01

The number of nuclear safety related equipment suppliers and the availability of spare and replacement parts designed specifically for nuclear safety related application are shrinking rapidly. This has made it necessary for utilities to apply commercial grade spare and replacement parts in nuclear safety related applications after implementing a proper acceptance and dedication process to verify that such items conform with the requirements of their use in nuclear safety related application. The general guidelines for commercial grade item (CGI) acceptance and dedication are provided in US Nuclear Regulatory Commission (NRC) Generic Letters and Electric Power Research Institute (EPRI) Report NP-5652, Guideline for the Utilization of Commercial Grade Items in Nuclear Safety Related Applications. This paper presents an application of these generic guidelines for procurement, acceptance, and dedication of a commercial grade generator for use as a standby generator at Salem Generating Station Units 1 and 2. The paper identifies the critical characteristics of the generator which, once verified, will provide reasonable assurance that the generator will perform its intended safety function. The paper also delineates the method of verification of the critical characteristics through tests and provides acceptance criteria for the test results. The methodology presented in this paper may be used as specific guidelines for reliable and cost effective procurement and dedication of commercial grade generators for use as standby generators at nuclear power plants.
Validation of a 10-item care-related regret intensity scale (RIS-10) for health care professionals.

Science.gov (United States)

Courvoisier, Delphine S; Cullati, Stéphane; Haller, Chiara S; Schmidt, Ralph E; Haller, Guy; Agoritsas, Thomas; Perneger, Thomas V

2013-03-01

Regret after one of the many decisions and interventions that health care professionals make every day can have an impact on their own health and quality of life, and on their patient care practices. To validate a new care-related regret intensity scale (RIS) for health care professionals. Retrospective cross-sectional cohort study with a 1-month follow-up (test-retest) in a French-speaking University Hospital. A total of 469 nurses and physicians responded to the survey, and 175 answered the retest. RIS, self-report questions on the context of the regret-inducing event, its consequences for the patient, involvement of the health care professionals, and changes in patient care practices after the event. We measured the impact of regret intensity on health care professionals with the satisfaction with life scale, the SF-36 first question (self-reported health), and a question on self-esteem. On the basis of factor analysis and item response analysis, the initial 19-item scale was shortened to 10 items. The resulting scale (RIS-10) was unidimensional and had high internal consistency (α=0.87) and acceptable test-retest reliability (0.70). Higher regret intensity was associated with (a) more consequences for the patient; (b) lower life satisfaction and poorer self-reported health in health care professionals; and (c) changes in patient care practices. Nurses reported analyzing the event and apologizing, whereas physicians reported talking preferentially to colleagues, rather than to their supervisor, about changing practices. The RIS is a valid and reliable measure of care-related regret intensity for hospital-based physicians and nurses.
Item wording and internal consistency of a measure of cohesion: the group environment questionnaire.

Science.gov (United States)

Eys, Mark A; Carron, Albert V; Bray, Steven R; Brawley, Lawrence R

2007-06-01

A common practice for counteracting response acquiescence in psychological measures has been to employ both negatively and positively worded items. However, previous research has highlighted that the reliability of measures can be affected by this practice (Spector, 1992). The purpose of the present study was to examine the effect that the presence of negatively worded items has on the internal reliability of the Group Environment Questionnaire (GEQ). Two samples (N = 276) were utilized, and participants were asked to complete the GEQ (original and revised) on separate occasions. Results demonstrated that the revised questionnaire (containing all positively worded items) had significantly higher Cronbach alpha values for three of the four dimensions of the GEQ. Implications, alternatives, and future directions are discussed.
Development and Validation of a Novel Generic Health-related Quality of Life Instrument With 20 Items (HINT-20)

Directory of Open Access Journals (Sweden)

Min-Woo Jo

2017-01-01

Objectives: Few attempts have been made to develop a generic health-related quality of life (HRQoL) instrument and to examine its validity and reliability in Korea. We aimed to do this in our present study. Methods: After a literature review of existing generic HRQoL instruments, a focus group discussion, in-depth interviews, and expert consultations, we selected 30 tentative items for a new HRQoL measure. These items were evaluated by assessing their ceiling effects, difficulty, and redundancy in the first survey. To validate the HRQoL instrument that was developed, known-groups validity and convergent/discriminant validity were evaluated and its test-retest reliability was examined in the second survey. Results: Of the 30 items originally assessed for the HRQoL instrument, four were excluded due to high ceiling effects and six were removed due to redundancy. We ultimately developed a HRQoL instrument with a reduced number of 20 items, known as the Health-related Quality of Life Instrument with 20 items (HINT-20), incorporating physical, mental, social, and positive health dimensions. The results of the HINT-20 for known-groups validity were poorer in women, the elderly, and those with a low income. For convergent/discriminant validity, the correlation coefficients of items (except vitality) in the physical health dimension with the physical component summary of the Short Form 36 version 2 (SF-36v2) were generally higher than the correlations of those items with the mental component summary of the SF-36v2, and vice versa. Regarding test-retest reliability, the intraclass correlation coefficient of the total HINT-20 score was 0.813 (p<0.001). Conclusions: A novel generic HRQoL instrument, the HINT-20, was developed for the Korean general population and showed acceptable validity and reliability.
77 FR 14423 - Certain Food Containers, Cups, Plates, Cutlery, and Related Items, and Packaging Thereof; Notice...

Science.gov (United States)

2012-03-09

... INTERNATIONAL TRADE COMMISSION [DN 2883] Certain Food Containers, Cups, Plates, Cutlery, and... Containers, Cups, Plates, Cutlery, and Related Items, and Packaging Thereof, DN 2883; the Commission is... importation of certain food containers, cups, plates, cutlery, and related items, and packaging thereof. The...
The PROMIS fatigue item bank has good measurement properties in patients with fibromyalgia and severe fatigue.

Science.gov (United States)

Yost, Kathleen J; Waller, Niels G; Lee, Minji K; Vincent, Ann

2017-06-01

Efficient management of fibromyalgia (FM) requires precise measurement of FM-specific symptoms. Our objective was to assess the measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) fatigue item bank (FIB) in people with FM. We applied classical psychometric and item response theory methods to cross-sectional PROMIS-FIB data from two samples. Data on the clinical FM sample were obtained at a tertiary medical center. Data for the U.S. general population sample were obtained from the PROMIS network. The full 95-item bank was administered to both samples. We investigated dimensionality of the item bank in both samples by separately fitting a bifactor model with two group factors: experience and impact. We assessed measurement invariance between samples, and we explored an alternate factor structure with the normative sample and subsequently confirmed that structure in the clinical sample. Finally, we assessed whether reporting FM subdomain scores added value over reporting a single total score. The item bank was dominated by a general fatigue factor. The fit of the initial bifactor model and evidence of measurement invariance indicated that the same constructs were measured across the samples. An alternative bifactor model with three group factors demonstrated slightly improved fit. Subdomain scores add value over a total score. We demonstrated that the PROMIS-FIB is appropriate for measuring fatigue in clinical samples of FM patients. The construct can be represented by a single score; however, subdomain scores for the three group factors identified in the alternative model may also be reported.

What’s hampering measurement invariance: Detecting non-invariant items using clusterwise simultaneous component analysis

Directory of Open Access Journals (Sweden)

Kim De Roover

2014-06-01

The issue of measurement invariance is ubiquitous in the behavioral sciences nowadays as more and more studies yield multivariate multigroup data. When measurement invariance cannot be established across groups, this is often due to different loadings on only a few items. Within the multigroup CFA framework, methods have been proposed to trace such non-invariant items, but these methods have some disadvantages in that they require researchers to run a multitude of analyses and in that they imply assumptions that are often questionable. In this paper, we propose an alternative strategy which builds on clusterwise simultaneous component analysis (SCA). Clusterwise SCA, being an exploratory technique, assigns the groups under study to a few clusters based on differences and similarities in the covariance matrices, and thus based on the component structure of the items. Non-invariant items can then be traced by comparing the cluster-specific component loadings via congruence coefficients, which is far more parsimonious than comparing the component structure of all separate groups. In this paper we present a heuristic for this procedure. Afterwards, one can return to the multigroup CFA framework and check whether removing the non-invariant items or removing some of the equality restrictions for these items yields satisfactory invariance test results. An empirical application concerning cross-cultural emotion data is used to demonstrate that this novel approach is useful and can co-exist with the traditional CFA approaches.
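The comparison of cluster-specific loadings via congruence coefficients might look roughly like the sketch below. It assumes Tucker's congruence coefficient applied to each item's loading vector across two clusters; the loadings, the item names, and the 0.85 flagging threshold are hypothetical and are not taken from the paper's heuristic.

```python
from math import sqrt


def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


# Hypothetical loadings of three items on two components, per cluster.
cluster_a = {"item1": [0.8, 0.1], "item2": [0.7, 0.2], "item3": [0.1, 0.9]}
cluster_b = {"item1": [0.8, 0.1], "item2": [0.1, 0.8], "item3": [0.2, 0.9]}

# One coefficient per item: how similar is its loading pattern across clusters?
per_item = {name: congruence(cluster_a[name], cluster_b[name]) for name in cluster_a}

# Assumed cut-off for illustration only; items below it are candidates for non-invariance.
THRESHOLD = 0.85
flagged = [name for name, phi in per_item.items() if phi < THRESHOLD]
```

Here only the second item changes its loading pattern between clusters, so it is the only one flagged for a follow-up check in the multigroup CFA framework.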
The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency

DEFF Research Database (Denmark)

Rose, Matthias; Bjørner, Jakob; Gandek, Barbara

2014-01-01

OBJECTIVE: To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. STUDY DESIGN AND SETTING: The items were evaluated using qualitative and quantitative methods. A total...... response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. RESULTS: The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living...... to identify differences between age and disease groups. CONCLUSION: The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range....
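Norming scores to a mean of 50 and an SD of 10 in a reference sample is a linear rescaling. The sketch below shows that arithmetic only; the function name and the reference values are invented for illustration, and the actual PROMIS calibration works on item parameters estimated in the US general population sample.

```python
from statistics import mean, pstdev


def to_t_scores(scores, reference):
    """Rescale scores so the reference sample has mean 50 and SD 10."""
    m = mean(reference)
    s = pstdev(reference)          # population SD of the reference sample
    return [50 + 10 * (x - m) / s for x in scores]


# Hypothetical latent-trait estimates for a small reference sample.
reference_theta = [-1.5, -0.5, 0.0, 0.5, 1.5]
t_ref = to_t_scores(reference_theta, reference_theta)

# A new respondent one reference SD above the reference mean.
t_new = to_t_scores([1.0], reference_theta)
```

On this metric a respondent one reference SD above the reference mean scores 60, which is what makes scores comparable across instruments built from the same bank.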
A photographic method to measure food item intake. Validation in geriatric institutions.

Science.gov (United States)

Pouyet, Virginie; Cuvelier, Gérard; Benattar, Linda; Giboreau, Agnès

2015-01-01

From both a clinical and research perspective, measuring food intake is an important issue in geriatric institutions. However, weighing food in this context can be complex, particularly when the items remaining on a plate (side dish, meat or fish and sauce) need to be weighed separately following consumption. A method based on photography that involves taking photographs after a meal to determine food intake consequently seems to be a good alternative. This method enables the storage of raw data so that unhurried analyses can be performed to distinguish the food items present in the images. Therefore, the aim of this paper was to validate a photographic method to measure food intake in terms of differentiating food item intake in the context of a geriatric institution. Sixty-six elderly residents took part in this study, which was performed in four French nursing homes. Four dishes of standardized portions were offered to the residents during 16 different lunchtimes. Three non-trained assessors then independently estimated both the total and specific food item intakes of the participants using images of their plates taken after the meal (photographic method) and a reference image of one plate taken before the meal. Total food intakes were also recorded by weighing the food. To test the reliability of the photographic method, agreements between different assessors and agreements among various estimates made by the same assessor were evaluated. To test the accuracy and specificity of this method, food intake estimates for the four dishes were compared with the food intakes determined using the weighed food method. To illustrate the added value of the photographic method, food consumption differences between the dishes were explained by investigating the intakes of specific food items. Although they were not specifically trained for this purpose, the results demonstrated that the assessor estimates agreed between assessors and among various estimates made by the same
The Aphasia Communication Outcome Measure (ACOM): Dimensionality, Item Bank Calibration, and Initial Validation

Science.gov (United States)

Hula, William D.; Doyle, Patrick J.; Stone, Clement A.; Hula, Shannon N. Austermann; Kellough, Stacey; Wambaugh, Julie L.; Ross, Katherine B.; Schumacher, James G.; St. Jacque, Ann

2015-01-01

Purpose: The purpose of this study is to investigate the structure and measurement properties of the Aphasia Communication Outcome Measure (ACOM), a patient-reported outcome measure of communicative functioning for persons with aphasia. Method: Three hundred twenty-nine participants with aphasia responded to 177 items asking about communicative…
Using automatic item generation to create multiple-choice test items.

Science.gov (United States)

Gierl, Mark J; Lai, Hollis; Turner, Simon R

2012-08-01

Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
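The third stage, in which software combines content from an item model, can be sketched as a template expanded over every combination of its variable elements. The stem, the element values, and the function name below are hypothetical; a real item model would also carry the answer options, the key, and constraints from the cognitive model that rule out implausible combinations.

```python
from itertools import product

# Hypothetical item model: a stem template plus the values each variable may take.
STEM = ("A {age}-year-old patient reports {symptom} {timing} surgery. "
        "What is the best next step?")
ELEMENTS = {
    "age": [25, 45, 70],
    "symptom": ["fever", "increasing pain"],
    "timing": ["one day after", "one week after"],
}


def generate_stems(stem, elements):
    """Fill the template with every combination of element values."""
    names = list(elements)
    combos = product(*(elements[n] for n in names))
    return [stem.format(**dict(zip(names, combo))) for combo in combos]


stems = generate_stems(STEM, ELEMENTS)
```

The number of generated items is the product of the element counts (3 x 2 x 2 = 12 here), which is how a single item model can yield a large number of items.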
Validation of the MOS Social Support Survey 6-item (MOS-SSS-6) measure with two large population-based samples of Australian women.

Science.gov (United States)

Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J

2014-12-01

This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.
Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

DEFF Research Database (Denmark)

Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

2010-01-01

Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...
Random Item Generation Is Affected by Age

Science.gov (United States)

Multani, Namita; Rudzicz, Frank; Wong, Wing Yiu Stephanie; Namasivayam, Aravind Kumar; van Lieshout, Pascal

2016-01-01

Purpose: Random item generation (RIG) involves central executive functioning. Measuring aspects of random sequences can therefore provide a simple method to complement other tools for cognitive assessment. We examine the extent to which RIG relates to specific measures of cognitive function, and whether those measures can be estimated using RIG…
What's hampering measurement invariance : Detecting non-invariant items using clusterwise simultaneous component analysis

NARCIS (Netherlands)

De Roover, K.; Timmerman, Marieke; De Leersnyder, J.; Mesquita, B.; Ceulemans, Eva

2014-01-01

The issue of measurement invariance is ubiquitous in the behavioral sciences nowadays as more and more studies yield multivariate multigroup data. When measurement invariance cannot be established across groups, this is often due to different loadings on only a few items. Within the multigroup CFA
Cross-cultural measurement invariance in the satisfaction with food-related life scale in older adults from two developing countries.

Science.gov (United States)

Schnettler, Berta; Miranda-Zapata, Edgardo; Lobos, Germán; Lapo, María; Grunert, Klaus G; Adasme-Berríos, Cristian; Hueche, Clementina

2017-05-30

Nutrition is one of the major determinants of successful aging. The Satisfaction with Food-related Life (SWFL) scale measures a person's overall assessment regarding their food and eating habits. The SWFL scale has been used in older adult samples across different countries in Europe, Asia and America; however, no studies have evaluated the cross-cultural measurement invariance of the scale in older adult samples. Therefore, we evaluated the measurement invariance of the SWFL scale across older adults from Chile and Ecuador. Stratified random sampling was used to recruit a sample of older adults of both genders from Chile (mean age = 71.38, SD = 6.48, range = 60-92) and from Ecuador (mean age = 73.70, SD = 7.45, range = 60-101). Participants reported their levels of satisfaction with food-related life by completing the SWFL scale, which consists of five items grouped into a single dimension. Confirmatory factor analysis (CFA) was used to examine cross-cultural measurement invariance of the SWFL scale. Results showed that the SWFL scale exhibited partial measurement invariance, with invariance of all factor loadings, invariance in all but one item's threshold (item 1) and invariance in all items' uniqueness (residuals), which leads us to conclude that there is a reasonable level of partial measurement invariance for the CFA model of the SWFL scale when comparing the Chilean and Ecuadorian older adult samples. The lack of invariance in item 1 confirms previous studies with adults and emerging adults in Chile that suggest this item is culture-sensitive. We recommend revising the wording of the first item of the SWFL in order to relate the statement with the person's life. The SWFL scale shows partial measurement invariance across older adults from Chile and Ecuador. A 4-item version of the scale (excluding item 1) provides the basis for international comparisons of satisfaction with food-related life in older adults from developing
Benthic marine debris, with an emphasis on fishery-related items, surrounding Kodiak Island, Alaska, 1994-1996

Science.gov (United States)

Hess, N.A.; Ribic, C.A.; Vining, I.

1999-01-01

Composition and abundance of benthic marine debris were investigated during three bottom trawl surveys in inlet and offshore locations surrounding Kodiak Island, Alaska, 1994-1996. Debris items were primarily plastic and metal regardless of trawl location. Plastic bait jars, fishing line, and crab pots were the most common fishery-related debris items and were encountered in large amounts in inlets (20-25 items km⁻²), but were less abundant outside of inlets (4.5-11 items km⁻²). Overall density of debris was also significantly greater in inlets than outside of inlets. Plastic debris densities ranged from 22 to 31.5 items km⁻² in inlets and from 7.8 to 18.8 items km⁻² outside of inlets. Trawls in inlets contained almost as much metal debris as plastic debris. Density of metal debris ranged from 21.2 to 23.7 items km⁻² in inlets, with a maximum of 2.7 items km⁻² outside of inlets. Inlets around the town of Kodiak had the highest densities of fishery-related and total benthic debris. Differences in benthic debris density between inlets and outside of inlets and differences by area may be due to differences in fishing activity and water circulation patterns. At the current reduced levels of fishing activity, however, yearly monitoring of benthic debris appears unnecessary. Copyright (C) 1999.
Item response theory scoring and the detection of curvilinear relationships.

Science.gov (United States)

Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

2017-03-01

Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only to items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Validation of a 15-item care-related regret coping scale for health-care professionals (RCS-HCP).

Science.gov (United States)

Courvoisier, Delphine Sophie; Cullati, Stephane; Ouchi, Rieko; Schmidt, Ralph Eric; Haller, Guy; Chopard, Pierre; Agoritsas, Thomas; Perneger, Thomas V

2014-01-01

Coping with difficult care-related situations is a common challenge for health-care professionals. How these professionals deal with the regrets they may experience following one of the many decisions and interventions they must make every day can have an impact on their own health and quality of life, and also on their patient care practices. To identify professionals most in need of extra support, development and validation of a tool measuring coping style are needed. We performed a survey of physicians and nurses of a French-speaking University hospital; 469 health-care professionals responded to the survey, and 175 responded to the same survey one month later. Regret was assessed with the regret coping scale developed for this study and with self-report questions on the frequency of regretted situations and the intensity of regret. Construct validity was assessed using measures of health-care professionals' quality of life (including job and life satisfaction, and self-reported health) as well as sleep problems and depression. Based on factor analysis and item response analysis, the initial 31-item scale was shortened to 15 items, which measured three types of strategies: problem-focused strategies (i.e., trying to find solutions, talking to colleagues) and two types of emotion-focused strategies, A (i.e., self-blame, rumination) and B (e.g., acceptance, emotional distance). All subscales showed high internal consistency (α >0.85). Overall, as expected, problem-focused and emotion-focused B strategies correlated with higher quality of life, fewer sleep problems and less depression, and emotion-focused A strategies showed the opposite pattern. The regret coping scale (RCS-HCP) is a valid and reliable measure of coping abilities of hospital-based health-care professionals.
ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

African Journals Online (AJOL)

Global Journal

Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.
Using an FSDS-R Item to Screen for Sexually Related Distress: A MsFLASH Analysis

Directory of Open Access Journals (Sweden)

Janet S. Carpenter, PhD, RN, FAAN

2015-03-01

Conclusions: A single FSDS-R item may be a useful screening tool to quickly identify midlife women with sexually related distress when it is not feasible to administer the entire scale, though further validation is warranted. Carpenter JS, Reed SD, Guthrie KA, Larson JC, Newton KM, Lau RJ, Learman LA, and Shifren JL. Using an FSDS-R item to screen for sexually related distress: A MsFLASH analysis. Sex Med 2015;3:7–13.
Work-related stress assessed by a text message single-item stress question.

Science.gov (United States)

Arapovic-Johansson, B; Wåhlin, C; Kwak, L; Björklund, C; Jensen, I

2017-12-02

Given the prevalence of work stress-related ill-health in the Western world, it is important to find cost-effective, easy-to-use and valid measures which can be used both in research and in practice. To examine the validity and reliability of the single-item stress question (SISQ), distributed weekly by short message service (SMS) and used for measurement of work-related stress. The convergent validity was assessed through associations between the SISQ and subscales of the Job Demand-Control-Support model, the Effort-Reward Imbalance model and scales measuring depression, exhaustion and sleep. The predictive validity was assessed using SISQ data collected through SMS. The reliability was analysed by the test-retest procedure. Correlations between the SISQ and all the subscales except for job strain and esteem reward were significant, ranging from -0.186 to 0.627. The SISQ could also predict sick leave, depression and exhaustion at 12-month follow-up. The analysis on reliability revealed a satisfactory stability with a weighted kappa between 0.804 and 0.868. The SISQ, administered through SMS, can be used for the screening of stress levels in a working population. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com
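Test-retest stability of an ordinal single item is summarized here with a weighted kappa. The sketch below shows one common form of that statistic, using linear disagreement weights; the abstract does not say which weighting scheme was used, so the weights, the function name, and the ratings are assumptions for illustration.

```python
def weighted_kappa(first, second, n_categories):
    """Linear-weighted kappa for two ratings coded 0 .. n_categories - 1."""
    n = len(first)
    k = n_categories

    def weight(i, j):
        # Disagreement weight: 0 for identical ratings, 1 for the extremes.
        return abs(i - j) / (k - 1)

    # Observed weighted disagreement across paired ratings.
    observed = sum(weight(x, y) for x, y in zip(first, second)) / n

    # Expected weighted disagreement if the two occasions were independent.
    p_first = [first.count(c) / n for c in range(k)]
    p_second = [second.count(c) / n for c in range(k)]
    expected = sum(weight(i, j) * p_first[i] * p_second[j]
                   for i in range(k) for j in range(k))

    # Note: undefined (division by zero) if every rating falls in one category.
    return 1 - observed / expected


# Hypothetical test and retest answers on a 3-point scale.
test_ratings = [0, 1, 2, 2]
retest_ratings = [0, 1, 2, 1]
kappa = weighted_kappa(test_ratings, retest_ratings, 3)
```

Unlike unweighted kappa, a near miss (adjacent categories) is penalized less than a disagreement between the extremes of the scale, which suits an ordinal stress rating.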
Development of an instrument to measure behavioral health function for work disability: item pool construction and factor analysis.

Science.gov (United States)

Marfeo, Elizabeth E; Ni, Pengsheng; Haley, Stephen M; Jette, Alan M; Bogusz, Kara; Meterko, Mark; McDonough, Christine M; Chan, Leighton; Brandt, Diane E; Rasch, Elizabeth K

2013-09-01

Objective: To develop a broad set of claimant-reported items to assess behavioral health functioning relevant to the Social Security disability determination processes, and to evaluate the underlying structure of behavioral health functioning for use in development of a new functional assessment instrument. Design: Cross-sectional. Setting: Community. Participants: Item pools of behavioral health functioning were developed, refined, and field tested in a sample of persons applying for Social Security disability benefits (N=1015) who reported difficulties working because of mental or both mental and physical conditions. Interventions: None. Main Outcome Measure: Social Security Administration Behavioral Health (SSA-BH) measurement instrument. Results: Confirmatory factor analysis (CFA) specified that a 4-factor model (self-efficacy, mood and emotions, behavioral control, social interactions) had the optimal fit with the data and was also consistent with our hypothesized conceptual framework for characterizing behavioral health functioning. When the items within each of the 4 scales were tested in CFA, the fit statistics indicated adequate support for characterizing behavioral health as a unidimensional construct along these 4 distinct scales of function. Conclusions: This work represents a significant advance both conceptually and psychometrically in assessment methodologies for work-related behavioral health. The measurement of behavioral health functioning relevant to the context of work requires the assessment of multiple dimensions of behavioral health functioning. Specifically, we identified a 4-factor model solution that represented key domains of work-related behavioral health functioning. These results guided the development and scale formation of a new SSA-BH instrument. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Qualitative Development and Content Validation of the PROMIS Pediatric Sleep Health Items.

Science.gov (United States)

Bevans, Katherine B; Meltzer, Lisa J; De La Motte, Anna; Kratchman, Amy; Viél, Dominique; Forrest, Christopher B

2018-04-25

To develop the Patient Reported Outcome Measurement Information System (PROMIS) Pediatric Sleep Health item pool and evaluate its content validity. Participants included 8 expert sleep clinician-researchers, 64 children ages 8-17 years, and 54 parents of children ages 5-17 years. We started with item concepts and expressions from the PROMIS Sleep Disturbance and Sleep Related Impairment adult measures. Additional pediatric sleep health concepts were generated by expert (n = 8), child (n = 28), and parent (n = 33) concept elicitation interviews and a systematic review of existing pediatric sleep health questionnaires. Content validity of the item pool was evaluated with item translatability review, readability analysis, and child (n = 36) and parent (n = 21) cognitive interviews. The final pediatric Sleep Health item pool includes 43 items that assess sleep disturbance (children's capacity to fall and stay asleep, sleep quality, dreams, and parasomnias) and sleep-related impairments (daytime sleepiness, low energy, difficulty waking up, and the impact of sleep and sleepiness on cognition, affect, behavior, and daily activities). Items are translatable and relevant and well understood by children ages 8-17 and parents of children ages 5-17. Rigorous qualitative procedures were used to develop and evaluate the content validity of the PROMIS Pediatric Sleep Health item pool. Once the item pool's psychometric properties are established, the scales will be useful for measuring children's subjective experiences of sleep.
Face Validity of the Single Work Ability Item: Comparison with Objectively Measured Heart Rate Reserve over Several Days

Science.gov (United States)

Gupta, Nidhi; Jensen, Bjørn Søvsø; Søgaard, Karen; Carneiro, Isabella Gomes; Christiansen, Caroline Stordal; Hanisch, Christiana; Holtermann, Andreas

2014-01-01

Purpose: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. Methods: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18–65 years from the cross-sectional “New method for Objective Measurements of physical Activity in Daily living (NOMAD)” study. The workers reported their single item work ability and completed an aerobic capacity cycling test and objective measurements of heart rate reserve monitored with Actiheart for 3–4 days with a total of 5,810 h, including 2,640 working hours. Results: A significant moderate correlation between work ability and %HRR was observed among males (R = −0.33, P = 0.005), but not among females (R = 0.11, P = 0.431). In a gender-stratified multi-adjusted logistic regression analysis, males with high %HRR were more likely to report a reduced work ability compared to males with low %HRR [OR = 4.75, 95% confidence interval (95% CI) = 1.31 to 17.25]. However, this association was not found among females (OR = 0.26, 95% CI 0.03 to 2.16), and a significant interaction between work ability, %HRR and gender was observed (P = 0.03). Conclusions: The observed association between work ability and objectively measured %HRR over several days among male blue-collar workers supports the face validity of the single work ability item. It is a useful and valid measure of the relation between physical work demands and resources among male blue-collar workers. The contrasting association among females needs to be further investigated. PMID:24840350
Face Validity of the Single Work Ability Item: Comparison with Objectively Measured Heart Rate Reserve over Several Days

Directory of Open Access Journals (Sweden)

Nidhi Gupta

2014-05-01

Purpose: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. Methods: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18–65 years from the cross-sectional “New method for Objective Measurements of physical Activity in Daily living (NOMAD)” study. The workers reported their single item work ability and completed an aerobic capacity cycling test and objective measurements of heart rate reserve monitored with Actiheart for 3–4 days with a total of 5,810 h, including 2,640 working hours. Results: A significant moderate correlation between work ability and %HRR was observed among males (R = −0.33, P = 0.005), but not among females (R = 0.11, P = 0.431). In a gender-stratified multi-adjusted logistic regression analysis, males with high %HRR were more likely to report a reduced work ability compared to males with low %HRR [OR = 4.75, 95% confidence interval (95% CI) = 1.31 to 17.25]. However, this association was not found among females (OR = 0.26, 95% CI 0.03 to 2.16), and a significant interaction between work ability, %HRR and gender was observed (P = 0.03). Conclusions: The observed association between work ability and objectively measured %HRR over several days among male blue-collar workers supports the face validity of the single work ability item. It is a useful and valid measure of the relation between physical work demands and resources among male blue-collar workers. The contrasting association among females needs to be further investigated.

Single-item measure for assessing quality of life in children with drug-resistant epilepsy.

Science.gov (United States)

Conway, Lauryn; Widjaja, Elysa; Smith, Mary Lou

2018-03-01

The current study investigated the psychometric properties of a single-item quality of life (QOL) measure, the Global Quality of Life in Childhood Epilepsy question (G-QOLCE), in children with drug-resistant epilepsy. Data came from the Impact of Pediatric Epilepsy Surgery on Health-Related Quality of Life Study (PESQOL), a multicenter prospective cohort study (n = 118) with observations collected at baseline and at 6 months of follow-up on children aged 4-18 years. QOL was measured with the QOLCE-76 and KIDSCREEN-27. The G-QOLCE was an overall QOL question derived from the QOLCE-76. Construct validity and reliability were assessed with Spearman's correlation and intraclass correlation coefficient (ICC). Responsiveness was examined through distribution-based and anchor-based methods. The G-QOLCE showed moderate (r ≥ 0.30) to strong (r ≥ 0.50) correlations with composite scores, and most subscales of the QOLCE-76 and KIDSCREEN-27 at baseline and 6-month follow-up. The G-QOLCE had moderate test-retest reliability (ICC range: 0.49-0.72) and was able to detect clinically important change in patients' QOL (standardized response mean: 0.38; probability of change: 0.65; Guyatt's responsiveness statistics: 0.62 and 0.78). Caregiver anxiety and family functioning contributed most strongly to G-QOLCE scores over time. Results offer promising preliminary evidence regarding the validity, reliability, and responsiveness of the proposed single-item QOL measure. The G-QOLCE is a potentially useful tool that can be feasibly administered in a busy clinical setting to evaluate clinical status and impact of treatment outcomes in pediatric epilepsy.
Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

Science.gov (United States)

Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

2015-07-01

The Geriatric Anxiety Scale (GAS; Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.
Items to be reflected to the nuclear power safety measures in Japan (concerning the examination, design and operation management) (excluding the items to be reflected to the standards)

Energy Technology Data Exchange (ETDEWEB)

1980-10-01

In connection with the Three Mile Island nuclear power accident in March, 1979, in the United States, in order to introduce the lessons from it in the nuclear power safety regulations in Japan, 52 items to be reflected to the nuclear power safety measures were chosen by the Nuclear Safety Commission. Of these, 16 items were examined by the Committee on Examination of Reactor Safety. It was decided that these results would be introduced in the nuclear safety regulations, by the Nuclear Safety Commission. The following 16 items are described. For the examination, four items concerning the automatic operation of safety systems and others; for the design, five items concerning a small rupture accident, the monitoring of the state of primary coolant, control room layout and others; for the operation management, seven items concerning the inspection at the time of repair, the prevention of faulty handlings by operators and others.
Assessing Impact, DIF, and DFF in Accommodated Item Scores: A Comparison of Multilevel Measurement Model Parameterizations

Science.gov (United States)

Beretvas, S. Natasha; Cawthon, Stephanie W.; Lockhart, L. Leland; Kaye, Alyssa D.

2012-01-01

This pedagogical article is intended to explain the similarities and differences between the parameterizations of two multilevel measurement model (MMM) frameworks. The conventional two-level MMM that includes item indicators and models item scores (Level 1) clustered within examinees (Level 2) and the two-level cross-classified MMM (in which item…
An item-response theory approach to safety climate measurement: The Liberty Mutual Safety Climate Short Scales.

Science.gov (United States)

Huang, Yueng-Hsiang; Lee, Jin; Chen, Zhuo; Perry, MacKenna; Cheung, Janelle H; Wang, Mo

2017-06-01

Zohar and Luria's (2005) safety climate (SC) scale, measuring organization- and group-level SC each with 16 items, is widely used in research and practice. To improve the utility of the SC scale, we shortened the original full-length SC scales. Item response theory (IRT) analysis was conducted using a sample of 29,179 frontline workers from various industries. Based on graded response models, we shortened the original scales in two ways: (1) selecting items with above-average discriminating ability (i.e. offering more than 6.25% of the original total scale information), resulting in 8-item organization-level and 11-item group-level SC scales; and (2) selecting the most informative items that together retain at least 30% of original scale information, resulting in 4-item organization-level and 4-item group-level SC scales. All four shortened scales had acceptable reliability (≥0.89) and high correlations (≥0.95) with the original scale scores. The shortened scales will be valuable for academic research and practical survey implementation in improving occupational safety. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
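The two shortening rules in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each item is summarized by a single non-negative information value (the study itself worked with graded response models), and the function names and the sample values are hypothetical. For a 16-item scale the "above-average" cutoff is 1/16 = 6.25% of total information, which matches the figure quoted in the abstract.

```python
def select_above_average(info):
    """Rule 1: keep items whose share of total information exceeds 1/n.

    `info` is a hypothetical list with one information value per item.
    """
    total = sum(info)
    threshold = 1.0 / len(info)  # 1/16 = 0.0625 for a 16-item scale
    return [i for i, v in enumerate(info) if v / total > threshold]


def select_most_informative(info, retain=0.30):
    """Rule 2: add items from most to least informative until the kept
    items together hold at least `retain` of the total information."""
    total = sum(info)
    order = sorted(range(len(info)), key=lambda i: info[i], reverse=True)
    chosen, kept = [], 0.0
    for i in order:
        if kept / total >= retain:
            break
        chosen.append(i)
        kept += info[i]
    return sorted(chosen)


# Made-up information values for a 16-item scale (not data from the study).
example_info = [1.0] * 8 + [3.0] * 8
above_average_items = select_above_average(example_info)
short_form_items = select_most_informative(example_info, retain=0.30)
```

Rule 1 depends only on the number of items, so the cutoff is fixed in advance; rule 2 depends on how unevenly information is spread across items, so the resulting short form can be much shorter.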
Using item response theory to measure extreme response style in marketing research

NARCIS (Netherlands)

de Jong, Martijn G.; Steenkamp, Jan-Benedict E.M.; Fox, Gerardus J.A.; Baumgartner, Hans

2008-01-01

Extreme response style (ERS) is an important threat to the validity of survey-based marketing research. In this article, the authors present a new item response theory–based model for measuring ERS. This model contributes to the ERS literature in two ways. First, the method improves on existing
Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Forced-Choice Assessment of Work-Related Maladaptive Personality Traits: Preliminary Evidence From an Application of Thurstonian Item Response Modeling.

Science.gov (United States)

Guenole, Nigel; Brown, Anna A; Cooper, Andrew J

2018-06-01

This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model's fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.
The Role of Medial Temporal Lobe Regions in Incidental and Intentional Retrieval of Item and Relational Information in Aging.

Science.gov (United States)

Wang, Wei-Chun; Giovanello, Kelly S

2016-06-01

Considerable neuropsychological and neuroimaging work indicates that the medial temporal lobes are critical for both item and relational memory retrieval. However, there remain outstanding issues in the literature, namely the extent to which medial temporal lobe regions are differentially recruited during incidental and intentional retrieval of item and relational information, and the extent to which aging may affect these neural substrates. The current fMRI study sought to address these questions; participants incidentally encoded word pairs embedded in sentences and incidental item and relational retrieval were assessed through speeded reading of intact, rearranged, and new word-pair sentences, while intentional item and relational retrieval were assessed through old/new associative recognition of a separate set of intact, rearranged, and new word pairs. Results indicated that, in both younger and older adults, anterior hippocampus and perirhinal cortex indexed incidental and intentional item retrieval in the same manner. In contrast, posterior hippocampus supported incidental and intentional relational retrieval in both age groups and an adjacent cluster in posterior hippocampus was recruited during both forms of relational retrieval for older, but not younger, adults. Our findings suggest that while medial temporal lobe regions do not differentiate between incidental and intentional forms of retrieval, there are distinct roles for anterior and posterior medial temporal lobe regions during retrieval of item and relational information, respectively, and further indicate that posterior regions may, under certain conditions, be over-recruited in healthy aging. © 2016 Wiley Periodicals, Inc.
Assessing cross-cultural item bias in questionnaires: Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

OpenAIRE

Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

2001-01-01

A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one, that measure the same theoretical construct. The standardized difference between the score means of the item versions, called the ?e score, gives an indication of the cultural bias of the item. As...
Individuals with knee impairments identify items in need of clarification in the Patient Reported Outcomes Measurement Information System (PROMIS®) pain interference and physical function item banks - a qualitative study.

Science.gov (United States)

Lynch, Andrew D; Dodds, Nathan E; Yu, Lan; Pilkonis, Paul A; Irrgang, James J

2016-05-11

The content and wording of the Patient Reported Outcome Measurement Information System (PROMIS) Physical Function and Pain Interference item banks have not been qualitatively assessed by individuals with knee joint impairments. The purpose of this investigation was to identify items in the PROMIS Physical Function and Pain Interference Item Banks that are irrelevant, unclear, or otherwise difficult to respond to for individuals with impairment of the knee and to suggest modifications based on cognitive interviews. Twenty-nine individuals with knee joint impairments qualitatively assessed items in the Pain Interference and Physical Function Item Banks in a mixed-methods cognitive interview. Field notes were analyzed to identify themes and frequency counts were calculated to identify items not relevant to individuals with knee joint impairments. Issues with clarity were identified in 23 items in the Physical Function Item Bank, resulting in the creation of 43 new or modified items, typically changing words within the item to be clearer. Interpretation issues included whether or not the knee joint played a significant role in overall health and age/gender differences in items. One quarter of the original items (31 of 124) in the Physical Function Item Bank were identified as irrelevant to the knee joint. All 41 items in the Pain Interference Item Bank were identified as clear, although individuals without significant pain substituted other symptoms which interfered with their life. The Physical Function Item Bank would benefit from additional items that are relevant to individuals with knee joint impairments and, by extension, to other lower extremity impairments. Several issues in clarity were identified that are likely to be present in other patient cohorts as well.
Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

Science.gov (United States)

Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

2016-01-01

High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…
Items to be reflected to the nuclear power safety measures in Japan (concerning the examination, design and operation management) (excluding the items to be reflected to the standards)

International Nuclear Information System (INIS)

1980-01-01

In connection with the Three Mile Island nuclear power accident in March, 1979, in the United States, in order to introduce the lessons from it in the nuclear power safety regulations in Japan, 52 items to be reflected to the nuclear power safety measures were chosen by the Nuclear Safety Commission. Of these, 16 items were examined by the Committee on Examination of Reactor Safety. It was decided that these results would be introduced in the nuclear safety regulations, by the Nuclear Safety Commission. The following 16 items are described. For the examination, four items concerning the automatic operation of safety systems and others; for the design, five items concerning a small rupture accident, the monitoring of the state of primary coolant, control room layout and others; for the operation management, seven items concerning the inspection at the time of repair, the prevention of faulty handlings by operators and others. (J.P.N.)
Racial differences in hypertension knowledge: effects of differential item functioning.

Science.gov (United States)

Ayotte, Brian J; Trivedi, Ranak; Bosworth, Hayden B

2009-01-01

Health-related knowledge is an important component in the self-management of chronic illnesses. The objective of this study was to more accurately assess racial differences in hypertension knowledge by using a latent variable modeling approach that controlled for sociodemographic factors and accounted for measurement issues in the assessment of hypertension knowledge. Cross-sectional data from 1,177 participants (45% African American; 35% female) were analyzed using a multiple indicator multiple causes (MIMIC) modeling approach. Available sociodemographic data included race, education, sex, financial status, and age. All participants completed six items on a hypertension knowledge questionnaire. Overall, the final model suggested that females, Whites, and patients with at least a high school diploma had higher latent knowledge scores than males, African Americans, and patients with less than a high school diploma, respectively. The model also detected differential item functioning (DIF) based on race for two of the items. Specifically, the error rate for African Americans was lower than would be expected given the lower level of latent knowledge on the items, on the questions related to: (a) the association between high blood pressure and kidney disease, and (b) the increased risk African Americans have for developing hypertension. Not accounting for DIF resulted in the difference between Whites and African Americans being underestimated. These results are discussed in the context of the need for careful measurement of health-related constructs, and how measurement-related issues can result in an inaccurate estimation of racial differences in hypertension knowledge.
An Item Bank for Abuse of Prescription Pain Medication from the Patient-Reported Outcomes Measurement Information System (PROMIS®).

Science.gov (United States)

Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Hilton, Thomas F; Daley, Dennis C; Patkar, Ashwin A; McCarty, Dennis

2017-08-01

There is a need to monitor patients receiving prescription opioids to detect possible signs of abuse. To address this need, we developed and calibrated an item bank for severity of abuse of prescription pain medication as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 5,310 items relevant to substance use and abuse, including abuse of prescription pain medication, from over 80 unique instruments. After qualitative item analysis (i.e., focus groups, cognitive interviewing, expert review, and item revision), 25 items for abuse of prescribed pain medication were included in field testing. Items were written in a first-person, past-tense format, with a three-month time frame and five response options reflecting frequency or severity. The calibration sample included 448 respondents, 367 from the general population (ascertained through an internet panel) and 81 from community treatment programs participating in the National Drug Abuse Treatment Clinical Trials Network. A final bank of 22 items was calibrated using the two-parameter graded response model from item response theory. A seven-item static short form was also developed. The test information curve showed that the PROMIS® item bank for abuse of prescription pain medication provided substantial information in a broad range of severity. The initial psychometric characteristics of the item bank support its use as a computerized adaptive test or short form, with either version providing a brief, precise, and efficient measure relevant to both clinical and community samples. © 2016 American Academy of Pain Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

Directory of Open Access Journals (Sweden)

Fernandez Ana

2010-05-01

Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to know whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
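The kernel-smoothing idea named in the abstract above can be illustrated with a small Python sketch. It is not the TestGraf implementation: it assumes a Nadaraya-Watson estimator with a Gaussian kernel, a trait estimate already available for each respondent, and a binary (0/1) item response. The function name, the bandwidth, and the sample values are hypothetical.

```python
import math


def kernel_smoothed_curve(theta, responses, grid, bandwidth=0.5):
    """Estimate an item characteristic curve without a parametric model.

    theta     : trait estimate per respondent (assumed to be given)
    responses : 0/1 response of each respondent to one item
    grid      : trait levels at which the curve is evaluated
    Returns the kernel-weighted proportion endorsing the item at each
    grid point (Nadaraya-Watson estimator, Gaussian kernel).
    """
    curve = []
    for q in grid:
        # Respondents whose trait estimate is close to q get more weight.
        weights = [math.exp(-0.5 * ((t - q) / bandwidth) ** 2) for t in theta]
        total = sum(weights)
        curve.append(sum(w * y for w, y in zip(weights, responses)) / total)
    return curve


# Made-up data: five respondents ordered by trait level.
example_theta = [-2.0, -1.0, 0.0, 1.0, 2.0]
example_responses = [0, 0, 0, 1, 1]
example_curve = kernel_smoothed_curve(
    example_theta, example_responses, grid=[-2.0, 0.0, 2.0]
)
```

An item that discriminates well yields a curve that rises steadily with the trait level. Applying the same smoother to an indicator for each response option would give option characteristic curves, and a smaller bandwidth follows the data more closely at the cost of a noisier curve.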
A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

Science.gov (United States)

Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

2018-04-10

To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.
The practical impact of differential item functioning analyses in a health-related quality of life instrument

DEFF Research Database (Denmark)

Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

2009-01-01

Differential item functioning (DIF) analyses are commonly used to evaluate health-related quality of life (HRQoL) instruments. There is, however, a lack of consensus as to how to assess the practical impact of statistically significant DIF results.

Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

Directory of Open Access Journals (Sweden)

Gideon P. De Bruin

2004-10-01

Full Text Available The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made.
Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

Science.gov (United States)

Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

2015-05-01

To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. Spinal Cord Injury--Quality of Life (SCI-QOL) Depression Item Bank Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.
Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

Science.gov (United States)

Gordon, Rachel A.

2015-01-01

This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its…
Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

Science.gov (United States)

Aybek, Eren Can; Demirtasli, R. Nukhet

2017-01-01

This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
The Spanish version of the Self-Determination Inventory Student Report: application of item response theory to self-determination measurement.

Science.gov (United States)

Mumbardó-Adam, C; Guàrdia-Olmos, J; Giné, C; Raley, S K; Shogren, K A

2018-04-01

A new measure of self-determination, the Self-Determination Inventory: Student Report (Spanish version), has recently been adapted and empirically validated in Spanish language. As it is the first instrument intended to measure self-determination in youth with and without disabilities, there is a need to further explore and strengthen its psychometric analysis based on item response patterns. Through item response theory approach, this study examined item observed distributions across the essential characteristics of self-determination. The results demonstrated satisfactory to excellent item functioning patterns across characteristics, particularly within agentic action domains. Increased variability across items was also found within action-control beliefs dimensions, specifically within the self-realisation subdomain. These findings further support the instrument's psychometric properties and outline future research directions. © 2017 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Maslach Burnout Inventory and a Self-Defined, Single-Item Burnout Measure Produce Different Clinician and Staff Burnout Estimates.

Science.gov (United States)

Knox, Margae; Willard-Grace, Rachel; Huang, Beatrice; Grumbach, Kevin

2018-06-04

Clinicians and healthcare staff report high levels of burnout. Two common burnout assessments are the Maslach Burnout Inventory (MBI) and a single-item, self-defined burnout measure. Relatively little is known about how the measures compare. To identify the sensitivity, specificity, and concurrent validity of the self-defined burnout measure compared to the more established MBI measure. Cross-sectional survey (November 2016-January 2017). Four hundred forty-four primary care clinicians and 606 staff from three San Francisco Area healthcare systems. The MBI measure, calculated from a high score on either the emotional exhaustion or cynicism subscale, and a single-item measure of self-defined burnout. Concurrent validity was assessed using a validated, 7-item team culture scale as reported by Willard-Grace et al. (J Am Board Fam Med 27(2):229-38, 2014) and a standard question about workplace atmosphere as reported by Rassolian et al. (JAMA Intern Med 177(7):1036-8, 2017) and Linzer et al. (Ann Intern Med 151(1):28-36, 2009). Similar to other nationally representative burnout estimates, 52% of clinicians (95% CI: 47-57%) and 46% of staff (95% CI: 42-50%) reported high MBI emotional exhaustion or high MBI cynicism. In contrast, 29% of clinicians (95% CI: 25-33%) and 31% of staff (95% CI: 28-35%) reported "definitely burning out" or more severe symptoms on the self-defined burnout measure. The self-defined measure's sensitivity to correctly identify MBI-assessed burnout was 50.4% for clinicians and 58.6% for staff; specificity was 94.7% for clinicians and 92.3% for staff. Area under the receiver operator curve was 0.82 for clinicians and 0.81 for staff. Team culture and atmosphere were significantly associated with both self-defined burnout and the MBI, confirming concurrent validity. Point estimates of burnout notably differ between the self-defined and MBI measures. Compared to the MBI, the self-defined burnout measure misses half of high-burnout clinicians and more
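Sensitivity and specificity figures such as those in the abstract above come from cross-tabulating a screening measure against a reference standard. The Python sketch below shows the arithmetic only; the function name is illustrative and the counts are hypothetical, not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) of a screening measure.

    tp: reference-positive cases the screen flags
    fn: reference-positive cases the screen misses
    tn: reference-negative cases the screen clears
    fp: reference-negative cases the screen flags
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of non-cases that are correctly cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=50, fn=50, tn=95, fp=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A sensitivity near 0.5 combined with a specificity near 0.95 is the pattern the abstract reports: the single-item measure rarely raises false alarms but misses about half of the cases the reference measure identifies.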
The Blood Donor Anxiety Scale: a six-item state anxiety measure based on the Spielberger State-Trait Anxiety Inventory.

Science.gov (United States)

Chell, Kathleen; Waller, Daniel; Masser, Barbara

2016-06-01

Research demonstrates that anxiety elevates the risk of blood donors experiencing adverse events, which in turn deters the performance of repeat blood donations. Identifying donors suffering from heightened state anxiety is important to assess the impact of evidence-based interventions. This study analyzed the appropriateness of a shortened version of the state subscale of the State-Trait Anxiety Inventory (STAI) in a blood donation context. STAI-State questionnaire data were collected from two separate samples of Australian blood donors (n = 919 and n = 824 after cleaning). Responses to demographic, donation history, and adverse reaction questions were also obtained. Identification of items and analysis was performed systematically to assess and compare internal reliability and content, construct, convergent, and criterion validity of three potential short-form state anxiety scales. Of the three short-form scales tested, the STAI-State six-item scale demonstrated the best metric properties with the fewest items across both sample groups. Cronbach's alpha was acceptable (α = 0.844 and α = 0.820); the scale correlated positively with the original measure (r = 0.927 and r = 0.931) and with criterion-related variables, and it maintained the two-dimension factorial structure of the original measure. The six-item short version of the STAI-State subscale presented the most reliable and valid scale for use with blood donors. A validated donor anxiety tool provides a standardized assessment and record of donor anxiety to gauge the effectiveness of ongoing efforts to enhance the donation experience. © 2016 AABB.
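Cronbach's alpha, the internal-reliability statistic reported in the abstract above, compares the sum of the item variances with the variance of the total score. The Python sketch below uses the standard formula; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed scale score
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses for illustration only: 5 respondents, 3 items.
data = [[3, 4, 3], [2, 2, 3], [4, 4, 4], [1, 2, 1], [3, 3, 2]]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why a short form with fewer items can still reach values above 0.8 if the retained items are strongly related.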
Item response theory

Directory of Open Access Journals (Sweden)

Eutalia Aparecida Candido de Araujo

2009-12-01

Full Text Available The concern with measuring psychological traits is long-standing, and many studies and methods have been developed to this end. Among these methods, Item Response Theory (IRT) stands out; it originally arose to address limitations of Classical Test Theory, which is still widely used today in the measurement of psychological traits. The main point of IRT is that it considers each item individually rather than emphasizing total scores; therefore, the findings depend not only on the test or questionnaire but on each item that composes it. This article aims to present this theory, which revolutionized measurement theory.
Using Linear Equating to Map PROMIS(®) Global Health Items and the PROMIS-29 V2.0 Profile Measure to the Health Utilities Index Mark 3.

Science.gov (United States)

Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David

2016-10-01

Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS(®)) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS(®) global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48 % (global health items), 61 % (PROMIS-29 V2.0 scales), and 64 % (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS(®) global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS(®) health measures can be used for economic applications and as a measure of overall HR-QOL in research.
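The equating step mentioned in the abstract above can be sketched as follows, assuming the usual mean-and-standard-deviation form of linear equating: scores on one scale are rescaled so that their mean and standard deviation match those of the target scale. The function name and the scores are hypothetical, and the study's own procedure may differ in detail.

```python
from statistics import mean, stdev


def linear_equate(x_scores, y_scores):
    """Return a function mapping an X-scale score onto the Y scale.

    The mapping is y = mean_y + (sd_y / sd_x) * (x - mean_x), so the
    equated X scores share the mean and SD of the observed Y scores.
    """
    mx, my = mean(x_scores), mean(y_scores)
    sx, sy = stdev(x_scores), stdev(y_scores)
    if sx == 0:
        raise ValueError("x_scores have zero variance")
    slope = sy / sx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


# Hypothetical scores for illustration only.
predicted = [0.55, 0.60, 0.70, 0.75]  # e.g. regression-predicted scores
observed = [0.20, 0.50, 0.80, 0.90]   # e.g. observed target-scale scores
to_target = linear_equate(predicted, observed)
equated = [to_target(x) for x in predicted]
print([round(v, 3) for v in equated])
```

Regression predictions are pulled toward the mean and so vary less than the observed scores; rescaling them to the observed mean and SD is one way to counter that shrinkage.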
Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

DEFF Research Database (Denmark)

Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

2010-01-01

Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....
The Relative Importance of Persons, Items, Subtests, and Languages to TOEFL Test Variance.

Science.gov (United States)

Brown, James Dean

1999-01-01

Explored the relative contributions to Test of English as a Foreign Language (TOEFL) score dependability of various numbers of persons, items, subtests, languages, and their various interactions. Sampled 15,000 test takers, 1000 each from 15 different language backgrounds. (Author/VWL)
Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

Science.gov (United States)

Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

2018-02-02

In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Function System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.
Cognitive interviewing methodology in the development of a pediatric item bank: a patient reported outcomes measurement information system (PROMIS) study

Directory of Open Access Journals (Sweden)

DeWalt Darren A

2009-01-01

Full Text Available Abstract Background The evaluation of patient-reported outcomes (PROs) in health care has seen greater use in recent years, and methods to improve the reliability and validity of PRO instruments are advancing. This paper discusses the cognitive interviewing procedures employed by the Patient Reported Outcomes Measurement Information System (PROMIS) pediatrics group for the purpose of developing a dynamic, electronic item bank for field testing with children and adolescents using novel computer technology. The primary objective of this study was to conduct cognitive interviews with children and adolescents to gain feedback on items measuring physical functioning, emotional health, social health, fatigue, pain, and asthma-specific symptoms. Methods A total of 88 cognitive interviews were conducted with 77 children and adolescents across two sites on 318 items. From this initial item bank, 25 items were deleted and 35 were revised and underwent a second round of cognitive interviews. A total of 293 items were retained for field testing. Results Children as young as 8 years of age were able to comprehend the majority of items, response options, directions, and recall period, and to identify problems with language that was difficult for them to understand. Cognitive interviews indicated issues with item comprehension on several items, which led to alternative wording for these items. Conclusion Children ages 8–17 years were able to comprehend most item stems and response options in the present study. Field testing with the resulting items and response options is presently being conducted as part of the PROMIS Pediatric Item Bank development process.
Application of the Commercial Grade Item (CGI) Dedication Process for Procurement of Nuclear Safety Related Items at Nuclear Power Plant Krsko (NEK)

International Nuclear Information System (INIS)

Heruc, Z.; Pozar, J.

1998-01-01

CGI procurement is a process whereby parts are bought without imposing Appendix B Quality Assurance requirements on the supplier, and then dedicated for use in safety-related applications. The dedication process involves 1) based upon the required safety function, an engineering evaluation to identify critical characteristics of the item and specify acceptance criteria; and 2) quality control activities to ensure the item(s) supplied meet the acceptance criteria specified. CGI Dedication supports the supply of certified components/parts for plant operation in an environment where the number of nuclear qualified suppliers is diminishing. It requires a more active role of the plant personnel, therefore presenting an additional burden on human resources, but at the same time it increases technical know-how and improves confidence in the test and inspection data presented in the certificates. Very often it is also cost beneficial. This paper is a continuation of last year's presentation of the introduction of this method into NEK's procurement process and presents the current approach and some practical examples. (author)
Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form.

Science.gov (United States)

Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W

2015-05-01

To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible, with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.
Three controversies over item disclosure in medical licensure examinations

Directory of Open Access Journals (Sweden)

Yoon Soo Park

2015-09-01

Full Text Available In response to views on the public's right to know, there is growing attention to item disclosure (release of items, answer keys, and performance data to the public) in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations: (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure, by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers' right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration.
An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

DEFF Research Database (Denmark)

Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

2016-01-01

of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...... that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...
Does item overlap render measured relationships between pain and challenging behaviour trivial? Results from a multicentre cross-sectional study in 13 German nursing homes.

Science.gov (United States)

Kutschar, Patrick; Bauer, Zsuzsa; Gnass, Irmela; Osterbrink, Jürgen

2017-07-01

Several studies suggest that pain is a trigger for challenging behaviour in older adults with cognitive impairment. However, such measured relationships might be confounded due to item overlap as instruments share similar or identical items. The purpose of this study was to examine whether the frequently observed association between pain and challenging behaviour might be traced back to item overlap. This multicentre cross-sectional study was conducted in 13 nursing homes and examined pain (measure: Pain Assessment in Advanced Dementia Scale) and challenging behaviour (measure: Cohen-Mansfield Agitation Inventory) in 150 residents with severe cognitive impairment. The extent of item overlap was determined by juxtaposition of both measures' original items. As expected, comparison between these instruments revealed an extensive item overlap. The statistical relationship between the two phenomena can be traced back mainly to the contribution of the overlapping items, which renders the frequently stated relationship between pain and challenging behaviour trivial. The status quo of measuring such associations must be contested: constructs' discrimination and instruments' discrimination have to be discussed critically as item overlap may lead to biased conclusions and assumptions in research as well as to inadequate care measures in nursing practice. © 2017 John Wiley & Sons Ltd.

Development of a scale to measure the entrepreneurial potential using the Item Response Theory (IRT)

Directory of Open Access Journals (Sweden)

Luciano Ricardo Rath Alves

2011-01-01

Full Text Available Several variables are related to the development of entrepreneurial activity, and an important one among them is the entrepreneurial agent. Among the studies that contribute to understanding this agent, the present one follows the line of thought that the entrepreneur has characteristics and personality traits that stand out from the general population and that favor entrepreneurial success. This study aims to develop a scale for measuring entrepreneurial potential using Item Response Theory. The items were generated by Santos (2008) based on a theoretical model ... The two-parameter logistic IRT model was used. Parameter estimates were obtained from a sample of 764 people who responded to an instrument composed of 103 items. The test information and standard error curves, together with a qualitative interpretation of the scale levels, made it possible to determine the most appropriate range for using the instrument. The results showed that the scale is best suited to assessing individuals with low to moderately high entrepreneurial potential. It is therefore suggested that new items be added to the instrument to measure and interpret even higher levels. Item Response Theory allows new items to be calibrated to measure entrepreneurs with high entrepreneurial potential while making use of the data already obtained.
Item analysis of single-peaked response data : the psychometric evaluation of bipolar measurement scales

NARCIS (Netherlands)

Polak, Maaike Geertruida

2011-01-01

The thesis explains the fundamental difference between unipolar and bipolar measurement scales for psychological characteristics. We explore the use of correspondence analysis (CA), a technique that is similar to principal component analysis and is available in SAS and SPSS, to select items that
Spare Items validation

International Nuclear Information System (INIS)

Fernandez Carratala, L.

1998-01-01

It is increasingly difficult to purchase safety-related spare items with manufacturer certifications that maintain the original qualifications of the equipment in which they are to be installed. The main reasons are, on top of the logical evolution of technology applied to newly manufactured components, the discontinuation of nuclear-specific production lines and the evolution of manufacturers' quality systems, originally based on nuclear codes and standards, towards conventional industry standards. To face this problem, different dedication processes have been implemented for many years to verify whether a commercial grade element is acceptable for use in safety-related applications. In the same way, due to our particular position regarding spare part supplies, mainly from markets other than the American one, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are performed on the spare item itself, its design control, its fabrication and its supply, to allow its use in destinations with specific requirements. The scope of application is not limited to safety-related items but also covers components of complex design, high cost, or relevance to plant reliability. The implementation in C.N. Trillo has been mainly carried out by merging, modifying and making the most of processes and activities which were already being performed in the company. (Author)
IDENTIFICATION OF MEASUREMENT ITEMS OF DESIGN REQUIREMENTS FOR LEAN AND AGILE SUPPLY CHAIN-CONFIRMATORY FACTOR ANALYSIS

Directory of Open Access Journals (Sweden)

D.Venkata Ramana

2013-06-01

This study uses confirmatory factor analysis to examine the construct validity, convergent validity, construct reliability and internal consistency of the items of strategic design requirements. The design requirements include use of information technology, sourcing procedures, new product development, flexible manufacturing functions and demand management supply chain network design, management, commitment and inventory management policies among manufacturers of volatile and unforeseeable products in Andhra Pradesh, India. The study suggests that the seven-factor model with 20 items of the leagile supply chain design requirements had a good fit. Further, the study showed a valid and reliable measurement to identify critical items among the design requirements of leagile supply chains.
The impact of item order on ratings of cancer risk perception.

Science.gov (United States)

Taylor, Kathryn L; Shelby, Rebecca A; Schwartz, Marc D; Ackerman, Josh; LaSalle, V Holland; Gelmann, Edward P; McGuire, Colleen

2002-07-01

Although perceived risk is central to most theories of health behavior, there is little consensus on its measurement with regard to item wording, response set, or the number of items to include. In a methodological assessment of perceived risk, we assessed the impact of changing the order of three commonly used perceived risk items: quantitative personal risk, quantitative population risk, and comparative risk. Participants were 432 men and women enrolled in an ancillary study of the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. Three groups of consecutively enrolled participants responded to the three items in one of three question orders. Results indicated that item order was related to the perceived risk ratings of both ovarian (P Perceptions of risk were significantly lower when the comparative rating was made first. The findings suggest that compelling participants to consider their own risk relative to the risk of others results in lower ratings of perceived risk. Although the use of multiple items may provide more information than when only a single method is used, different conclusions may be reached depending on the context in which an item is assessed.
Science Library of Test Items. Volume Twenty-One. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 2.

Science.gov (United States)

New South Wales Dept. of Education, Sydney (Australia).

As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Analyzing force concept inventory with item response theory

Science.gov (United States)

Wang, Jing; Bao, Lei

2010-10-01

Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.
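The three item parameters named in the abstract (difficulty, discrimination, and probability of a correct guess) are the parameters of the three-parameter logistic (3PL) model that is standard in item response theory. The sketch below assumes that standard 3PL form; the function name and the parameter values are illustrative only and are not taken from the FCI analysis.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct answer under the 3PL model.

    theta: student ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (chance of answering correctly by guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: five answer options, so a blind guess succeeds
# about 20% of the time; difficulty is placed at average ability.
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_correct_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

When ability equals the item difficulty, the probability lies halfway between the guessing floor and 1, which is why the curve never drops below c even for very low ability.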
Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

Science.gov (United States)

Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

2015-06-01

This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.
Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

Science.gov (United States)

Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

2012-01-01

Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…
Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

Science.gov (United States)

Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

2013-09-01

We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.
Development and validation of an item response theory-based Social Responsiveness Scale short form.

Science.gov (United States)

Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T

2017-09-01

Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.
The child play behavior and activity questionnaire: a parent-report measure of childhood gender-related behavior in China.

Science.gov (United States)

Yu, Lu; Winter, Sam; Xie, Dong

2010-06-01

Boys and girls establish relatively stable gender stereotyped behavior patterns by middle childhood. Parent-report questionnaires measuring children's gender-related behavior enable researchers to conduct large-scale screenings of community samples of children. For school-aged children, two parent-report instruments, the Child Game Participation Questionnaire (CGPQ) and the Child Behavior and Attitude Questionnaire (CBAQ), have long been used for measuring children's sex-dimorphic behaviors in Western societies, but few studies have been conducted using these measures for Chinese populations. The current study aimed to empirically examine and modify the two instruments for their applications to Chinese society. Parents of 486 Chinese boys and 417 Chinese girls (6-12 years old) completed a questionnaire comprising items from the CGPQ and CBAQ, and an additional 14 items specifically related to Chinese gender-specific games. Items revealing gender differences in a Chinese sample were identified and used to construct a Child Play Behavior and Activity Questionnaire (CPBAQ). Four new scales were generated through factor analysis: a Gender Scale, a Girl Typicality Scale, a Boy Typicality Scale, and a Cross-Gender Scale (CGS). These scales had satisfactory internal reliabilities and large effect sizes for gender. The CPBAQ is believed to be a promising instrument for measuring children's gender-related behavior in China.
Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

NARCIS (Netherlands)

Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

2017-01-01

Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study
Examining the Psychometric Quality of Multiple-Choice Assessment Items using Mokken Scale Analysis.

Science.gov (United States)

Wind, Stefanie A

The concept of invariant measurement is typically associated with Rasch measurement theory (Engelhard, 2013). Concerned with the appropriateness of the parametric transformation upon which the Rasch model is based, Mokken (1971) proposed a nonparametric procedure for evaluating the quality of social science measurement that is theoretically and empirically related to the Rasch model. Mokken's nonparametric procedure can be used to evaluate the quality of dichotomous and polytomous items in terms of the requirements for invariant measurement. Despite these potential benefits, the use of Mokken scaling to examine the properties of multiple-choice (MC) items in education has not yet been fully explored. A nonparametric approach to evaluating MC items is promising in that this approach facilitates the evaluation of assessments in terms of invariant measurement without imposing potentially inappropriate transformations. Using Rasch-based indices of measurement quality as a frame of reference, data from an eighth-grade physical science assessment are used to illustrate and explore Mokken-based techniques for evaluating the quality of MC items. Implications for research and practice are discussed.
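Mokken scaling is usually summarized with Loevinger's scalability coefficient H, which the abstract does not name explicitly. The sketch below computes the scale-level H for dichotomous (0/1) items as the ratio of summed observed inter-item covariances to their maximum possible values given the item marginals. The function name and the toy data are assumptions for illustration, not the data or software used in the study.

```python
from itertools import combinations

def loevinger_h(responses):
    """Scale-level Loevinger's H for dichotomous data.

    responses: list of persons, each a list of 0/1 item scores.
    Assumes every item has a proportion correct strictly between 0 and 1.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of persons scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(k), 2):
        p_ij = sum(row[i] * row[j] for row in responses) / n
        cov_sum += p_ij - p[i] * p[j]
        # The joint proportion can be at most the smaller marginal.
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum

# A perfect Guttman pattern (no person passes a harder item while
# failing an easier one) gives the maximum value of H.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(loevinger_h(guttman))
```

Values of H closer to 1 indicate fewer violations of the Guttman ordering; statistically independent items would give a value near 0.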
Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

Science.gov (United States)

Lynch, Mervin D.; Chaves, John

Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs, idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…
Initial validation of the Nine Item Avoidant/Restrictive Food Intake disorder screen (NIAS): A measure of three restrictive eating patterns.

Science.gov (United States)

Zickgraf, Hana F; Ellis, Jordan M

2018-04-01

Avoidant/Restrictive Food Intake Disorder (ARFID) is an eating or feeding disorder characterized by inadequate nutritional or caloric intake leading to weight loss, nutritional deficiency, supplement dependence, and/or significant psychosocial impairment. DSM-5 lists three different eating patterns that can lead to symptoms of ARFID: avoidance of foods due to their sensory properties (e.g., picky eating), poor appetite or limited interest in eating, or fear of negative consequences from eating. Research on the prevalence and psychopathology of ARFID is limited by the lack of validated instruments to measure these eating behaviors. The present study describes the development and validation of the nine-item ARFID screen (NIAS), a brief multidimensional instrument to measure ARFID-associated eating behaviors. Participants were 455 adults recruited on Amazon's Mechanical Turk, 505 adults recruited from a nationally-representative subject pool, and 311 undergraduates participating in research for course credit. Exploratory and confirmatory factor analyses provided evidence for three factors. The NIAS subscales demonstrated high internal consistency, test-retest reliability, invariant item loadings between two samples, and convergent/discriminant validity with other measures of picky eating, appetite, fear of negative consequences, and psychopathology. The scales were also correlated with measures of ARFID-like symptoms (e.g., low BMI, low fruit/vegetable variety and intake, and eating-related psychosocial interference/distress), although the picky eating, appetite, and fear scales had distinct independent relationships with these constructs. The NIAS is a brief, reliable instrument that may be used to further investigate ARFID-related eating behaviors. Copyright © 2017 Elsevier Ltd. All rights reserved.
The role of attention in item-item binding in visual working memory.

Science.gov (United States)

Peterson, Dwight J; Naveh-Benjamin, Moshe

2017-09-01

An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Comparison on Computed Tomography using industrial items

DEFF Research Database (Denmark)

Angel, Jais Andreas Breusch; De Chiffre, Leonardo

2014-01-01

In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...
Practice of radiation dose control for tech-modification items in Qinshan Nuclear Power Plant

International Nuclear Information System (INIS)

Zhang Yong; Chen Zhongyu; Xu Hongming; Fan Liguang; Jiang Jianqi; Bu Weidong

2006-01-01

To improve the safety and reliability of nuclear power plant operation, many technical modifications to systems and equipment have been completed at Qinshan NPP since it began operation. This paper introduces radiation dose control for the main radiation-related tech-modification items, including radiation protection optimization measures and experience in item planning, program writing, process control, etc. (authors)
Measuring Integration of Information and Communication Technology in Education: An Item Response Modeling Approach

Science.gov (United States)

Peeraer, Jef; Van Petegem, Peter

2012-01-01

This research describes the development and validation of an instrument to measure integration of Information and Communication Technology (ICT) in education. After literature research on definitions of integration of ICT in education, a comparison is made between the classical test theory and the item response modeling approach for the…

Measuring anxiety after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Anxiety item bank and linkage with GAD-7.

Science.gov (United States)

Kisala, Pamela A; Tulsky, David S; Kalpakjian, Claire Z; Heinemann, Allen W; Pohlig, Ryan T; Carle, Adam; Choi, Seung W

2015-05-01

To develop a calibrated item bank and computer adaptive test to assess anxiety symptoms in individuals with spinal cord injury (SCI), transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a statistical linkage with the Generalized Anxiety Disorder (GAD)-7, a widely used anxiety measure. Grounded-theory based qualitative item development methods; large-scale item calibration field testing; confirmatory factor analysis; graded response model item response theory analyses; statistical linking techniques to transform scores to a PROMIS metric; and linkage with the GAD-7. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Outcome measure: Spinal Cord Injury-Quality of Life (SCI-QOL) Anxiety Item Bank. Seven hundred sixteen individuals with traumatic SCI completed 38 items assessing anxiety, 17 of which were PROMIS items. After 13 items (including 2 PROMIS items) were removed, factor analyses confirmed unidimensionality. Item response theory analyses were used to estimate slopes and thresholds for the final 25 items (15 from PROMIS). The observed Pearson correlation between the SCI-QOL Anxiety and GAD-7 scores was 0.67. The SCI-QOL Anxiety item bank demonstrates excellent psychometric properties and is available as a computer adaptive test or short form for research and clinical applications. SCI-QOL Anxiety scores have been transformed to the PROMIS metric and we provide a method to link SCI-QOL Anxiety scores with those of the GAD-7.
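The "slopes and thresholds" estimated for each item are the parameters of the graded response model mentioned in the abstract. As a rough sketch, assuming Samejima's standard formulation, each threshold defines a cumulative logistic curve, and the probability of each response category is the difference between adjacent cumulative curves. The function name and parameter values below are hypothetical and are not SCI-QOL calibration results.

```python
import math

def grm_category_probs(theta, slope, thresholds):
    """Response-category probabilities under the graded response model.

    theta: person's level on the latent trait
    slope: item discrimination
    thresholds: increasing list of category boundaries
    Returns len(thresholds) + 1 probabilities, lowest category first.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1 (lowest category or higher) and 0 (above the highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical five-category item evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, slope=2.0, thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

The probabilities sum to 1 by construction, and a larger slope concentrates them on fewer categories at any given trait level.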
Item-focussed Trees for the Identification of Items in Differential Item Functioning.

Science.gov (United States)

Tutz, Gerhard; Berger, Moritz

2016-09-01

A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.
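The core idea can be illustrated with the Rasch model plus a single tree split: an item shows DIF when its difficulty differs between the two groups created by splitting on a covariate. The sketch below only evaluates such a model for given values; it does not implement the recursive partitioning or the permutation tests the authors propose. The function names, the covariate (age), the cut point, and the shift are all hypothetical.

```python
import math

def rasch_prob(theta, difficulty):
    """Rasch model: probability that a person with ability theta solves the item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_difficulty(base, shift, covariate, cut):
    """Item difficulty after one hypothetical tree split on a single covariate.

    Persons with covariate > cut get base + shift; everyone else gets base.
    shift == 0 means the item shows no DIF for this split.
    """
    return base + shift if covariate > cut else base

# Two persons with the same ability but different ages (illustrative values).
theta = 0.5
for age in (25, 60):
    beta = item_difficulty(base=0.0, shift=0.8, covariate=age, cut=40)
    print(age, round(rasch_prob(theta, beta), 3))
```

In this toy setting the older person faces a harder item despite equal ability, which is the pattern a DIF analysis is designed to detect.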
A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

Science.gov (United States)

Khawaja, Nigar G.; Yu, Lai Ngo Heidi

2010-01-01

The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…
Measurement invariance across educational levels and gender in 12-item Zarit Burden Interview (ZBI) on caregivers of people with dementia.

Science.gov (United States)

Lin, Chung-Ying; Ku, Li-Jung Elizabeth; Pakpour, Amir H

2017-11-01

The Zarit Burden Interview (ZBI) is a commonly used self-report to assess caregiver burden. A 12-item short form of the ZBI has been developed; however, its measurement invariance has not been examined across some different demographics. It is unclear whether different genders and educational levels of a population interpret the ZBI items similarly. Therefore, this study aimed to examine the measurement invariance of the 12-item ZBI across gender and educational levels in a Taiwanese sample. Caregivers who had a family member with dementia (n = 270) completed the ZBI through telephone interviews. Three confirmatory factor analysis (CFA) models were conducted: Model 1 was the configural model, Model 2 constrained all factor loadings, Model 3 constrained all factor loadings and item intercepts. Multiple group CFAs and the differential item functioning (DIF) contrast under Rasch analyses were used to detect measurement invariance across males (n = 100) and females (n = 170) and across educational levels of junior high schools and below (n = 86) and senior high schools and above (n = 183). The fit index differences between models supported the measurement invariance across gender and across educational levels (∆ comparative fit index (CFI) = -0.010 and 0.003; ∆ root mean square error of approximation (RMSEA) = -0.006 to 0.004). No substantial DIF contrast was found across gender and educational levels (value = -0.36 to 0.29). The ZBI is appropriate for combined use and for comparisons in caregivers across gender and different educational levels in Taiwan.
Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

Science.gov (United States)

Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

2015-05-01

To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory-(IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.
A Music-Related Quality of Life Measure to Guide Music Rehabilitation for Adult Cochlear Implant Users.

Science.gov (United States)

Dritsakis, Giorgos; van Besouw, Rachel M; Kitterick, Pádraig; Verschuur, Carl A

2017-09-18

A music-related quality of life (MuRQoL) questionnaire was developed for the evaluation of music rehabilitation for adult cochlear implant (CI) users. The present studies were aimed at refinement and validation. Twenty-four experts reviewed the MuRQoL items for face validity. A refined version was completed by 147 adult CI users, and psychometric techniques were used for item selection, assessment of reliability, and definition of the factor structure. The same participants completed the Short Form Health Survey for construct validation. MuRQoL responses from 68 CI users were compared with those of a matched group of adults with normal hearing. Eighteen items measuring music perception and engagement and 18 items measuring their importance were selected; they grouped together into 2 domains. The final questionnaire has high internal consistency and repeatability. Significant differences between CI users and adults with normal hearing and a correlation between music engagement and quality of life support construct validity. Scores of music perception and engagement and importance for the 18 items can be combined to assess the impact of music on the quality of life. The MuRQoL questionnaire is a reliable and valid measure of self-reported music perception, engagement, and their importance for adult CI users with potential to guide music aural rehabilitation.
A measure of smoking abstinence-related motivational engagement: development and initial validation.

Science.gov (United States)

Simmons, Vani N; Heckman, Bryan W; Ditre, Joseph W; Brandon, Thomas H

2010-04-01

Although a great deal of research has focused on measuring motivation and readiness to quit smoking, little research has assessed gross motivational changes after a smoker has made an attempt to quit smoking. Unlike previous single-item global measures of motivation to remain abstinent, we developed the abstinence-related motivational engagement (ARME) scale to evaluate the degree to which abstinence motivation is reflected by an ex-smoker's daily experience in areas that include cognitive effort, priority, vigilance, and excitement. The aim of this study was to collect reliability and initial construct validity data on this new measure. Participants were 199 ex-smokers recruited from the community and smoking cessation Web sites. Participants completed online measures including a global motivation measure, the ARME scale, demographic questionnaire, and a measure of cessation self-efficacy. The 16-item ARME questionnaire demonstrated high internal consistency reliability (alpha = .89). Analyses provided support for convergent, discriminant, and construct validity of the scale. ARME demonstrated the predicted correlation with a traditional measure of global cessation motivation, yet, also as predicted, only the ARME was negatively associated with length of abstinence. Moreover, as hypothesized, ex-smokers engaged in the quitting process via ongoing smoking Web site participation showed higher ARME scores than a comparison community sample. A five-item short form demonstrated similar psychometric properties. This study provided initial support for the ARME construct and offers two versions of a reliable instrument for assessing this construct. Future research will examine the ARME as a predictor of cessation outcome and a potential target for relapse prevention.
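The reported internal consistency (alpha = .89) is presumably Cronbach's alpha, which compares the sum of the individual item variances with the variance of the total score. A minimal sketch of that calculation follows; the function name and the toy data are assumptions for illustration and have nothing to do with the ARME sample.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of respondents, each a list of item scores.
    Requires at least two items and two respondents.
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Toy data: four respondents, three items that mostly agree.
data = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
print(round(cronbach_alpha(data), 3))
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when they vary independently.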
Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: a comparison of worked examples.

Science.gov (United States)

Petrillo, Jennifer; Cano, Stefan J; McLeod, Lori D; Coon, Cheryl D

2015-01-01

To provide comparisons and a worked example of item- and scale-level evaluations based on three psychometric methods used in patient-reported outcome development, namely classical test theory (CTT), item response theory (IRT), and Rasch measurement theory (RMT), in an analysis of the National Eye Institute Visual Functioning Questionnaire (VFQ-25). Baseline VFQ-25 data from 240 participants with diabetic macular edema from a randomized, double-masked, multicenter clinical trial were used to evaluate the VFQ at the total score level. CTT, RMT, and IRT evaluations were conducted, and results were assessed in a head-to-head comparison. Results were similar across the three methods, with IRT and RMT providing more detailed diagnostic information on how to improve the scale. CTT led to the identification of two problematic items that threaten the validity of the overall scale score, sets of redundant items, and skewed response categories. IRT and RMT additionally identified poor fit for one item, many locally dependent items, poor targeting, and disordering of over half the response categories. Selection of a psychometric approach depends on many factors. Researchers should justify their evaluation method and consider the intended audience. If the instrument is being developed for descriptive purposes and on a restricted budget, a cursory examination of the CTT-based psychometric properties may be all that is possible. In a high-stakes situation, such as the development of a patient-reported outcome instrument for consideration in pharmaceutical labeling, however, a thorough psychometric evaluation including IRT or RMT should be considered, with final item-level decisions made on the basis of both quantitative and qualitative results. Copyright © 2015. Published by Elsevier Inc.
Measuring health-related problem solving among African Americans with multiple chronic conditions: application of Rasch analysis.

Science.gov (United States)

Fitzpatrick, Stephanie L; Hill-Briggs, Felicia

2015-10-01

Identification of patients with poor chronic disease self-management skills can facilitate treatment planning, determine effectiveness of interventions, and reduce disease complications. This paper describes the use of a Rasch model, the Rating Scale Model, to examine psychometric properties of the 50-item Health Problem-Solving Scale (HPSS) among 320 African American patients with high risk for cardiovascular disease. Items on the positive/effective HPSS subscales targeted patients at low, moderate, and high levels of positive/effective problem solving, whereas items on the negative/ineffective problem solving subscales mostly targeted those at moderate or high levels of ineffective problem solving. Validity was examined by correlating factor scores on the measure with clinical and behavioral measures. Items on the HPSS show promise in the ability to assess health-related problem solving among high risk patients. However, further revisions of the scale are needed to increase its usability and validity with large, diverse patient populations in the future.
What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling

Science.gov (United States)

Koller, Ingrid; Levenson, Michael R.; Glück, Judith

2017-01-01

The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis. PMID:28270777
Using the Rasch Measurement Model in Psychometric Analysis of the Family Effectiveness Measure

Science.gov (United States)

McCreary, Linda L.; Conrad, Karen M.; Conrad, Kendon J.; Scott, Christy K; Funk, Rodney R.; Dennis, Michael L.

2013-01-01

Background: Valid assessment of family functioning can play a vital role in optimizing client outcomes. Because family functioning is influenced by family structure, socioeconomic context, and culture, existing measures of family functioning--primarily developed with nuclear, middle class European American families--may not be valid assessments of families in diverse populations. The Family Effectiveness Measure was developed to address this limitation. Objectives: To test the Family Effectiveness Measure with data from a primarily low-income African American convenience sample, using the Rasch measurement model. Method: A sample of 607 adult women completed the measure. Rasch analysis was used to assess unidimensionality, response category functioning, item fit, person reliability, differential item functioning by race and parental status, and item hierarchy. Criterion-related validity was tested using correlations with five other variables related to family functioning. Results: The Family Effectiveness Measure measures two separate constructs: The effective family functioning construct was a psychometrically sound measure of the target construct that was more efficient due to the deletion of 22 items. The ineffective family functioning construct consisted of 16 of those deleted items but was not as strong psychometrically. Items in both constructs evidenced no differential item functioning by race. Criterion-related validity was supported for both. Discussion: In contrast to the prevailing conceptualization that family functioning is a single construct, assessed by positively and negatively worded items, use of the Rasch analysis suggested the existence of two constructs. While the effective family functioning is a strong and efficient measure of family functioning, the ineffective family functioning will require additional item development and psychometric testing. PMID:23636342
Capturing and missing the patient's story through outcome measures: A thematic comparison of patient-generated items in PSYCHLOPS with CORE-OM and PHQ-9.

Science.gov (United States)

Sales, Célia Md; Neves, Inês Td; Alves, Paula G; Ashworth, Mark

2017-11-22

There is increasing interest in individualized patient-reported outcome measures (I-PROMS), where patients themselves indicate the specific problems they want to address in therapy and these problems are used as items within the outcome measurement tool. This paper examined the extent to which 279 items reported in an I-PROM (PSYCHLOPS) added qualitative information which was not captured by two well-established outcome measures (CORE-OM and PHQ-9). Comparison of items was only conducted for patients scoring above the "caseness" threshold on the standardized measures. 107 patients were participating in therapy within addiction and general psychiatric clinical settings. Almost every patient (95%) reported at least one item whose content was not covered by PHQ-9, and 71% reported at least one item not covered by CORE-OM. Results demonstrate the relevance of individualized outcome assessment for capturing data describing the issues of greatest concern to patients, as nomothetic measures do not always seem to capture the whole story. © 2017 The Authors Health Expectations Published by John Wiley & Sons Ltd.
Comparison of Alternate and Original Items on the Montreal Cognitive Assessment.

Science.gov (United States)

Lebedeva, Elena; Huang, Mei; Koski, Lisa

2016-03-01

The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. None of the five items from the alternate versions matched the difficulty level of their corresponding original items. This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time.
The validity of the Satisfaction with Life Scale in adolescents and a comparison with single-item life satisfaction measures: a preliminary study.

Science.gov (United States)

Jovanović, Veljko

2016-12-01

The validity of the life satisfaction measures commonly used among adults has been rarely examined in adolescent samples. The present research had two main goals: (1) to evaluate the structural validity of the Satisfaction with Life Scale (SWLS) among adolescents and to test measurement invariance across gender; (2) to compare the criterion and convergent validity of the SWLS and single-item life satisfaction measures among adolescents. Three samples of Serbian adolescents were recruited for the present research. Study 1 (N = 481, M age = 17.01 years) examined the structure of the SWLS via confirmatory factor analysis (CFA) and evaluated measurement invariance of the SWLS across gender by a multi-group CFA. Study 2 (N = 283, M age = 17.34 years) and Study 3 (N = 220, M age = 16.73 years) compared the convergent validity of the SWLS and single-item life satisfaction measures. The results of Study 1 supported the original one-factor model of the SWLS among adolescents and provided evidence for strong measurement invariance of the SWLS across gender. The findings of Study 2 and Study 3 showed that the SWLS and single-item measures were equally valid and strongly associated (r = .734 in Study 2 and r = .668 in Study 3). No substantial differences in correlations with school success and well-being indicators were found between the SWLS and single-item measures. Our findings support the use of the SWLS among adolescents and indicate that single-item life satisfaction measures perform as well as the SWLS in adolescent samples.
Loglinear multidimensional IRT models for polytomously scored items

NARCIS (Netherlands)

Kelderman, Henk

1988-01-01

A loglinear item response theory (IRT) model is proposed that relates polytomously scored item responses to a multidimensional latent space. Each item may have a different response function where each item response may be explained by one or more latent traits. Item response functions may follow a
Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

Science.gov (United States)

Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

2015-01-01

Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…
Cleaning and disinfection of patient care items, in relation to small animals.

Science.gov (United States)

Weese, J Scott

2015-03-01

Patient care involves several medical and surgical items, including those that come into contact with sterile or other high-risk body sites and items that have been used on other patients. These situations create a risk for infection if items are contaminated, and the implications can range from single infections to large outbreaks. To minimize the risk, proper equipment cleaning, disinfection/sterilization, storage, and monitoring practices are required. Risks posed by different items; the required level of cleaning, disinfection, or sterilization; the methods that are available and appropriate; and how to ensure efficacy, must be considered when designing and implementing an infection control program. Copyright © 2015 Elsevier Inc. All rights reserved.
Single-item measures for depression and anxiety: Validation of the Screening Tool for Psychological Distress in an inpatient cardiology setting.

Science.gov (United States)

Young, Quincy-Robyn; Nguyen, Michelle; Roth, Susan; Broadberry, Ann; Mackay, Martha H

2015-12-01

Depression and anxiety are common among patients with cardiovascular disease (CVD) and confer significant cardiac risk, contributing to CVD morbidity and mortality. Unfortunately, due to the lack of screening tools that address the specific needs of hospitalized patients, few cardiac inpatient programs offer routine screening for these forms of psychological distress, despite recommendations to do so. The purpose of this study was to validate single-item measures for depression and anxiety among cardiac inpatients. Consecutive inpatients were recruited from the cardiology and cardiac surgery step-down units at a university-affiliated, quaternary-care hospital. Subjects completed a questionnaire that included: (a) demographics, (b) single-item measures for depression and anxiety (from the Screening Tool for Psychological Distress (STOP-D)), and (c) Hospital Anxiety and Depression Scale (HADS). One hundred and five participants were recruited with a wide variety of cardiac diagnoses, having a mean age of 66 years, and 28% were women. Both STOP-D items were highly correlated with their corresponding validated measures and demonstrated robust receiver-operator characteristic curves. Severity scores on both items correlated well with established severity cut-off scores on the corresponding subscales of the HADS. The STOP-D is a self-administered, self-report measure using two independent items that provide severity scores for depression and anxiety. The tool performs very well compared with other previously validated measures. Requiring no additional scoring and being free, STOP-D offers a simple and valid method for identifying hospitalized cardiac patients who are experiencing psychological distress. This crucial first step triggers initiation of appropriate monitoring and intervention, thus reducing the likelihood of the adverse cardiac outcomes associated with psychological distress. © The European Society of Cardiology 2014.
Item response theory analysis of the mechanics baseline test

Science.gov (United States)

Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

2012-02-01

Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
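The point that a skill estimate depends on which questions were answered, and not only on how many, can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model and a simple grid-search maximum-likelihood estimate; the item parameters below are hypothetical and are not taken from the Mechanics Baseline Test, and the abstract does not state which item response model or estimation method the authors used.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    for skill theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_skill(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood estimate of theta.

    responses: list of 0/1 scores; items: list of (a, b) pairs.
    """
    best_theta, best_ll = lo, -math.inf
    for k in range(steps):
        theta = lo + (hi - lo) * k / (steps - 1)
        ll = 0.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            ll += math.log(p) if u else math.log(1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# Hypothetical item parameters: (discrimination a, difficulty b).
ITEMS = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

# Two students with the same raw score (2 of 3) but different patterns.
skill_a = estimate_skill([1, 1, 0], ITEMS)  # missed the hard, discriminating item
skill_b = estimate_skill([0, 1, 1], ITEMS)  # missed the easy, weakly discriminating item
```

Under these assumed parameters the second student receives the higher skill estimate even though both raw scores are equal, because the items answered correctly carry more weight in the likelihood.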
Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form.

Science.gov (United States)

Kisala, Pamela A; Victorson, David; Pace, Natalie; Heinemann, Allen W; Choi, Seung W; Tulsky, David S

2015-05-01

To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. A total of 716 individuals with SCI completed the trauma items. The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available.

Cross-cultural measurement equivalence of the EQ-5D-5L items for English-speaking Asians in Singapore.

Science.gov (United States)

Luo, N; Wang, Y; How, C H; Wong, K Y; Shen, L; Tay, E G; Thumboo, J; Herdman, M

2015-06-01

To investigate how the response labels of the 5-level EQ-5D (EQ-5D-5L) items are interpreted and used by English-speaking Chinese and non-Chinese Singaporeans, as a means to assessing whether those items are cross-culturally equivalent health-status measures in this Asian population. In face-to-face interviews, Chinese, Malay and Indian visitors to a primary care institution in Singapore were asked to rate the relative severity conveyed by EQ-5D-5L response labels, each containing the keyword of 'no(t),' 'slight(ly),' 'moderate(ly),' 'severe(ly),' or 'unable'/'extreme(ly),' using a 0-100 numerical rating scale. Participants were also asked to describe 25 hypothetical health states using the EQ-5D-5L response labels. Differences between Chinese and Malay/Indian participants in label interpretation and selection were examined using multivariate regression analysis to adjust for participant characteristics. The differences in adjusted mean severity scores for individual EQ-5D-5L labels between Chinese (n = 148) and non-Chinese (Malay: n = 53; Indian: n = 56) participants ranged from 0.0 to 9.0. The relative severity of the labels to the participants supported the ordinality of the EQ-5D-5L response labels and was similar across ethnic groups. Chinese and non-Chinese participants selected similar response labels to describe each hypothetical health state, with the adjusted odds ratios of selecting any type of the five response labels for non-Chinese versus Chinese participants ranging from 0.92 to 1.15 (p > 0.05 for all). The EQ-5D-5L items are likely to generate equivalent health outcomes between English-speaking Chinese and non-Chinese Singaporeans.
Exploring differential item functioning (DIF) with the Rasch model: a comparison of gender differences on eighth grade science items in the United States and Spain.

Science.gov (United States)

Babiar, Tasha Calvert

2011-01-01

Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth item-level analysis across two countries: Spain and the United States. This study investigated eighth-grade gender differences on science items across the two countries. A secondary purpose of the study was to explore the nature of gender differences using the many-faceted Rasch Model as a way to estimate gender DIF. A secondary analysis of data from the Third International Mathematics and Science Study (TIMSS) was used to address three questions: 1) Does gender DIF in science achievement exist? 2) Is there a relationship between gender DIF and characteristics of the science items? 3) Do the relationships between item characteristics and gender DIF in science items replicate across countries? Participants included 7,087 eighth-grade students from the United States and 3,855 students from Spain who participated in TIMSS. The Facets program (Linacre and Wright, 1992) was used to estimate gender DIF. The results of the analysis indicate that the content of the item seemed to be related to gender DIF. The analysis also suggests that there is a relationship between gender DIF and item format. No pattern of gender DIF related to cognitive demand was found. The general pattern of gender DIF was similar across the two countries used in the analysis. The strength of item-level analysis as opposed to group mean difference analysis is that gender differences can be detected at the item level, even when no mean differences can be detected at the group level.
26 CFR 48.4216(a)-3 - Other items relating to tax on sale price.

Science.gov (United States)

2010-04-01

... 26 Internal Revenue 16 2010-04-01 2010-04-01 true Other items relating to tax on sale price. 48.4216(a)-3 Section 48.4216(a)-3 Internal Revenue INTERNAL REVENUE SERVICE, DEPARTMENT OF THE TREASURY... reason of the failure of the article under a warranty as to its quality or service, and a new article is...
The Deaf Acculturation Scale (DAS): Development and Validation of a 58-Item Measure

Science.gov (United States)

Maxwell-McCaw, Deborah; Zea, Maria Cecilia

2011-01-01

This study involved the development and validation of the Deaf Acculturation Scale (DAS), a new measure of cultural identity for Deaf and hard-of-hearing (hh) populations. Data for this study were collected online and involved a nation-wide sample of 3,070 deaf/hh individuals. Results indicated strong internal reliabilities for all the subscales, and construct validity was established by demonstrating that the DAS could discriminate groups based on parental hearing status, school background, and use of self-labels. Construct validity was further demonstrated through factorial analyses, and findings resulted in a final 58-item measure. Directions for future research are discussed. PMID:21263041
Adaptive memory: the survival scenario enhances item-specific processing relative to a moving scenario.

Science.gov (United States)

Burns, Daniel J; Hart, Joshua; Griffith, Samantha E; Burns, Amy D

2013-01-01

Nairne, Thompson, and Pandeirada (2007) found that retention of words rated for their relevance to survival is superior to that of words encoded under numerous other deep processing conditions. They suggested that our memory systems might have evolved to confer an advantage for survival-relevant information. Burns, Burns, and Hwang (2011) suggested a two-process explanation of the proximate mechanisms responsible for the survival advantage. Whereas most control tasks encourage only one type of processing, the survival task encourages both item-specific and relational processing. They found that when control tasks encouraged both types of processing, the survival processing advantage was eliminated. However, none of their control conditions included non-survival scenarios (e.g., moving, vacation, etc.), so it is not clear how this two-process explanation would explain the survival advantage when scenarios are used as control conditions. The present experiments replicated the finding that the survival scenario improves recall relative to a moving scenario in both a between-lists and within-list design and also provided evidence that this difference was accompanied by an item-specific processing difference, not a difference in relational processing. The implications of these results for several existing accounts of the survival processing effect are discussed.
Using Item Response Theory to Develop a 60-Item Representation of the NEO PI-R Using the International Personality Item Pool: Development of the IPIP-NEO-60.

Science.gov (United States)

Maples-Keller, Jessica L; Williamson, Rachel L; Sleep, Chelsea E; Carter, Nathan T; Campbell, W Keith; Miller, Joshua D

2017-10-31

Given advantages of freely available and modifiable measures, an increase in the use of measures developed from the International Personality Item Pool (IPIP), including the 300-item representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a) has occurred. The focus of this study was to use item response theory to develop a 60-item, IPIP-based measure of the Five-Factor Model (FFM) that provides equal representation of the FFM facets and to test the reliability and convergent and criterion validity of this measure compared to the NEO Five Factor Inventory (NEO-FFI). In an undergraduate sample (n = 359), scores from the NEO-FFI and IPIP-NEO-60 demonstrated good reliability and convergent validity with the NEO PI-R and IPIP-NEO-300. Additionally, across criterion variables in the undergraduate sample as well as a community-based sample (n = 757), the NEO-FFI and IPIP-NEO-60 demonstrated similar nomological networks across a wide range of external variables (r ICC = .96). Finally, as expected, in an MTurk sample the IPIP-NEO-60 demonstrated advantages over the Big Five Inventory-2 (Soto & John, 2017; n = 342) with regard to the Agreeableness domain content. The results suggest strong reliability and validity of the IPIP-NEO-60 scores.
Projective Item Response Model for Test-Independent Measurement

Science.gov (United States)

Ip, Edward Hak-Sing; Chen, Shyh-Huei

2012-01-01

The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…
Measuring Corporate Social Responsibility in Gambling Industry: Multi-Items Stakeholder Based Scales

Directory of Open Access Journals (Sweden)

Jian Ming Luo

2017-11-01

Macau gambling companies included Corporate Social Responsibility (CSR) information in their annual reports and websites as a marketing tool. Responsible Gambling (RG) had been a recurring issue in Macau’s chief executive report since 2007 and in many of the major gambling operators’ annual reports. The purpose of this study was to develop a measurement scale on CSR activities in Macau. Items on the measurement scale were based on qualitative research with data collected from employees in Macau’s gambling industry and academic literature. First- and second-order confirmatory factor analyses (CFA) were used to verify the reliability and validity of the measurement scale. The results of this study were satisfactory and were supported by empirical evidence. This study provided recommendations to gambling stakeholders, including practitioners, government officers, customers and shareholders, and implications to promote CSR practice in the Macau gambling industry.
On Studying Common Factor Dominance and Approximate Unidimensionality in Multicomponent Measuring Instruments with Discrete Items

Science.gov (United States)

Raykov, Tenko; Marcoulides, George A.

2018-01-01

This article outlines a procedure for examining the degree to which a common factor may be dominating additional factors in a multicomponent measuring instrument consisting of binary items. The procedure rests on an application of the latent variable modeling methodology and accounts for the discrete nature of the manifest indicators. The method…
Item Information in the Rasch Model

NARCIS (Netherlands)

Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

1988-01-01

Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling
Differential Weighting of Items to Improve University Admission Test Validity

Directory of Open Access Journals (Sweden)

Eduardo Backhoff Escudero

2001-05-01

This paper evaluates different ways to increase the criterion-related validity of a university admission test by differentially weighting test items. We compared four methods of weighting multiple-choice items of the Basic Skills and Knowledge Examination (EXHCOBA): (1) punishing incorrect responses by a constant factor, (2) weighting incorrect responses, considering the levels of error, (3) weighting correct responses, considering the item’s difficulty, based on the Classic Measurement Theory, and (4) weighting correct responses, considering the item’s difficulty, based on the Item Response Theory. Results show that none of these methods increased the instrument’s predictive validity, although they did improve its concurrent validity. It was concluded that it is appropriate to score the test by simply adding up correct responses.
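The contrast between simple number-right scoring and two of the weighting schemes named above can be sketched as follows. This is a minimal illustration only: the penalty of 0.25, the use of (1 - proportion correct) as a difficulty weight, the answer key, and the response data are all hypothetical assumptions, and none of them reproduce the weights actually applied to the EXHCOBA.

```python
def number_right(responses, key):
    """Simple score: count of responses that match the answer key."""
    return sum(1 for r, k in zip(responses, key) if r == k)

def penalized_score(responses, key, penalty=0.25):
    """Number right minus a constant penalty per incorrect response.

    Omitted items are coded as None and are not penalized.
    The default penalty is an assumed value for illustration.
    """
    right = number_right(responses, key)
    wrong = sum(1 for r, k in zip(responses, key)
                if r is not None and r != k)
    return right - penalty * wrong

def difficulty_weighted_score(responses, key, p_values):
    """Weight each correct response by (1 - p), where p is the classical
    proportion of examinees answering the item correctly, so that harder
    items count for more. The weighting rule is an assumption."""
    return sum(1.0 - p for r, k, p in zip(responses, key, p_values)
               if r == k)

# Hypothetical four-item test: answer key, one examinee, item p-values.
KEY = ["a", "b", "c", "d"]
RESPONSES = ["a", "c", None, "d"]   # two right, one wrong, one omitted
P_VALUES = [0.9, 0.5, 0.5, 0.2]

raw = number_right(RESPONSES, KEY)
penalized = penalized_score(RESPONSES, KEY)
weighted = difficulty_weighted_score(RESPONSES, KEY, P_VALUES)
```

All three functions rank examinees from the same response vector, which is why the reported comparison comes down to whether the extra weighting changes validity coefficients in practice.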
Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory.

Science.gov (United States)

Fajrianthi; Zein, Rizqy Amelia

2017-01-01

This study aimed to develop an emotional intelligence (EI) test that is suitable for the Indonesian workplace context. Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: 1) emotional appraisal, 2) emotional recognition, and 3) emotional regulation. TKEA consisted of 120 items with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 for subset 2 (ability level = -2), and 2.398 for subset 3 (level of ability = -2). It is concluded that TKEA performs very well to measure individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA's item analysis and dimensionality test of each TKEA subset.
Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Science.gov (United States)

Wang, Wei

2013-01-01

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Reliability of the Core Items in the General Social Survey: Estimates from the Three-Wave Panels, 2006–2014

Directory of Open Access Journals (Sweden)

Michael Hout

2016-11-01

We used standard and multilevel models to assess the reliability of core items in the General Social Survey panel studies spanning 2006 to 2014. Most of the 293 core items scored well on the measure of reliability: 62 items (21 percent) had reliability measures greater than 0.85; another 71 (24 percent) had reliability measures between 0.70 and 0.85. Objective items, especially facts about demography and religion, were generally more reliable than subjective items. The economic recession of 2007–2009, the slow recovery afterward, and the election of Barack Obama in 2008 altered the social context in ways that may look like unreliability of items. For example, unemployment status, hours worked, and weeks worked have lower reliability than most work-related items, reflecting the consequences of the recession on the facts of people's lives. Items regarding racial and gender discrimination and racial stereotypes scored as particularly unreliable, accounting for most of the 15 items with reliability coefficients less than 0.40. Our results allow scholars to more easily take measurement reliability into consideration in their own research, while also highlighting the limitations of these approaches.
Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

Directory of Open Access Journals (Sweden)

Yoon Soo Park

2016-02-01

This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.
Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

Science.gov (United States)

Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

2016-01-01

This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.
Work ability as prognostic risk marker of disability pension : Single-item work ability score versus multi-item work ability index

NARCIS (Netherlands)

Roelen, C.A.M.; Rhenen, van W.; Groothoff, J.W.; Klink, van der J.J.L.; Twisk, J.W.R.; Heymans, M.W.

2014-01-01

Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP.
Measuring social science concepts in pharmacy education research: From definition to item analysis of self-report instruments.

Science.gov (United States)

Cor, M Ken

Interpreting results from quantitative research can be difficult when measures of concepts are constructed poorly, something that can limit measurement validity. Social science steps for defining concepts, guidelines for limiting construct-irrelevant variance when writing self-report questions, and techniques for conducting basic item analysis are reviewed to inform the design of instruments to measure social science concepts in pharmacy education research. Based on a review of the literature, four main recommendations emerge: (1) employ a systematic process of conceptualization to derive nominal definitions; (2) write exact and detailed operational definitions for each concept; (3) when creating self-report questionnaires, write statements and select scales to avoid introducing construct-irrelevant variance (CIV); and (4) use basic item analysis results to inform instrument revision. Employing recommendations that emerge from this review will strengthen arguments to support measurement validity, which in turn will support the defensibility of study finding interpretations. An example from pharmacy education research is used to contextualize the concepts introduced. Copyright © 2017 Elsevier Inc. All rights reserved.
P2-19: The Effect of Item Repetition on Item-Context Association Depends on the Prior Exposure of Items

Directory of Open Access Journals (Sweden)

Hongmi Lee

2012-10-01

Previous studies have reported conflicting findings on whether item repetition has beneficial or detrimental effects on source memory. To reconcile such contradictions, we investigated whether the degree of pre-exposure of items can be a potential modulating factor. The experimental procedures spanned two consecutive days. On Day 1, participants were exposed to a set of unfamiliar faces. On Day 2, the same faces presented on the previous day were used again for half of the participants, whereas novel faces were used for the other half. Day 2 procedures consisted of three successive phases: item repetition, source association, and source memory test. In the item repetition phase, half of the face stimuli were repeatedly presented while participants were making male/female judgments. During the source association phase, both the repeated and the unrepeated faces appeared in one of the four locations on the screen. Finally, participants were tested on the location in which a given face was presented during the previous phase and reported the confidence of their memory. Source memory accuracy was measured as the percentage of correct non-guess trials. We found a significant interaction between prior exposure and repetition. Repetition impaired source memory when the items had been pre-exposed on Day 1, while it led to greater accuracy for novel ones. These results show that pre-experimental exposure can modulate the effects of repetition on associative binding between an item and its contextual information, suggesting that pre-existing representation and novelty signal interact to form new episodic memory.
Measuring Experiential Avoidance: Reliability and Validity of the Dutch 9-item Acceptance and Action Questionnaire (AAQ)

NARCIS (Netherlands)

Boelen, P.A.; Reijntjes, A.H.A.

2008-01-01

Three studies evaluated psychometric properties of the Dutch version of the 9-item Acceptance and Action Questionnaire (AAQ), a self-report measure designed to assess experiential avoidance as conceptualized in Acceptance and Commitment Therapy (ACT). Study 1, among bereaved adults, showed that a

The emotion dysregulation inventory: Psychometric properties and item response theory calibration in an autism spectrum disorder sample.

Science.gov (United States)

Mazefsky, Carla A; Yu, Lan; White, Susan W; Siegel, Matthew; Pilkonis, Paul A

2018-04-06

Individuals with autism spectrum disorder (ASD) often present with prominent emotion dysregulation that requires treatment but can be difficult to measure. The Emotion Dysregulation Inventory (EDI) was created using methods developed by the Patient-Reported Outcomes Measurement Information System (PROMIS®) to capture observable indicators of poor emotion regulation. Caregivers of 1,755 youth with ASD completed 66 candidate EDI items, and the final 30 items were selected based on classical test theory and item response theory (IRT) analyses. The analyses identified two factors: (a) Reactivity, characterized by intense, rapidly escalating, sustained, and poorly regulated negative emotional reactions, and (b) Dysphoria, characterized by anhedonia, sadness, and nervousness. The final items did not show differential item functioning (DIF) based on gender, age, intellectual ability, or verbal ability. Because the final items were calibrated using IRT, even a small number of items offers high precision, minimizing respondent burden. IRT co-calibration of the EDI with related measures demonstrated its superiority in assessing the severity of emotion dysregulation with as few as seven items. Validity of the EDI was supported by expert review, its association with related constructs (e.g., anxiety and depression symptoms, aggression), higher scores in psychiatric inpatients with ASD compared to a community ASD sample, and demonstration of test-retest stability and sensitivity to change. In sum, the EDI provides an efficient and sensitive method to measure emotion dysregulation for clinical assessment, monitoring, and research in youth with ASD of any level of cognitive or verbal ability. Autism Res 2018. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. This paper describes a new measure of poor emotional control called the Emotion Dysregulation Inventory (EDI). Caregivers of 1,755 youth with ASD completed candidate items, and advanced statistical
Using an FSDS-R Item to Screen for Sexually Related Distress: A MsFLASH Analysis

Science.gov (United States)

Carpenter, Janet S; Reed, Susan D; Guthrie, Katherine A; Larson, Joseph C; Newton, Katherine M; Lau, R Jane; Learman, Lee A; Shifren, Jan L

2015-01-01

Introduction: The Female Sexual Distress Scale-Revised (FSDS-R) was created and validated to assess distress associated with impaired sexual function, but it is lengthy for use in clinical practice and research when assessing sexual function is not a primary objective. Aim: The study aims to evaluate whether a single item from the FSDS-R could be identified to use to screen midlife women for bothersome diminution in sexual function based on three criteria: (i) highly correlated with total scores; (ii) correlated with commonly assessed domains of female sexual functioning; and (iii) able to differentiate between women reporting high and low sexual concerns during the prior month. Methods: Data from 93 midlife women were collected by the Menopause Strategies Finding Lasting Answers to Symptoms and Health (MsFLASH) research network. Main Outcome Measures: Women completed the FSDS-R, Female Sexual Function Index (FSFI), and Menopausal Quality of Life Scale (MENQOL). Those who reported a change in the past month on the MENQOL sexual were categorized into a high sexual concerns group, while all others were categorized into a low sexual concerns group. Results: Women were an average of 54.6 years old (SD 3.1) and mostly Caucasian (77.4%), college educated (60.2%), married/living as married (64.5%), and postmenopausal (79.6%). The FSDS-R item number 1 “Distressed about sex life” was: (i) highly correlated with FSDS-R total scores (r = 0.90); (ii) moderately correlated with FSFI total scores (r = −0.38) and FSFI desire (r = −0.37) and satisfaction domains (r = −0.40); and (iii) showed one of the largest mean differences between high and low sexual concerns groups (P Guthrie KA, Larson JC, Newton KM, Lau RJ, Learman LA, and Shifren JL. Using an FSDS-R item to screen for sexually related distress: A MsFLASH analysis. Sex Med 2015;3:7–13. PMID:25844170
Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

NARCIS (Netherlands)

Roelen, C.A.M.; van Rhenen, W.; Groothoff, J.W.; van der Klink, J.J.L.; Twisk, J.W.R.; Heymans, M.W.

2014-01-01

Objectives: Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods: This
Work ability as prognostic risk marker of disability pension : single-item work ability score versus multi-item work ability index

NARCIS (Netherlands)

Roelen, Corne A. M.; van Rhenen, Willem; Groothoff, Johan W.; van der Klink, Jac J. L.; Twisk, Jos W. R.; Heymans, Martijn W.

Objectives: Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods: This
The role of relational binding in item memory: evidence from face recognition in a case of developmental amnesia.

Science.gov (United States)

Olsen, Rosanna K; Lee, Yunjo; Kube, Jana; Rosenbaum, R Shayna; Grady, Cheryl L; Moscovitch, Morris; Ryan, Jennifer D

2015-04-01

Current theories state that the hippocampus is responsible for the formation of memory representations regarding relations, whereas extrahippocampal cortical regions support representations for single items. However, findings of impaired item memory in hippocampal amnesics suggest a more nuanced role for the hippocampus in item memory. The hippocampus may be necessary when the item elements need to be bound within and across episodes to form a lasting representation that can be used flexibly. The current investigation was designed to test this hypothesis in face recognition. H.C., an individual who developed with a compromised hippocampal system, and control participants incidentally studied individual faces that either varied in presentation viewpoint across study repetitions or remained in a fixed viewpoint across the study repetitions. Eye movements were recorded during encoding and participants then completed a surprise recognition memory test. H.C. demonstrated altered face viewing during encoding. Although the overall number of fixations made by H.C. was not significantly different from that of controls, the distribution of her viewing was primarily directed to the eye region. Critically, H.C. was significantly impaired in her ability to subsequently recognize faces studied from variable viewpoints, but demonstrated spared performance in recognizing faces she encoded from a fixed viewpoint, implicating a relationship between eye movement behavior in the service of a hippocampal binding function. These findings suggest that a compromised hippocampal system disrupts the ability to bind item features within and across study repetitions, ultimately disrupting recognition when it requires access to flexible relational representations. Copyright © 2015 the authors 0270-6474/15/355342-09$15.00/0.
An introduction to Item Response Theory and Rasch Analysis of the Eating Assessment Tool (EAT-10).

Science.gov (United States)

Kean, Jacob; Brodke, Darrel S; Biber, Joshua; Gross, Paul

2018-03-01

Item response theory has its origins in educational measurement and is now commonly applied in health-related measurement of latent traits, such as function and symptoms. This application is due in large part to gains in the precision of measurement attributable to item response theory and corresponding decreases in response burden, study costs, and study duration. The purpose of this paper is twofold: introduce basic concepts of item response theory and demonstrate this analytic approach in a worked example, a Rasch model (1PL) analysis of the Eating Assessment Tool (EAT-10), a commonly used measure for oropharyngeal dysphagia. The results of the analysis were largely concordant with previous studies of the EAT-10 and illustrate for brain impairment clinicians and researchers how IRT analysis can yield greater precision of measurement.
Collaborative Filtering Based on Sequential Extraction of User-Item Clusters

Science.gov (United States)

Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

Collaborative filtering is a computational realization of “word-of-mouth” in a network community, in which the items preferred by “neighbors” are recommended. This paper proposes a new item-selection model for extracting user-item clusters from rectangular relation matrices, in which mutual relations between users and items are denoted in an alternative process of “liking or not”. A technique for sequential co-cluster extraction from rectangular relational data is given by combining the structural balancing-based user-item clustering method with a sequential fuzzy cluster extraction approach. Then, the technique is applied to the collaborative filtering problem, in which some items may be shared by several user clusters.
Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory

Directory of Open Access Journals (Sweden)

Fajrianthi

2017-11-01

Fajrianthi (1), Rizqy Amelia Zein (2). (1) Department of Industrial and Organizational Psychology, (2) Department of Personality and Social Psychology, Faculty of Psychology, Universitas Airlangga, Surabaya, East Java, Indonesia. Abstract: This study aimed to develop an emotional intelligence (EI) test that is suitable to the Indonesian workplace context. The Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: (1) emotional appraisal, (2) emotional recognition, and (3) emotional regulation. TKEA consisted of 120 items with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that the test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 for subset 2 (ability level = -2), and 2.398 for subset 3 (level of ability = -2). It is concluded that TKEA performs very well to measure individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA’s item analysis and dimensionality test of each TKEA subset. Keywords: categorical confirmatory factor analysis, emotional intelligence, item response theory
Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

Science.gov (United States)

Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

2017-06-01

About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluated dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluated differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.
Efficient Algorithms for Segmentation of Item-Set Time Series

Science.gov (United States)

Chundi, Parvathi; Rosenkrantz, Daniel J.

We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
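The dynamic-programming scheme outlined in this abstract can be sketched as follows. This is an illustrative reading, not the paper's algorithm: the abstract does not define its measure functions or the segment difference, so the union and intersection measures, the symmetric-difference cost, and all function names below are assumptions.

```python
def union_measure(item_sets):
    """Assumed measure function: a segment's item set is the union of its points."""
    return set().union(*item_sets)


def intersection_measure(item_sets):
    """Assumed alternative measure function: items present at every time point."""
    return set.intersection(*(set(s) for s in item_sets))


def segment_difference(item_sets, measure):
    """Assumed segment difference: total size of the symmetric differences
    between the segment's item set and each time point's item set."""
    segment_items = measure(item_sets)
    return sum(len(segment_items ^ set(s)) for s in item_sets)


def optimal_segmentation(series, k, measure=union_measure):
    """Split `series` (a list of item sets) into k contiguous segments that
    minimize the summed segment difference. Returns (cost, [(start, end), ...])
    with half-open index ranges."""
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(series)")
    # cost[i][j] is the segment difference of series[i:j].
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i][j] = segment_difference(series[i:j], measure)
    inf = float("inf")
    # best[s][j]: minimal cost of covering the first j points with s segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                candidate = best[s - 1][i] + cost[i][j]
                if candidate < best[s][j]:
                    best[s][j] = candidate
                    back[s][j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds


if __name__ == "__main__":
    series = [{"a"}, {"a"}, {"b"}, {"b"}]
    print(optimal_segmentation(series, 2))
```

Precomputing every segment cost naively keeps the sketch short; the abstract's efficient algorithms for computing segment differences would replace that step.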
Measuring Health-Related Quality of Life in Strabismus: A Modification of the Adult Strabismus-20 (AS-20) Questionnaire Using Rasch Analysis.

Directory of Open Access Journals (Sweden)

Vijaya K Gothwal

To evaluate the psychometric properties of the Adult Strabismus-20 (AS-20), a health-related quality of life (HRQoL) questionnaire in adults with strabismus, and if flawed, to revise the AS-20 and its subscales creating valid measurement scales. 584 adults (mean age, 27.5 years) with strabismus were recruited from an outpatient clinic at a South Indian tertiary eye care centre and were administered the AS-20 questionnaire. The AS-20 was translated and back translated into two Indian languages. The AS-20 and its two 10-item subscales, 'psychosocial' and 'function', were assessed separately for fit to the Rasch model, including an assessment of the rating scale, unidimensionality (by principal components analysis), measurement precision by person separation reliability (PSR), targeting, and differential item functioning (DIF; notable > 1.0 logits). Response categories were not used as intended, thereby requiring re-organization and reducing their number from 5 to 3. The AS-20 had adequate measurement precision (PSR = 0.87) but lacked unidimensionality; however, deletion of the six multi-dimensionality causing items and an additional three misfitting items resulted in an 11-item unidimensional questionnaire (AS-11). Two items failed to satisfy the model expectations in the 'psychosocial' subscale and were deleted, resulting in an 8-item unidimensional scale with adequate PSR (0.81) and targeting (0.23 logits). One item misfit in the 'function' subscale and was deleted, resulting in a 9-item Rasch-revised unidimensional subscale with acceptable PSR (0.80) and targeting (0.97 logits). None of the items displayed notable DIF by age, gender and level of education. The AS-11 and its two Rasch-revised subscales, the 8-item psychosocial and 9-item function subscales, may be more appropriate than the original AS-20 and its two 10-item subscales for use as unidimensional measures of HRQoL in adults with strabismus in India. Further work is required to establish the validity of the
Gender Invariance of the Gambling Behavior Scale for Adolescents (GBS-A): An Analysis of Differential Item Functioning Using Item Response Theory.

Science.gov (United States)

Donati, Maria Anna; Chiesi, Francesca; Izzo, Viola A; Primi, Caterina

2017-01-01

As there is a lack of evidence attesting the equivalent item functioning across genders for the most employed instruments used to measure pathological gambling in adolescence, the present study aimed to test the gender invariance of the Gambling Behavior Scale for Adolescents (GBS-A), a new measurement tool to assess the severity of Gambling Disorder (GD) in adolescents. The equivalence of the items across genders was assessed by analyzing Differential Item Functioning within an Item Response Theory framework. The GBS-A was administered to 1,723 adolescents, and the graded response model was employed. The results attested the measurement equivalence of the GBS-A when administered to male and female adolescent gamblers. Overall, findings provided evidence that the GBS-A is an effective measurement tool of the severity of GD in male and female adolescents and that the scale was unbiased and able to reveal true gender differences. As such, the GBS-A can be profitably used in educational interventions and clinical treatments with young people.
Difference in method of administration did not significantly impact item response

DEFF Research Database (Denmark)

Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

2014-01-01

PURPOSE: To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS). METHODS: Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function... assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods. RESULTS: Multigroup... levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.
Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures

DEFF Research Database (Denmark)

Jensen, M P; Widerström-Noga, E; Richards, J S

2010-01-01

To evaluate the psychometric properties of a subset of International Spinal Cord Injury Basic Pain Data Set (ISCIBPDS) items that could be used as self-report measures in surveys, longitudinal studies and clinical trials....
Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.

Science.gov (United States)

Eichenbaum, Alexander E; Marcus, David K; French, Brian F

2017-06-01

This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.
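For readers unfamiliar with the graded response model used here, the following sketch shows how category probabilities for one ordered-response item follow from a discrimination parameter and a set of ordered thresholds. The function name and all parameter values are hypothetical and are not taken from the PPI-R analysis.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: respondent's trait level.
    discrimination: item slope (a-parameter).
    thresholds: strictly increasing boundary locations (b-parameters); an item
        with m thresholds has m + 1 ordered response categories.
    """
    def at_or_above(b):
        # Probability of responding in the category above boundary b, or higher.
        return 1.0 / (1.0 + math.exp(-discrimination * (theta - b)))

    # Cumulative probabilities, padded with 1 (lowest) and 0 (beyond highest).
    cumulative = [1.0] + [at_or_above(b) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[c] - cumulative[c + 1] for c in range(len(thresholds) + 1)]


if __name__ == "__main__":
    # Hypothetical four-category item, for illustration only.
    probs = grm_category_probabilities(theta=0.5, discrimination=1.2,
                                       thresholds=[-1.0, 0.0, 1.5])
    print([round(p, 3) for p in probs])
```

An item with a low discrimination value yields nearly flat category curves, which is one way an item can show the limited ability to differentiate between respondents that the abstract describes.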
Item Banks for Substance Use from the Patient-Reported Outcomes Measurement Information System (PROMIS®): Severity of Use and Positive Appeal of Use

Science.gov (United States)

Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis

2015-01-01

Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
Detecting Differential Item Functioning and Differential Step Functioning due to Differences that "Should" Matter

Directory of Open Access Journals (Sweden)

Tess Miller

2010-07-01

This study illustrates the use of differential item functioning (DIF) and differential step functioning (DSF) analyses to detect differences in item difficulty that are related to experiences of examinees, such as their teachers' instructional practices, that are relevant to the knowledge, skill, or ability the test is intended to measure. This analysis is in contrast to the typical use of DIF or DSF to detect differences related to characteristics of examinees, such as gender, language, or cultural knowledge, that should be irrelevant. Using data from two forms of Ontario's Grade 9 Assessment of Mathematics, analyses were performed comparing groups of students defined by their teachers' instructional practices. All constructed-response items were tested for DIF using the Mantel Chi-Square, standardized Liu-Agresti cumulative common log-odds ratio, and standardized Cox's noncentrality parameter. Items exhibiting moderate to large DIF were subsequently tested for DSF. In contrast to typical DIF or DSF analyses, which inform item development, these analyses have the potential to inform instructional practice.
Developing a measure of medication-related quality of life for people with polypharmacy.

Science.gov (United States)

Tseng, Hsu-Min; Lee, Chia-Hui; Chen, Yin-Jen; Hsu, Hsiang-Hao; Huang, Li-Yueh; Huang, Jing-Long

2016-05-01

To develop a measure of medication-related quality of life (MRQoL) and to validate the measure in a hospital-based population of patients with polypharmacy. The Medication-Related Quality of Life Scale version 1.0 (MRQoLS-v1.0) included 14 items developed on the basis of interviews with elderly patients with polypharmacy, defined as taking five or more medications simultaneously. This scale was tested in 219 outpatients (99 with polypharmacy and 120 without polypharmacy). Two measures were used to establish construct validity: the Psychological Distress Checklist, for convergent validity, and the Medication Adherence Behavior Scale (MABS), for discriminant validity. The 14-item scale was found to be both reliable and valid. Internal consistency reliability evaluated using Cronbach's alpha for this scale was 0.91. Scores on the MRQoLS-v1.0 correlated statistically significantly and negatively with those on the Psychological Distress Checklist. Discriminant validity was demonstrated by low correlation with MABS, indicating that the MRQoLS-v1.0 measured concepts different from medication adherence. Significant differences in the MRQoLS-v1.0 between patients with polypharmacy and those without polypharmacy provided evidence for known-group validity. The study presents a psychometric evaluation of a measure used to assess MRQoL of patients with polypharmacy. The instrument is practical to administer in clinics and provides a valuable adjunct to the outcome measurement for patients with polypharmacy. Further research on the sensitivity of this instrument to medication change in multi-medicated patients is warranted.
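The internal consistency statistic reported in this abstract, Cronbach's alpha, can be computed from a respondents-by-items score table as sketched below. The function name and the tiny data set are hypothetical and serve only to show the arithmetic; they are unrelated to the MRQoLS-v1.0 data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of item scores.

    scores: list of respondents, each a list with one score per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample variances throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent needs a score for every item")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical responses of three people to two items.
    print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```

Alpha rises toward 1 as the items covary more strongly relative to their individual variances, so a value of 0.91 for a 14-item scale indicates that the items move together closely.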
A review of the effects on IRT item parameter estimates with a focus on misbehaving common items in test equating

Directory of Open Access Journals (Sweden)

Michalis P Michaelides

2010-10-01

Many studies have investigated the topic of change or drift in item parameter estimates in the context of Item Response Theory. Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.
A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

Science.gov (United States)

Michaelides, Michalis P

2010-01-01

Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

Development and psychometric characteristics of the SCI-QOL Ability to Participate and Satisfaction with Social Roles and Activities item banks and short forms.

Science.gov (United States)

Heinemann, Allen W; Kisala, Pamela A; Hahn, Elizabeth A; Tulsky, David S

2015-05-01

To develop a spinal cord injury (SCI)-focused version of PROMIS and Neuro-QOL social domain item banks; evaluate the psychometric properties of items developed for adults with SCI; and report information to facilitate clinical and research use. We used a mixed-methods design to develop and evaluate Ability to Participate in Social Roles and Activities and Satisfaction with Social Roles and Activities items. Focus groups helped define the constructs; cognitive interviews helped revise items; and confirmatory factor analysis and item response theory methods helped calibrate item banks and evaluate differential item functioning related to demographic and injury characteristics. Five SCI Model System sites and one Veterans Administration medical center. The calibration sample consisted of 641 individuals; a reliability sample consisted of 245 individuals residing in the community. A subset of 27 Ability to Participate and 35 Satisfaction items demonstrated good measurement properties and negligible differential item functioning related to demographic and injury characteristics. The SCI-specific measures correlate strongly with the PROMIS and Neuro-QOL versions. Ten item short forms correlate >0.96 with the full banks. Variable-length CATs with a minimum of 4 items, variable-length CATs with a minimum of 8 items, fixed-length CATs of 10 items, and the 10-item short forms demonstrate construct coverage and measurement error that is comparable to the full item bank. The Ability to Participate and Satisfaction with Social Roles and Activities CATs and short forms demonstrate excellent psychometric properties and are suitable for clinical and research applications.
An empirical comparison of Item Response Theory and Classical Test Theory

Directory of Open Access Journals (Sweden)

Špela Progar

2008-11-01

Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as much invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regard to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.
Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

Science.gov (United States)

LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

2015-04-01

Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
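As a rough illustration of the Mantel-Haenszel approach mentioned above, the sketch below computes the common odds ratio across matched score strata and converts it to the ETS delta scale often used to grade DIF severity. It is a simplification under stated assumptions: it treats items as dichotomous (endorsed or not), whereas the SHAI uses graded responses, and the function names and counts are hypothetical rather than taken from the study.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across strata of matched total scores.

    strata: list of (a, b, c, d) tuples per score level, where
      a = reference group endorsing, b = reference group not endorsing,
      c = focal group endorsing,     d = focal group not endorsing.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def ets_delta(alpha_mh):
    """Rescale the MH odds ratio to the ETS delta metric."""
    return -2.35 * math.log(alpha_mh)


# Hypothetical counts for two score strata with no group difference.
no_dif = [(10, 10, 10, 10), (20, 5, 20, 5)]
odds_ratio = mantel_haenszel_odds_ratio(no_dif)
```

An odds ratio near 1 (delta near 0) suggests that, at the same trait level, both groups endorse the item at similar rates; larger departures point to potential DIF.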
Development of the Oxford Participation and Activities Questionnaire: constructing an item pool

Directory of Open Access Journals (Sweden)

Kelly L

2015-05-01

Laura Kelly, Crispin Jenkinson, Sarah Dummett, Jill Dawson, Ray Fitzpatrick, David Morley. Health Services Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK. Purpose: The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Methods: Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson's disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. Results: ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Conclusion: Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and
Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

Science.gov (United States)

Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

2013-07-01

Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.
Validation of a mobility item bank for older patients in primary care.

Science.gov (United States)

Cabrero-García, Julio; Ramos-Pichardo, Juan Diego; Muñoz-Mendoza, Carmen Luz; Cabañero-Martínez, María José; González-Llopis, Lorena; Reig-Ferrer, Abilio

2012-12-05

To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.
The Protective Behavioral Strategies for Marijuana Scale: Further examination using item response theory.

Science.gov (United States)

Pedersen, Eric R; Huang, Wenjing; Dvorak, Robert D; Prince, Mark A; Hummer, Justin F

2017-08-01

Given recent state legislation legalizing marijuana for recreational purposes and majority popular opinion favoring these laws, we developed the Protective Behavioral Strategies for Marijuana scale (PBSM) to identify strategies that may mitigate the harms related to marijuana use among those young people who choose to use the drug. In the current study, we expand on the initial exploratory study of the PBSM to further validate the measure with a large and geographically diverse sample (N = 2,117; 60% women, 30% non-White) of college students from 11 different universities across the United States. We sought to develop a psychometrically sound item bank for the PBSM and to create a short assessment form that minimizes respondent burden and time. Quantitative item analyses, including exploratory and confirmatory factor analyses with item response theory (IRT) and evaluation of differential item functioning (DIF), revealed an item bank of 36 items that was examined for unidimensionality and good content coverage, as well as a short form of 17 items that is free of bias in terms of gender (men vs. women), race (White vs. non-White), ethnicity (Hispanic vs. non-Hispanic), and recreational marijuana use legal status (state recreational marijuana was legal for 25.5% of participants). We also provide a scoring table for easy transformation from sum scores to IRT scale scores. The PBSM item bank and short form associated strongly and negatively with past month marijuana use and consequences. The measure may be useful to researchers and clinicians conducting intervention and prevention programs with young adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Recovery from eating disorders: psychometric properties of a patient-related measure

Directory of Open Access Journals (Sweden)

Rosenvinge JH

2012-11-01

Gunn Pettersen,1 Kari-Brith Thune-Larsen,2 Jan H Rosenvinge3. 1Department of Health and Care Sciences, Faculty of Health Sciences, University of Tromsø, Tromsø, Norway; 2Oslo University Hospital, Oslo, Norway; 3Department of Psychology, Faculty of Health Sciences, University of Tromsø, Tromsø, Norway. Abstract: Although there are numerous lists of items covering clinically valid aspects of recovery from eating disorders, these lists are on the nominal level: the potential for multidimensional development has not been explored. Such exploration is the purpose of the present study. The subjects included in the study were 152 female clinicians, 1052 females randomly selected from the general population, and 184 eating-disorder patients. All subjects rated 17 recovery items on a 10-point scale in terms of their relevance and importance. They also completed measures of knowledge about eating disorders and their own eating problems, in addition to providing information about their age and personal acquaintance with eating disorders. Fourteen recovery-item scores were sample unspecific, and hence all samples tended to judge the majority of items in a similar manner. The 17 items successfully formed three separate factors covering specific eating-disorder symptoms, as well as social and psychological issues. The clinician and general population sample analyzed together provided a more condensed scale comprising two factors (specific eating-disorder symptoms and psychosocial factors), with each factor having three items. This factor structure was successfully replicated using the patient-validation sample. The findings indicate an empirical basis for a valid recovery measure that may be suitable in future outcome research. Keywords: eating disorders, recovery, outcome, outcome measures
A Method for Individualizing the Prediction of Immunogenicity of Protein Vaccines and Biologic Therapeutics: Individualized T Cell Epitope Measure (iTEM)

Directory of Open Access Journals (Sweden)

Tobias Cohen

2010-01-01

The promise of pharmacogenomics depends on advancing predictive medicine. To address this need in the area of immunology, we developed the individualized T cell epitope measure (iTEM) tool to estimate an individual's T cell response to a protein antigen based on HLA binding predictions. In this study, we validated prospective iTEM predictions using data from in vitro and in vivo studies. We used a mathematical formula that converts DRB1∗ allele binding predictions generated by EpiMatrix, an epitope-mapping tool, into an allele-specific scoring system. We then demonstrated that iTEM can be used to define an HLA binding threshold above which immune response is likely and below which immune response is likely to be absent. iTEM's predictive power was strongest when the immune response is focused, such as in subunit vaccination and administration of protein therapeutics. iTEM may be a useful tool for clinical trial design and preclinical evaluation of vaccines and protein therapeutics.
78 FR 29392 - Embedded Digital Devices in Safety-Related Systems, Systems Important to Safety, and Items Relied...

Science.gov (United States)

2013-05-20

... NUCLEAR REGULATORY COMMISSION [NRC-2013-0098] Embedded Digital Devices in Safety-Related Systems, Systems Important to Safety, and Items Relied on for Safety AGENCY: Nuclear Regulatory Commission. ACTION... (NRC) is issuing for public comment Draft Regulatory Issue Summary (RIS) 2013-XX, ``Embedded Digital...
The Long-Term Conditions Questionnaire: conceptual framework and item development.

Science.gov (United States)

Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A'Court, Christine; Fitzpatrick, Ray

2016-01-01

To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey.
Examination of the PROMIS upper extremity item bank.

Science.gov (United States)

Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Calibration of Automatically Generated Items Using Bayesian Hierarchical Modeling.

Science.gov (United States)

Johnson, Matthew S.; Sinharay, Sandip

For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…
[Development and reliability evaluation of an instrument to measure health-related quality of life in independent elderly].

Science.gov (United States)

Lima, Maria José Barbosa de; Portela, Margareth Crisóstomo

2010-08-01

This study presents an instrument, the health-related quality of life (HRQOL) profile for independent elderly, to measure the health-related quality of life of the functionally independent elderly assisted in the outpatient setting, based on the adaptation of four validated scales: Short-Form Health Survey (SF-36), Duke-UNC Health Profile (DUHP), Sickness Impact Profile (SIP), and Nottingham Health Profile (NHP). The study also evaluates the instrument's reliability based on its use by two different observers with a 15-day interval. The instrument includes five dimensions (health perception, symptoms, physical function, psychological function, and social function) and 45 items. Reliability evaluation of the QUASI instrument was based on interviews with 142 elderly outpatients in the city of Rio de Janeiro, Brazil. Prevalence-adjusted kappa statistic was used to assess all 45 items. Correlation was also calculated between overall scores and scores on individual dimensions. In the reliability evaluation, 39 of the 45 items showed prevalence-adjusted kappa greater than 0.60.
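The abstract above does not give the exact formula used for its prevalence-adjusted kappa. A minimal sketch, assuming the common prevalence-adjusted bias-adjusted kappa (PABAK) form, is shown below; the function name, the category count, and the example ratings are hypothetical.

```python
def pabak(ratings_a, ratings_b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa for two raters.

    Uses observed agreement p_o only: (k * p_o - 1) / (k - 1),
    which reduces to 2 * p_o - 1 for two response categories.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    agreements = sum(1 for x, y in zip(ratings_a, ratings_b) if x == y)
    p_o = agreements / len(ratings_a)
    return (n_categories * p_o - 1) / (n_categories - 1)


# Hypothetical yes/no answers to one item from two observers.
observer_1 = [1, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0]
item_pabak = pabak(observer_1, observer_2)
```

Under this form, the 0.60 cut-off mentioned in the abstract would correspond to 80% raw agreement on a two-category item.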
Development of the Quantitative Reasoning Items on the National Survey of Student Engagement

Directory of Open Access Journals (Sweden)

Amber D. Dumford

2015-01-01

As society’s needs for quantitative skills become more prevalent, college graduates require quantitative skills regardless of their career choices. Therefore, it is important that institutions assess students’ engagement in quantitative activities during college. This study chronicles the process taken by the National Survey of Student Engagement (NSSE) to develop items that measure students’ participation in quantitative reasoning (QR) activities. On the whole, findings across the quantitative and qualitative analyses suggest good overall properties for the developed QR items. The items show great promise to explore and evaluate the frequency with which college students participate in QR-related activities. Each year, hundreds of institutions across the United States and Canada participate in NSSE, and, with the addition of these new items on the core survey, every participating institution will have information on this topic. Our hope is that these items will spur conversations on campuses about students’ use of quantitative reasoning activities.
Framing of mobility items: a source of poor agreement between preference-based health-related quality of life instruments in a population of individuals receiving assisted ventilation.

Science.gov (United States)

Hannan, Liam M; Whitehurst, David G T; Bryan, Stirling; Road, Jeremy D; McDonald, Christine F; Berlowitz, David J; Howard, Mark E

2017-06-01

To explore the influence of descriptive differences in items evaluating mobility on index scores generated from two generic preference-based health-related quality of life (HRQoL) instruments. The study examined cross-sectional data from a postal survey of individuals receiving assisted ventilation in two state/province-wide home mechanical ventilation services, one in British Columbia, Canada and the other in Victoria, Australia. The Assessment of Quality of Life 8-dimension (AQoL-8D) and the EQ-5D-5L were included in the data collection. Graphical illustrations, descriptive statistics, and measures of agreement [intraclass correlation coefficients (ICCs) and Bland-Altman plots] were examined using index scores derived from both instruments. Analyses were performed on the full sample as well as subgroups defined according to respondents' self-reported ability to walk. Of 868 individuals receiving assisted ventilation, 481 (55.4%) completed the questionnaire. Mean index scores were 0.581 (AQoL-8D) and 0.566 (EQ-5D-5L) with 'moderate' agreement demonstrated between the two instruments (ICC = 0.642). One hundred fifty-nine (33.1%) reported level 5 ('I am unable to walk about') on the EQ-5D-5L Mobility item. The walking status of respondents had a marked influence on the comparability of index scores, with a larger mean difference (0.206) and 'slight' agreement (ICC = 0.386) observed when the non-ambulant subgroup was evaluated separately. This study provides further evidence that between-measure discrepancies between preference-based HRQoL instruments are related in part to the framing of mobility-related items. Longitudinal studies are necessary to determine the responsiveness of preference-based HRQoL instruments in cohorts that include non-ambulant individuals.
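The mean difference between instruments reported above is the quantity a Bland-Altman analysis centres on. A minimal sketch of the bias and 95% limits of agreement follows; the function name and index scores are hypothetical, and the 1.96 multiplier assumes roughly normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman_limits(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired index scores.

    bias is the mean of the paired differences; the limits are
    bias -/+ 1.96 standard deviations of those differences.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical index scores for three respondents on two instruments.
instrument_a = [0.5, 0.6, 0.7]
instrument_b = [0.4, 0.6, 0.5]
bias, lower, upper = bland_altman_limits(instrument_a, instrument_b)
```

Wide limits relative to the 0 to 1 utility range would indicate that the two instruments cannot be used interchangeably for individual respondents, even when mean scores are similar.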
Item level diagnostics and model - data fit in item response theory ...

African Journals Online (AJOL)

Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...
Measurement and control of bias in patient reported outcomes using multidimensional item response theory.

Science.gov (United States)

Dowling, N Maritza; Bolt, Daniel M; Deng, Sien; Li, Chenxi

2016-05-26

Patient-reported outcome (PRO) measures play a key role in the advancement of patient-centered care research. The accuracy of inferences, relevance of predictions, and the true nature of the associations made with PRO data depend on the validity of these measures. Errors inherent to self-report measures can seriously bias the estimation of constructs assessed by the scale. A well-documented disadvantage of self-report measures is their sensitivity to response style (RS) effects such as the respondent's tendency to select the extremes of a rating scale. Although the biasing effect of extreme responding on constructs measured by self-reported tools has been widely acknowledged and studied across disciplines, little attention has been given to the development and systematic application of methodologies to assess and control for this effect in PRO measures. We review the methodological approaches that have been proposed to study extreme RS effects (ERS). We applied a multidimensional item response theory model to simultaneously estimate and correct for the impact of ERS on trait estimation in a PRO instrument. Model estimates were used to study the biasing effects of ERS on sum scores for individuals with the same amount of the targeted trait but different levels of ERS. We evaluated the effect of joint estimation of multiple scales and ERS on trait estimates and demonstrated the biasing effects of ERS on these trait estimates when used as explanatory variables. A four-dimensional model accounting for ERS bias provided a better fit to the response data. Increasing levels of ERS showed bias in total scores as a function of trait estimates. The effect of ERS was greater when the pattern of extreme responding was the same across multiple scales modeled jointly. The estimated item category intercepts provided evidence of content independent category selection. Uncorrected trait estimates used as explanatory variables in prediction models showed downward bias. A
ITEM Project: Risk Communication on Exposure to Electromagnetic Radiation from Mobile Communications

International Nuclear Information System (INIS)

Oliveira, Carla; Carpinteiro, Goncalo; Correia, Luis M.; Fernandes, Carlos A.; Serralha, Afonso; Marques, Nuno

2004-01-01

The ITEM Project is a pioneer project in Portugal, providing public information on exposure to electromagnetic radiation, essentially due to mobile communication systems. The motivation, the main goals and the Project description are presented in this paper, as well as the website that provides the public dissemination of results and further significant information (www.lx.it.pt/item). This site provides information on different issues related to exposure to radiation, namely results of measurement campaigns conducted by a team on several locations in Portugal, and results of continuous measurements performed by autonomous stations located in public places in collaboration with municipal authorities. The global overview of the results from the measurement campaigns carried out up to present shows that all the analysed locations are in compliance with the radiation thresholds, i.e., all the electric field measured values are below the most restrictive threshold established at European level. (author)
Validation of the alcohol use item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS).

Science.gov (United States)

Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Daley, Dennis C

2016-04-01

The Patient-Reported Outcomes Measurement Information System (PROMIS) includes five item banks for alcohol use. There are limited data, however, regarding their validity (e.g., convergent validity, responsiveness to change). To provide such data, we conducted a prospective study with 225 outpatients being treated for substance abuse. Assessments were completed shortly after intake and at 1-month and 3-month follow-ups. The alcohol item banks were administered as computerized adaptive tests (CATs). Fourteen CATs and one six-item short form were also administered from eight other PROMIS domains to generate a comprehensive health status profile. After modeling treatment outcome for the sample as a whole, correlates of outcome from the PROMIS health status profile were examined. For convergent validity, the largest correlation emerged between the PROMIS alcohol use score and the Alcohol Use Disorders Identification Test (r=.79 at intake). Regarding treatment outcome, there were modest changes across the target problem of alcohol use and other domains of the PROMIS health status profile. However, significant heterogeneity was found in initial severity of drinking and in rates of change for both abstinence and severity of drinking during follow-up. This heterogeneity was associated with demographic (e.g., gender) and health-profile (e.g., emotional support, social participation) variables. The results demonstrated the validity of PROMIS CATs, which require only 4-6 items in each domain. This efficiency makes it feasible to use a comprehensive health status profile within the substance use treatment setting, providing important prognostic information regarding abstinence and severity of drinking. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

Method of data mining including determining multidimensional coordinates of each item using a predetermined scalar similarity value for each item pair

Science.gov (United States)

Meyers, Charles E.; Davidson, George S.; Johnson, David K.; Hendrickson, Bruce A.; Wylie, Brian N.

1999-01-01

A method of data mining represents related items in a multidimensional space. Distance between items in the multidimensional space corresponds to the extent of relationship between the items. The user can select portions of the space to perceive. The user also can interact with and control the communication of the space, focusing attention on aspects of the space of most interest. The multidimensional spatial representation allows more ready comprehension of the structure of the relationships among the items.
Development of an instrument to measure patient perception of the quality of nursing care and related hospital services at the National Hospital of Sri Lanka.

Science.gov (United States)

Senarat, Upul; Gunawardena, Nalika S

2011-06-01

This study aimed to develop and validate an instrument to measure patient perception of quality of nursing care and related hospital services in a tertiary care setting. We compiled an instrument with 72 items that patients may perceive as quality of nursing care and related hospital services, following an extensive literature search, discussions with patients and care providers and a brainstorming session with an expert panel. A cross-sectional study was conducted at the National Hospital of Sri Lanka. A sample (n = 120) of patients who stayed in general surgical or medical units responded to the interviewer-administered instrument upon discharge. Item analysis and principal component factor analysis were performed to assess validity, and internal consistency was calculated to measure reliability. Of the 72 items, 18 had greater than 20% of responses as 'not relevant'. A further 11 items were eliminated since item-total correlations were less than .2. Factor analysis was performed on the remaining 43 items, which resulted in 36 items classifying into eight factors accounting for 71% of the variation. Factor loadings in the final solution after Varimax rotation were interpersonal aspects (.68-.85), efficiency (.62-.79), competency (.66-.68), comfort (.60-.84), physical environment (.65-.82), cleanliness (.81-.85), personalized information (.76-.83), and general instructions (.61-.78). The instrument had high internal consistency (Cronbach's alpha = .91). We developed a comprehensive, reliable and valid, 36-item instrument that may be used to measure patient perception of quality of nursing care in tertiary care settings. Copyright © 2011 Korean Society of Nursing Science. Published by Elsevier B.V. All rights reserved.
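The internal-consistency figure reported above is Cronbach's alpha. As a rough illustration of how that coefficient is computed from an item-by-respondent score table, here is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy ratings are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Made-up ratings from four respondents on three items (not study data).
toy = [[1, 2, 1], [2, 3, 2], [3, 4, 3], [4, 5, 4]]
alpha = cronbach_alpha(toy)
```

In the toy table the three items move in lockstep, so alpha reaches its upper bound; real questionnaire data would give a lower value.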
Validity and usefulness of a single-item measure of patient-reported bother from side effects of cancer therapy.

Science.gov (United States)

Pearman, Timothy P; Beaumont, Jennifer L; Mroczek, Daniel; O'Connor, Mary; Cella, David

2018-03-01

The improving efficacy of cancer treatment has resulted in an increasing array of treatment-related symptoms and associated burdens imposed on individuals undergoing aggressive treatment of their disease. Often, clinical trials compare therapies that have different types, and severities, of adverse effects. Whether rated by clinicians or patients themselves, it can be difficult to know which side effect profile is more disruptive or bothersome to patients. A simple summary index of bother can help to adjudicate the variability in adverse effects across treatments being compared with each other. Across 4 studies, a total of 5765 patients enrolled in cooperative group studies and industry-sponsored clinical trials were the subjects of the current study. Patients were diagnosed with a range of primary cancer sites, including bladder, brain, breast, colon/rectum, head/neck, hepatobiliary, kidney, lung, ovary, pancreas, and prostate as well as leukemia and lymphoma. All patients were administered the Functional Assessment of Cancer Therapy-General version (FACT-G). The single item "I am bothered by side effects of treatment" (GP5), rated on a 5-point Likert scale, is part of the FACT-G. To determine its validity as a useful summary measure from the patient perspective, it was correlated with individual and aggregated clinician-rated adverse events and patient reports of their general ability to enjoy life. Analyses of pharmaceutical trials demonstrated that mean GP5 scores ("I am bothered by side effects of treatment") significantly differed by maximum adverse event grade. Effect sizes ranged from 0.13 to 0.46. Analyses of cooperative group trials demonstrated a significant correlation between GP5 and item GF3 ("I am able to enjoy life") in the predicted direction. The single FACT-G item "I am bothered by side effects of treatment" is significantly associated with clinician-reported adverse events and with patients' ability to enjoy their lives. It has promise as an
Item Response Theory analysis of Fagerström Test for Cigarette Dependence.

Science.gov (United States)

Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl

2018-02-01

The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI via Item Response Theory. The study was a secondary analysis of data collected from 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.
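For readers unfamiliar with the graded response model named in this record, the sketch below shows how Samejima's model turns one discrimination parameter and a set of ordered thresholds into response-category probabilities. The parameter values are invented for illustration and are not FTCD or HSI estimates; the function name is hypothetical.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Samejima graded response model: probabilities of categories 0..m.

    theta: latent trait level; a: discrimination;
    thresholds: increasing boundary locations b_1 < ... < b_m.
    """
    # Cumulative probabilities P(X >= k), padded with P(X >= 0) = 1 and P(X >= m+1) = 0.
    cum = [1.0] + [1 / (1 + math.exp(-a * (theta - b))) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Illustrative four-category item with assumed parameters.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

Because the assumed thresholds are symmetric around the chosen trait level, the lowest and highest categories come out equally likely here.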
Quality assurance of sterilized products: verification of a model relating frequency of contaminated items and increasing radiation dose

International Nuclear Information System (INIS)

Khan, A.A.; Tallentire, A.; Dwyer, J.

1977-01-01

Values of the γ-radiation resistance parameters (k and n of the 'multi-hit' expression) for Bacillus pumilus E601 spores and Serratia marcescens cells have been determined and the constancy of values for a given test condition demonstrated. These organisms, differing by a factor of about 50 in k value, have been included in a laboratory test system for use in verification of a model describing the dependence of the proportion of contaminated items in a population of items on radiation dose. The proportions of contaminated units of the test system at various γ-radiation doses have been measured for different initial numbers and types of organisms present in units either singly or together. Using the model, the probabilities of contaminated units for corresponding sets of conditions have been evaluated together with associated variances. Measured proportions and predicted probabilities agree well, showing that the model holds in a laboratory contrived situation. (author)
Polytomous latent scales for the investigation of the ordering of items

NARCIS (Netherlands)

Ligtvoet, R.; van der Ark, L.A.; Bergsma, W. P.; Sijtsma, K.

2011-01-01

We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering
10 CFR 74.55 - Item monitoring.

Science.gov (United States)

2010-01-01

... Quantities of Strategic Special Nuclear Material § 74.55 Item monitoring. (a) Licensees subject to § 74.51... quantitatively measured, the validity of that measurement independently confirmed, and that additionally have..., except for reactor components measuring at least one meter in length and weighing in excess of 30...
Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

Science.gov (United States)

Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

2014-01-01

Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
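The adaptive selection step described in this record can be sketched as follows: given a provisional trait estimate, pick the not-yet-administered item that carries the most Fisher information at that estimate. This is a simplified illustration that assumes a dichotomous two-parameter logistic model and a three-item toy bank with made-up parameters; it is not the PROMIS algorithm, whose item banks and scoring rules are not given in the abstract.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1 / (1 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1 - p)

def next_item(theta, bank, administered):
    """Return the name of the unused item that is most informative at theta."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))

# Hypothetical bank: name -> (discrimination a, difficulty b).
bank = {"easy": (1.2, -1.5), "medium": (1.2, 0.0), "hard": (1.2, 1.5)}
first = next_item(0.0, bank, administered=set())
second = next_item(1.4, bank, administered={"medium"})
```

With equal discriminations, the most informative item is simply the one whose difficulty lies closest to the current estimate, which is why a CAT can reach a precise score with only a handful of items.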
Clusters of cultures: diversity in meaning of family value and gender role items across Europe.

Science.gov (United States)

van Vlimmeren, Eva; Moors, Guy B D; Gelissen, John P T M

2017-01-01

Survey data are often used to map cultural diversity by aggregating scores of attitude and value items across countries. However, this procedure only makes sense if the same concept is measured in all countries. In this study we argue that when (co)variances among sets of items are similar across countries, these countries share a common way of assigning meaning to the items. Clusters of cultures can then be observed by doing a cluster analysis on the (co)variance matrices of sets of related items. This study focuses on family values and gender role attitudes. We find four clusters of cultures that assign a distinct meaning to these items, especially in the case of gender roles. Some of these differences reflect response style behavior in the form of acquiescence. Adjusting for this style effect impacts on country comparisons hence demonstrating the usefulness of investigating the patterns of meaning given to sets of items prior to aggregating scores into cultural characteristics.
Assessment of Differential Item Functioning in the Experiences of Discrimination Index

Science.gov (United States)

Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

2011-01-01

The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104
Developing and testing items for the South African Personality Inventory (SAPI)

Directory of Open Access Journals (Sweden)

Carin Hill

2013-11-01

Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for the study: The study intended to develop an indigenous and psychometrically sound personality instrument that adheres to the requirements of South African legislation and excludes cultural bias. Research design, approach and method: The authors used a cross-sectional design. They measured the nine SAPI clusters identified in the qualitative stage of the SAPI project in 11 separate quantitative studies. Convenience sampling yielded 6735 participants. Statistical analysis focused on the construct validity and reliability of items. The authors eliminated items that showed poor performance, based on common psychometric criteria, and selected the best performing items to form part of the final version of the SAPI. Main findings: The authors developed 2573 items from the nine SAPI clusters. Of these, 2268 items were valid and reliable representations of the SAPI facets. Practical/managerial implications: The authors developed a large item pool. It measures personality in South Africa. Researchers can refine it for the SAPI. Furthermore, the project illustrates an approach that researchers can use in projects that aim to develop culturally-informed psychological measures. Contribution/value-add: Personality assessment is important for recruiting, selecting and developing employees. This study contributes to the current knowledge about the early processes researchers follow when they develop a personality instrument that measures personality fairly in different cultural groups, as the SAPI does.
Mathematical-programming approaches to test item pool design

NARCIS (Netherlands)

Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

2002-01-01

This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming
Item-cued directed forgetting of related words and pictures in children and adults: selective rehearsal versus cognitive inhibition.

Science.gov (United States)

Lehman, E B; McKinley-Pace, M; Leonard, A M; Thompson, D; Johns, K

2001-01-01

The main purpose of this study was to compare the relative importance of selective rehearsal and cognitive inhibition in accounting for developmental changes in the directed-forgetting paradigm developed by R. A. Bjork (1972). In two experiments, children in Grades 2 and 5 and college students were asked to remember some words or pictures and to forget others when items were categorically related. Their memory for both items and the associated remember or forget cues was then tested with recall and recognition. Fifth graders recognized more of the forget-cued words than college students did. The pattern of results suggested that age differences in rehearsal and source monitoring (i.e., remembering whether a word had been cued remember or forget) were better explanatory mechanisms for children's forgetting inefficiencies than retrieval inhibition was. The results are discussed in terms of a multiple process view of inhibition.
The Body Appreciation Scale-2: item refinement and psychometric evaluation.

Science.gov (United States)

Tylka, Tracy L; Wood-Barcalow, Nichole L

2015-01-01

Considered a positive body image measure, the 13-item Body Appreciation Scale (BAS; Avalos, Tylka, & Wood-Barcalow, 2005) assesses individuals' acceptance of, favorable opinions toward, and respect for their bodies. While the BAS has accrued psychometric support, we improved it by rewording certain BAS items (to eliminate sex-specific versions and body dissatisfaction-based language) and developing additional items based on positive body image research. In three studies, we examined the reworded, newly developed, and retained items to determine their psychometric properties among college and online community (Amazon Mechanical Turk) samples of 820 women and 767 men. After exploratory factor analysis, we retained 10 items (five original BAS items). Confirmatory factor analysis upheld the BAS-2's unidimensionality and invariance across sex and sample type. Its internal consistency, test-retest reliability, and construct (convergent, incremental, and discriminant) validity were supported. The BAS-2 is a psychometrically sound positive body image measure applicable for research and clinical settings. Copyright © 2014 Elsevier Ltd. All rights reserved.
Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.

Science.gov (United States)

Commons, C., Ed.; Martin, P., Ed.

Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…
Enactment versus observation: item-specific and relational processing in goal-directed action sequences (and lists of single actions).

Directory of Open Access Journals (Sweden)

Janette Schult

What are the memory-related consequences of learning actions (such as "apply the patch") by enactment during study, as compared to action observation? Theories converge in postulating that enactment encoding increases item-specific processing, but not the processing of relational information. Typically, in the laboratory enactment encoding is studied for lists of unrelated single actions in which one action execution has no overarching purpose or relation with other actions. In contrast, real-life actions are usually carried out with the intention to achieve such a purpose. When actions are embedded in action sequences, relational information provides efficient retrieval cues. We contrasted memory for single actions with memory for action sequences in three experiments. We found more reliance on relational processing for action sequences than single actions. To what degree can this relational information be used after enactment versus after the observation of an actor? We found indicators of superior relational processing after observation than enactment in ordered pair recall (Experiment 1A) and in emerging subjective organization of repeated recall protocols (recall runs 2-3, Experiment 2). An indicator of superior item-specific processing after enactment compared to observation was recognition (Experiment 1B, Experiment 2). Similar net recall suggests that observation can be as good a learning strategy as enactment. We discuss possible reasons why these findings only partly converge with previous research and theorizing.
Diagnostics of transparent polymer coatings of metal items

Science.gov (United States)

Varepo, L. G.; Ermakova, I. N.; Nagornova, I. V.; Kondratov, A. P.

2017-08-01

The methods of visual and instrumental express diagnostics of safety-critical defects and non-uniform thickness of transparent mono- and multilayer polyolefin surface coatings of metal items are analyzed in the paper. The instrumental diagnostics method is based on colorimetric measurement of effects that appear in polarized light for extruded polymer coatings. The dependence of color coordinates (in the CIE L*a*b* color system) on both HDPE/PVC coating thickness deviations from the average and delamination of the coating interlayer or adhesion layer is shown. A variation of color characteristics in polarized light when liquid penetrates into delaminated polymer layers is found. Measurement parameters and critical uncertainties are defined.
A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

Science.gov (United States)

Fukuhara, Hirotaka; Kamata, Akihito

2011-01-01

A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…
Methodology for the development and calibration of the SCI-QOL item banks.

Science.gov (United States)

Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

2015-05-01

To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury-Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment Center(SM).
Structural and reliability analysis of a patient satisfaction with cancer-related care measure: a multisite patient navigation research program study.

Science.gov (United States)

Jean-Pierre, Pascal; Fiscella, Kevin; Freund, Karen M; Clark, Jack; Darnell, Julie; Holden, Alan; Post, Douglas; Patierno, Steven R; Winters, Paul C

2011-02-15

Patient satisfaction is an important outcome measure of quality of cancer care and 1 of the 4 core study outcomes of the National Cancer Institute (NCI)-sponsored Patient Navigation Research Program to reduce race/ethnicity-based disparities in cancer care. There is no existing patient satisfaction measure that spans the spectrum of cancer-related care. The objective of this study was to develop a Patient Satisfaction With Cancer Care measure that is relevant to patients receiving diagnostic/therapeutic cancer-related care. The authors developed a conceptual framework, an operational definition of Patient Satisfaction With Cancer Care, and an item pool based on literature review, expert feedback, group discussion, and consensus. The 35-item Patient Satisfaction With Cancer Care measure was administered to 891 participants from the multisite NCI-sponsored Patient Navigation Research Program. Principal components analysis (PCA) was conducted for latent structure analysis. Internal consistency was assessed using Cronbach coefficient alpha (α). Divergent analysis was performed using correlation analyses between the Patient Satisfaction With Cancer Care, the Communication and Attitudinal Self-Efficacy-Cancer, and demographic variables. The PCA revealed a 1-dimensional measure with items forming a coherent set explaining 62% of the variance in patient satisfaction. Reliability assessment revealed high internal consistency (α ranging from 0.95 to 0.96). The Patient Satisfaction With Cancer Care demonstrated good face validity, convergent validity, and divergent validity, as indicated by moderate correlations with subscales of the Communication and Attitudinal Self-Efficacy-Cancer (all P .05). The Patient Satisfaction With Cancer Care is a valid tool for assessing satisfaction with cancer-related care for this sample. Copyright © 2010 American Cancer Society.

Measurement equivalence of the KINDL questionnaire across child self-reports and parent proxy-reports: a comparison between item response theory and ordinal logistic regression.

Science.gov (United States)

Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara

2014-06-01

Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50 %. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
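The 50% agreement rate quoted in this record follows directly from the counts in the abstract: the two methods agree on an item when both flag it or neither does. A short worked check, using only the numbers reported above (the variable names are illustrative):

```python
n_items = 24
grm_flagged = 12     # items flagged with DIF by the graded response model
olr_flagged = 14     # items flagged with DIF by ordinal logistic regression
both_flagged = 7     # items flagged by both methods
neither_flagged = 5  # items flagged by neither method

# Agreement = items on which the two methods reach the same verdict.
agreement = (both_flagged + neither_flagged) / n_items

# Items flagged by exactly one method account for the disagreement.
grm_only = grm_flagged - both_flagged
olr_only = olr_flagged - both_flagged
```

The four cells (both, GRM only, OLR only, neither) add up to the 24 KINDL items, so the reported counts are internally consistent.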
Methods for Assessing Item, Step, and Threshold Invariance in Polytomous Items Following the Partial Credit Model

Science.gov (United States)

Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W.

2008-01-01

Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…
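As background for this record, the sketch below computes category probabilities under the standard partial credit model, in which the probability of score x is proportional to exp of the sum over j up to x of (theta - delta_j). The step difficulties delta_j are the parameters one would presumably compare across groups when studying invariance; the abstract is truncated, so the authors' own procedures are not reproduced here, and the function name and example values are hypothetical.

```python
import math

def pcm_category_probs(theta, step_difficulties):
    """Partial credit model: probabilities of scores 0..m for one item."""
    # Exponent for score x is the running sum of (theta - delta_j) for j <= x;
    # the empty sum for score 0 is defined as 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative three-category item with assumed step difficulties.
example = pcm_category_probs(theta=0.5, step_difficulties=[-0.5, 1.0])
```

With a single step the model reduces to the dichotomous Rasch model, which is a convenient sanity check on any implementation.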
An Investigation of Item Type in a Standards-Based Assessment.

Directory of Open Access Journals (Sweden)

Liz Hollingworth

2007-12-01

Large-scale state assessment programs use both multiple-choice and open-ended items on tests for accountability purposes. Certainly, there is an intuitive belief among some educators and policy makers that open-ended items measure something different than multiple-choice items. This study examined two item formats in custom-built, standards-based tests of achievement in Reading and Mathematics at grades 3-8. In this paper, we raise questions about the value of including open-ended items, given scoring costs, time constraints, and the higher probability of missing data from test-takers.
Does remembering emotional items impair recall of same-emotion items?

Science.gov (United States)

Sison, Jo Ann G; Mather, Mara

2007-04-01

In the part-set cuing effect, cuing a subset of previously studied items impairs recall of the remaining noncued items. This experiment reveals that cuing participants with previously-studied emotional pictures (e.g., fear-evoking pictures of people) can impair recall of pictures involving the same emotion but different content (e.g., fear-evoking pictures of animals). This indicates that new events can be organized in memory using emotion as a grouping function to create associations. However, whether new information is organized in memory along emotional or nonemotional lines appears to be a flexible process that depends on people's current focus. Mentioning in the instructions that the pictures were either amusement- or fear-related led to memory impairment for pictures with the same emotion as cued pictures, whereas mentioning that the pictures depicted either animals or people led to memory impairment for pictures with the same type of actor.
An approach for estimating item sensitivity to within-person change over time: An illustration using the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog).

Science.gov (United States)

Dowling, N Maritza; Bolt, Daniel M; Deng, Sien

2016-12-01

When assessments are primarily used to measure change over time, it is important to evaluate items according to their sensitivity to change, specifically. Items that demonstrate good sensitivity to between-person differences at baseline may not show good sensitivity to change over time, and vice versa. In this study, we applied a longitudinal factor model of change to a widely used cognitive test designed to assess global cognitive status in dementia, and contrasted the relative sensitivity of items to change. Statistically nested models were estimated introducing distinct latent factors related to initial status differences between test-takers and within-person latent change across successive time points of measurement. Models were estimated using all available longitudinal item-level data from the Alzheimer's Disease Assessment Scale-Cognitive subscale, including participants representing the full-spectrum of disease status who were enrolled in the multisite Alzheimer's Disease Neuroimaging Initiative. Five of the 13 Alzheimer's Disease Assessment Scale-Cognitive items demonstrated noticeably higher loadings with respect to sensitivity to change. Attending to performance change on only these 5 items yielded a clearer picture of cognitive decline more consistent with theoretical expectations in comparison to the full 13-item scale. Items that show good psychometric properties in cross-sectional studies are not necessarily the best items at measuring change over time, such as cognitive decline. Applications of the methodological approach described and illustrated in this study can advance our understanding regarding the types of items that best detect fine-grained early pathological changes in cognition. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

Science.gov (United States)

Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

2015-01-01

The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
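As a minimal sketch of the unidimensional three-parameter logistic (3PL) item response model named in this abstract, the function below evaluates the probability of a correct response given ability `theta`, discrimination `a`, difficulty `b`, and lower asymptote (guessing) `c`. The parameter values are hypothetical and not taken from the VETcar study, and the optional scaling constant D = 1.7 used in some formulations is omitted.

```python
import math

def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: subject ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing parameter).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters, chosen only for illustration.
item = {"a": 1.2, "b": 0.5, "c": 0.25}

for theta in (-2.0, 0.5, 2.0):
    p = three_pl_probability(theta, **item)
    print(f"theta={theta:+.1f}  P(correct)={p:.3f}")
```

When ability equals difficulty, the probability sits halfway between the guessing floor `c` and 1, which is one way to read the difficulty parameter in this model.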
Item-Level Psychometrics of the Glasgow Outcome Scale: Extended Structured Interviews.

Science.gov (United States)

Hong, Ickpyo; Li, Chih-Ying; Velozo, Craig A

2016-04-01

The Glasgow Outcome Scale-Extended (GOSE) structured interview captures critical components of activities and participation, including home, shopping, work, leisure, and family/friend relationships. Eighty-nine community-dwelling adults with mild-moderate traumatic brain injury (TBI) were recruited (average = 2.7 years post injury). Nine of the 19 items were used for the psychometric analysis. Factor analysis and item-level psychometrics were investigated using the Rasch partial-credit model. Although the principal components analysis of residuals suggests that a single measurement factor dominates the measure, the instrument did not meet the factor analysis criteria. Five items met the rating scale criteria. Eight items fit the Rasch model. The instrument demonstrated low person reliability (0.63), low person strata (2.07), and a slight ceiling effect. The GOSE demonstrated limitations in precisely measuring activities/participation for individuals after TBI. Future studies should examine the impact of the low precision of the GOSE on effect size. © The Author(s) 2016.
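The reported person reliability and person strata can be related with the conventional Rasch formulas: separation G = sqrt(R / (1 - R)) and strata H = (4G + 1) / 3. The abstract does not say how the strata were computed, so the sketch below is an assumption about the method. Under that assumption, a reliability of 0.63 gives a strata value of about 2.07, which matches the reported figure.

```python
import math

def separation_from_reliability(reliability):
    """Separation index G implied by a reliability coefficient R."""
    return math.sqrt(reliability / (1.0 - reliability))

def strata_from_separation(separation):
    """Number of statistically distinct strata, H = (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0

# Person reliability reported in the abstract above.
g = separation_from_reliability(0.63)
h = strata_from_separation(g)
print(f"separation = {g:.2f}, strata = {h:.2f}")
```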
Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults

Directory of Open Access Journals (Sweden)

García-Campayo Javier

2011-08-01

Background: The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Findings: Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. Conclusions: The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.
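For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of Cronbach's α computed from a respondents-by-items score matrix, using α = k/(k - 1) × (1 - Σ item variances / variance of total scores). The toy scores are invented for illustration and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 respondents, 3 items.
toy_scores = [
    [3, 4, 3],
    [2, 2, 1],
    [4, 4, 4],
    [1, 2, 2],
]
print(f"alpha = {cronbach_alpha(toy_scores):.2f}")
```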
The basics of item response theory using R

CERN Document Server

Baker, Frank B

2017-01-01

This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...
Examining Multiple Sources of Differential Item Functioning on the Clinician & Group CAHPS® Survey

Science.gov (United States)

Rodriguez, Hector P; Crane, Paul K

2011-01-01

Objective: To evaluate psychometric properties of a widely used patient experience survey. Data Sources: English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. Methods: We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician–patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. Principal Findings: The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. Conclusions: The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified. PMID:22092021
Combining item response theory with multiple imputation to equate health assessment questionnaires.

Science.gov (United States)

Gu, Chenyang; Gutman, Roee

2017-09-01

The assessment of patients' functional status across the continuum of care requires a common patient assessment tool. However, assessment tools that are used in various health care settings differ and cannot be easily contrasted. For example, the Functional Independence Measure (FIM) is used to evaluate the functional status of patients who stay in inpatient rehabilitation facilities, the Minimum Data Set (MDS) is collected for all patients who stay in skilled nursing facilities, and the Outcome and Assessment Information Set (OASIS) is collected if they choose home health care provided by home health agencies. All three instruments or questionnaires include functional status items, but the specific items, rating scales, and instructions for scoring different activities vary between the different settings. We consider equating different health assessment questionnaires as a missing data problem, and propose a variant of predictive mean matching method that relies on Item Response Theory (IRT) models to impute unmeasured item responses. Using real data sets, we simulated missing measurements and compared our proposed approach to existing methods for missing data imputation. We show that, for all of the estimands considered, and in most of the experimental conditions that were examined, the proposed approach provides valid inferences, and generally has better coverages, relatively smaller biases, and shorter interval estimates. The proposed method is further illustrated using a real data set. © 2016, The International Biometric Society.
Hand-related physical function in rheumatic hand conditions

DEFF Research Database (Denmark)

Klokker, Louise; Terwee, Caroline B; Wæhrens, Eva Ejlersen

2016-01-01

INTRODUCTION: There is no consensus about what constitutes the most appropriate patient-reported outcome measurement (PROM) instrument for measuring physical function in patients with rheumatic hand conditions. Existing instruments lack psychometric testing and vary in feasibility...... and their psychometric qualities. We aim to develop a PROM instrument to assess hand-related physical function in rheumatic hand conditions. METHODS AND ANALYSIS: We will perform a systematic search to identify existing PROMs to rheumatic hand conditions, and select items relevant for hand-related physical function...... as well as those items from the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank that are relevant to patients with rheumatic hand conditions. Selection will be based on consensus among reviewers. Content validity of selected items will be established...
Comprehensively Measuring Health-Related Subjective Well-Being: Dimensionality Analysis for Improved Outcome Assessment in Health Economics.

Science.gov (United States)

de Vries, Marieke; Emons, Wilco H M; Plantinga, Arnoud; Pietersma, Suzanne; van den Hout, Wilbert B; Stiggelbout, Anne M; van den Akker-van Marle, M Elske

2016-01-01

Allocation of inevitably limited financial resources for health care requires assessment of an intervention's effectiveness. Interventions likely affect quality of life (QOL) more broadly than is measurable with commonly used health-related QOL utility scales. In line with the World Health Organization's definition of health, a recent Delphi procedure showed that assessment needs to put more emphasis on mental and social dimensions. To identify the core dimensions of health-related subjective well-being (HR-SWB) for a new, more comprehensive outcome measure. We formulated items for each domain of an initial Delphi-based set of 21 domains of HR-SWB. We tested these items in a large sample (N = 1143) and used dimensionality analyses to find a smaller number of latent factors. Exploratory factor analysis suggested a five-factor model, which explained 65% of the total variance. Factors related to physical independence, positive affect, negative affect, autonomy, and personal growth. Correlations between the factors ranged from 0.19 to 0.59. A closer inspection of the factors revealed an overlap between the newly identified core dimensions of HR-SWB and the validation scales, but the dimensions of HR-SWB also seemed to reflect additional aspects. This shows that the dimensions of HR-SWB we identified go beyond the existing health-related QOL instruments. We identified a set of five key dimensions to be included in a new, comprehensive measure of HR-SWB that reliably captures these dimensions and fills in the gaps of the existent measures used in economic evaluations. Copyright © 2016 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

Science.gov (United States)

Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

2018-03-01

This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. Common person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.
A mixed methods approach to adapting health-related quality of life measures for use in routine oncology clinical practice.

Science.gov (United States)

Harley, Clare; Takeuchi, Elena; Taylor, Sally; Keding, Ada; Absolom, Kate; Brown, Julia; Velikova, Galina

2012-04-01

The current study reviewed and adapted existing health-related quality of life (HRQoL) instruments for use in routine clinical practice delivering outpatient chemotherapy for colorectal, breast and gynaecological cancers. 564 (288 gynaecological, 208 breast and 68 colorectal) outpatient consultations of 141 patients were audio-recorded and analysed to identify discussed issues. Issues were ranked from most to least commonly discussed within each disease group. Existing HRQoL instruments were evaluated against these lists and best fitting items entered into cancer-specific item banks. Item banks were evaluated during semi-structured interviews by twenty-one oncologists (13 consultants and 8 specialist registrars), four clinical nurse specialists and thirty patients, from breast, gynaecological and colorectal cancer practices. Pilot questionnaires were completed by 448 (145 breast, 148 gynaecological and 155 colorectal) patients attending outpatient clinics. Item selection and scale reliability were explored using descriptive data and psychometric methods alongside qualitative patient and clinician ratings. Each questionnaire includes five physical and three psychosocial function scales each with good internal consistency reliability (α > 0.70) plus disease-specific individual-symptom items identified as useful in clinical practice. Three cancer-specific health-related quality of life measures were developed for use in routine clinical practice. Initial analyses suggest good clinical utility and acceptable psychometric properties for the new instruments.
Prediction of true test scores from observed item scores and ancillary data.

Science.gov (United States)

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
Loglinear multidimensional IRT models for polytomously scored items

NARCIS (Netherlands)

Kelderman, Henk; Rijkes, Carl P.M.; Rijkes, Carl

1994-01-01

A loglinear IRT model is proposed that relates polytomously scored item responses to a multidimensional latent space. The analyst may specify a response function for each response, indicating which latent abilities are necessary to arrive at that response. Each item may have a different number of
45 CFR 61.9 - Reporting civil judgments related to the delivery of a health care item or service.

Science.gov (United States)

2010-10-01

... 45 Public Welfare 1 2010-10-01 Reporting civil judgments related to the delivery of a health care item or service. Section 61.9. Public Welfare, DEPARTMENT OF HEALTH AND HUMAN SERVICES, GENERAL ADMINISTRATION, HEALTHCARE INTEGRITY AND PROTECTION DATA BANK FOR FINAL ADVERSE INFORMATION...
Structural validity of a 16-item abridged version of the Cervantes Health-Related Quality of Life scale for menopause: the Cervantes Short-Form Scale.

Science.gov (United States)

Coronado, Pluvio J; Borrego, Rafael Sánchez; Palacios, Santiago; Ruiz, Miguel A; Rejas, Javier

2015-03-01

The Cervantes Scale is a specific health-related quality of life questionnaire that was originally developed in Spanish to be used in Spain for women through and beyond menopause. It contains 31 items and is time-consuming. The aim of this study was to produce an abridged version with the same dimensional structure and with similar psychometric properties. A representative sample of 516 postmenopausal women (mean [SD] age, 57 [4.31] y) seen in outpatient gynecology clinics and extracted from an observational cross-sectional study was used. Item analysis, internal consistency reliability, item-total and item-dimension correlations, and item correlation with the 12-item Medical Outcomes Study Short Form Health Survey Version 2.0 were studied. Dimensional and full-model confirmatory factor analyses were used to check structure stability. A threefold cross-validation method was used to obtain stable estimates by means of multigroup analysis. The scale was reduced to a 16-item version, the Cervantes Short-Form Scale, containing four main dimensions (Menopause and Health, Psychological, Sexuality, and Couple Relations), with the first dimension composed of three subdimensions (Vasomotor Symptoms, Health, and Aging). Goodness-of-fit statistics were better than those of the extended version (χ²/df = 2.493; adjusted goodness-of-fit index, 0.802; parsimony comparative fit index, 0.749; root mean square error of approximation, 0.054). Internal consistency was good (Cronbach's α = 0.880). Correlations between the extended and the reduced dimensions were high and significant in all cases (P < 0.001; r values ranged from 0.90 for Sexuality to 0.969 for Vasomotor Symptoms). The Cervantes Scale can be reduced to a 16-item abridged version (Cervantes Short-Form Scale) that maintains the original dimensional structure and psychometric properties. At 51% of the original length, this version can be administered faster, making it especially suitable for routine medical practice.

Item-level factor analysis of the Self-Efficacy Scale.

Science.gov (United States)

Bunketorp Käll, Lina

2014-03-01

This study explores the internal structure of the Self-Efficacy Scale (SES) using item response analysis. The SES was previously translated into Swedish and modified to encompass all types of pain, not exclusively back pain. Data on perceived self-efficacy in 47 patients with subacute whiplash-associated disorders were derived from a previously conducted randomized-controlled trial. The item-level factor analysis was carried out using a six-step procedure. To further study the item inter-relationships and to determine the underlying structure empirically, the 20 items of the SES were also subjected to principal component analysis with varimax rotation. The analyses showed two underlying factors, named 'social activities' and 'physical activities', with seven items loading on each factor. The remaining six items of the SES appeared to measure somewhat different constructs and need to be analysed further.
Australian Chemistry Test Item Bank: Years 11 and 12. Volume 2.

Science.gov (United States)

Commons, C., Ed.; Martin, P., Ed.

The second volume of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the…
ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).

Science.gov (United States)

Australian Council for Educational Research, Hawthorn.

This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…
Automatic item generation implemented for measuring artistic judgment aptitude.

Science.gov (United States)

Bezruczko, Nikolaus

2014-01-01

Automatic item generation (AIG) is a broad class of methods that are being developed to address psychometric issues arising from internet and computer-based testing. In general, issues emphasize efficiency, validity, and diagnostic usefulness of large scale mental testing. Rapid prominence of AIG methods and their implicit perspective on mental testing is bringing painful scrutiny to many sacred psychometric assumptions. This report reviews basic AIG ideas, then presents conceptual foundations, image model development, and operational application to artistic judgment aptitude testing.
Better assessment of physical function: item improvement is neglected but essential.

Science.gov (United States)

Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

2009-01-01

Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models
Reliability measures in item response theory: manifest versus latent correlation functions.

Science.gov (United States)

Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Verbeke, Geert; De Boeck, Paul

2015-02-01

For item response theory (IRT) models, which belong to the class of generalized linear or non-linear mixed models, reliability at the scale of observed scores (i.e., manifest correlation) is more difficult to calculate than latent correlation based reliability, but usually of greater scientific interest. This is not least because it cannot be calculated explicitly when the logit link is used in conjunction with normal random effects. As such, approximations such as Fisher's information coefficient, Cronbach's α, or the latent correlation are calculated, allegedly because it is easy to do so. Cronbach's α has well-known and serious drawbacks, Fisher's information is not meaningful under certain circumstances, and there is an important but often overlooked difference between latent and manifest correlations. Here, manifest correlation refers to correlation between observed scores, while latent correlation refers to correlation between scores at the latent (e.g., logit or probit) scale. Thus, using one in place of the other can lead to erroneous conclusions. Taylor series based reliability measures, which are based on manifest correlation functions, are derived and a careful comparison of reliability measures based on latent correlations, Fisher's information, and exact reliability is carried out. The latent correlations are virtually always considerably higher than their manifest counterparts, Fisher's information measure shows no coherent behaviour (it is even negative in some cases), while the newly introduced Taylor series based approximations reflect the exact reliability very closely. Comparisons among the various types of correlations, for various IRT models, are made using algebraic expressions, Monte Carlo simulations, and data analysis. Given the light computational burden and the performance of Taylor series based reliability measures, their use is recommended. © 2014 The British Psychological Society.
Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

Science.gov (United States)

Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

2015-12-01

The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity. © The Author(s) 2014.
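The abstract above reports Mantel-Haenszel chi-square tests for gender-based DIF. As a companion illustration (not the authors' procedure), the sketch below computes the Mantel-Haenszel common odds ratio for a dichotomously scored item across matched score strata, plus the ETS delta transformation (-2.35 × ln of the odds ratio). The strata counts are hypothetical, and the chi-square statistic itself is not computed here.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across strata.

    Each stratum is a tuple
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    Values above 1 indicate the item favors the reference group.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

def ets_delta(odds_ratio):
    """ETS delta metric; negative values indicate the item favors the reference group."""
    return -2.35 * math.log(odds_ratio)

# Hypothetical counts for three matched total-score strata.
toy_strata = [(30, 10, 20, 20), (40, 10, 35, 15), (45, 5, 44, 6)]
alpha_mh = mantel_haenszel_odds_ratio(toy_strata)
print(f"MH odds ratio = {alpha_mh:.2f}, ETS delta = {ets_delta(alpha_mh):.2f}")
```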
Using Localized Survey Items to Augment Standardized Benchmarking Measures: A LibQUAL+[TM] Study

Science.gov (United States)

Thompson, Bruce; Cook, Colleen; Kyrillidou, Martha

2006-01-01

The LibQUAL+[TM] protocol solicits open-ended comments from users with regard to library service quality, gathers data on 22 core items, and, at the option of individual libraries, also garners ratings on five items drawn from a pool of more than 100 choices selected by libraries. In this article, the relationship of scores on these locally…
Development of Test Items Related to Selected Concepts Within the Scheme the Particle Nature of Matter.

Science.gov (United States)

Doran, Rodney L.; Pella, Milton O.

The purpose of this study was to develop test items with a minimum reading demand for use with pupils at grade levels two through six. An item was judged to be acceptable if the item satisfied at least four of six criteria. Approximately 250 students in grades 2-6 participated in the study. Half of the students were given instruction to develop…
Work-related measures of physical and behavioral health function: Test-retest reliability.

Science.gov (United States)

Marino, Molly Elizabeth; Meterko, Mark; Marfeo, Elizabeth E; McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Rasch, Elizabeth K; Brandt, Diane E; Chan, Leighton

2015-10-01

The Work Disability Functional Assessment Battery (WD-FAB), developed for potential use by the US Social Security Administration to assess work-related function, currently consists of five multi-item scales assessing physical function and four multi-item scales assessing behavioral health function; the WD-FAB scales are administered as Computerized Adaptive Tests (CATs). The goal of this study was to evaluate the test-retest reliability of the WD-FAB Physical Function and Behavioral Health CATs. We administered the WD-FAB scales twice, 7-10 days apart, to a sample of 376 working-age adults and 316 adults with work-disability. Intraclass correlation coefficients (ICCs) were calculated to measure the consistency of the scores between the two administrations. Standard error of measurement (SEM) and minimal detectable change (MDC90) were also calculated to measure the scales' precision and sensitivity. For the Physical Function CAT scales, the ICCs ranged from 0.76 to 0.89 in the working-age adult sample, and 0.77-0.86 in the sample of adults with work-disability. ICCs for the Behavioral Health CAT scales ranged from 0.66 to 0.70 in the working-age adult sample, and 0.77-0.80 in the adults with work-disability. The SEM ranged from 3.25 to 4.55 for the Physical Function scales and 5.27-6.97 for the Behavioral Health function scales. For all scales in both samples, the MDC90 ranged from 7.58 to 16.27. Both the Physical Function and Behavioral Health CATs of the WD-FAB demonstrated good test-retest reliability in adults with work-disability and general adult samples, a critical requirement for assessing work-related functioning in disability applicants and in other contexts. Copyright © 2015 Elsevier Inc. All rights reserved.
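The abstract above reports SEM and MDC90 values without giving formulas. A common convention, assumed here rather than confirmed by the source, is SEM = SD × sqrt(1 − ICC) and MDC90 = 1.645 × sqrt(2) × SEM. The Python sketch below applies those definitions. The function names and the example inputs (SD = 10, ICC = 0.84) are hypothetical and are not values from the WD-FAB study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - reliability), with the ICC used as the reliability."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.645):
    """MDC at the confidence level implied by z (1.645 gives MDC90).

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two administrations.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
sem = standard_error_of_measurement(sd=10.0, icc=0.84)
mdc90 = minimal_detectable_change(sem)
print(f"SEM = {sem:.2f}, MDC90 = {mdc90:.2f}")
```

Under these conventions a lower ICC inflates both quantities, which is consistent with the larger SEM values reported for the Behavioral Health scales.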
Acquiescence and different item and response formats in a summated self-rating scale [Instemmingsgeneigdheid en verskillende item- en responsformate in 'n gesommeerde selfbeoordelingskaal]

Directory of Open Access Journals (Sweden)

Nadene Hanekom

1998-06-01

This study examines the degree of acquiescence present when the item and response formats of a summated rating scale are varied. It is often recommended that acquiescence response bias in rating scales be controlled by using both positively and negatively worded items. Such items are generally worded in the Likert-type format of statements. The purpose of the study was to establish whether items in question format would result in a smaller degree of acquiescence than items worded as statements. The response format was also varied (five- and seven-point options) to determine whether this would influence the reliability and degree of acquiescence in the scales. A twenty-item Locus of Control (LC) questionnaire was used, but each item was complemented by its opposite, resulting in 40 items. The subjects, divided randomly into two groups, were second-year students who had to complete four versions of the questionnaire, plus a shortened version of Bass's scale for measuring acquiescence. The LC versions were questions or statements, each combined with a five- or seven-point response format. Partial counterbalancing was introduced by testing on two separate occasions, presenting the tests to the two groups in the opposite order. The degree of acquiescence was assessed by correlating the items with their opposites, and by correlating scores on each version with scores on the acquiescence questionnaire. No major differences were found between the various item and response formats in relation to acquiescence.
An item response theory analysis of the Executive Interview and development of the EXIT8: A Project FRONTIER Study.

Science.gov (United States)

Jahn, Danielle R; Dressel, Jeffrey A; Gavett, Brandon E; O'Bryant, Sid E

2015-01-01

The Executive Interview (EXIT25) is an effective measure of executive dysfunction, but may be inefficient due to the time it takes to complete 25 interview-based items. The current study aimed to examine psychometric properties of the EXIT25, with a specific focus on determining whether a briefer version of the measure could comprehensively assess executive dysfunction. The current study applied a graded response model (a type of item response theory model for polytomous categorical data) to identify items that were most closely related to the underlying construct of executive functioning and best discriminated between varying levels of executive functioning. Participants were 660 adults ages 40 to 96 years living in West Texas, who were recruited through an ongoing epidemiological study of rural health and aging, called Project FRONTIER. The EXIT25 was the primary measure examined. Participants also completed the Trail Making Test and Controlled Oral Word Association Test, among other measures, to examine the convergent validity of a brief form of the EXIT25. Eight items were identified that provided the majority of the information about the underlying construct of executive functioning; total scores on these items were associated with total scores on other measures of executive functioning and were able to differentiate between cognitively healthy, mildly cognitively impaired, and demented participants. In addition, cutoff scores were recommended based on sensitivity and specificity of scores. A brief, eight-item version of the EXIT25 may be an effective and efficient screening for executive dysfunction among older adults.
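The graded response model named in the abstract above assigns each polytomous item a discrimination parameter and a set of ordered thresholds; the probability of each response category is the difference between adjacent cumulative logistic curves. The Python sketch below shows that calculation in its standard form. The function name and all parameter values are illustrative assumptions and are not estimates from the EXIT25 analysis.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities under the graded response model.

    theta          : latent trait level of the respondent
    discrimination : item slope (a)
    thresholds     : ascending boundaries b_1 < ... < b_{m-1} for m categories
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # P(X = k) is the drop between consecutive cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical three-category item scored 0, 1, 2.
probs = grm_category_probs(theta=0.5, discrimination=1.8, thresholds=[-0.5, 1.0])
print([round(p, 3) for p in probs])
```

Items with larger discrimination values separate respondents more sharply around their thresholds, which is the property used when selecting a short form from a longer item set.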
Psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for use with electronic cigarettes.

Science.gov (United States)

Morean, Meghan; Krishnan-Sarin, Suchitra; Sussman, Steve; Foulds, Jonathan; Fishbein, Howard; Grana, Rachel; O'Malley, Stephanie S

2018-01-02

Psychometrically sound measures of e-cigarette dependence are lacking. We modified the PROMIS Nicotine Dependence Item Banks for use with e-cigarettes and evaluated the psychometrics of the 22-, 8- and 4-item adapted versions. 1009 adults who reported using e-cigarettes at least weekly completed an anonymous survey in Summer 2016 (50.2% male, 77.1% White, mean age 35.81 [10.71], 66.4% daily e-cigarette users, 72.6% current cigarette smokers). Psychometric analyses included confirmatory factor analysis, internal consistency, measurement invariance, examination of mean-level differences, convergent validity, and test-criterion relationships with e-cigarette use outcomes. All PROMIS-E versions had confirmable, internally consistent latent structures that were scalar invariant by sex, race, e-cigarette use (non-daily/daily), e-liquid nicotine content (no/yes), and current cigarette smoking status (no/yes). Daily e-cigarette users, nicotine e-liquid users, and cigarette smokers reported being more dependent on e-cigarettes than their counterparts. All PROMIS-E versions correlated strongly with one another, evidenced convergent validity with the Penn State E-cigarette Dependence Index and time to first e-cigarette use in the morning, and evidenced test-criterion relationships with vaping frequency, e-liquid nicotine concentration, and e-cigarette quit attempts. Similar results were observed when analyses were conducted within subsamples of exclusive e-cigarette users and dual users of cigarettes and e-cigarettes. Each PROMIS-E version evidenced strong psychometric properties for assessing e-cigarette dependence in adults who either use e-cigarettes exclusively or are dual users of cigarettes and e-cigarettes. However, results indicated little benefit of the longer versions over the 4-item PROMIS-E, which provides an efficient assessment of e-cigarette dependence. The availability of the novel, psychometrically sound PROMIS-E can further research on a wide range of
Students' approaches to learning in a clinical practicum: A psychometric evaluation based on item response theory.

Science.gov (United States)

Zhao, Yue; Kuan, Hoi Kei; Chung, Joyce O K; Chan, Cecilia K Y; Li, William H C

2018-07-01

The investigation of learning approaches in the clinical workplace context has remained an under-researched area. Despite the validation of learning approach instruments and their applications in various clinical contexts, little is known about the extent to which an individual item, that reflects a specific learning strategy and motive, effectively contributes to characterizing students' learning approaches. This study aimed to measure nursing students' approaches to learning in a clinical practicum using the Approaches to Learning at Work Questionnaire (ALWQ). Survey research design was used in the study. A sample of year 3 nursing students (n = 208) who undertook a 6-week clinical practicum course participated in the study. Factor analyses were conducted, followed by an item response theory analysis, including model assumption evaluation (unidimensionality and local independence), item calibration and goodness-of-fit assessment. Two subscales, deep and surface, were derived. Findings suggested that: (a) items measuring the deep motive from intrinsic interest and deep strategies of relating new ideas to similar situations, and that of concept mapping served as the strongest discriminating indicators; (b) the surface strategy of memorizing facts and details without an overall picture exhibited the highest discriminating power among all surface items; and, (c) both subscales appeared to be informative in assessing a broad range of the corresponding latent trait. The 21-item ALWQ derived from this study presented an efficient, internally consistent and precise measure. Findings provided a useful psychometric evaluation of the ALWQ in the clinical practicum context, added evidence to the utility of the ALWQ for nursing education practice and research, and echoed the discussions from previous studies on the role of the contextual factors in influencing student choices of different learning strategies. They provided insights for clinical educators to measure
Development and validation of the Treatment Related Impact Measure of Weight (TRIM-Weight)

Directory of Open Access Journals (Sweden)

Lessard Suzanne

2010-02-01

Background: The use of prescription anti-obesity medication (AOM) is becoming increasingly common as treatment options grow and become more accessible. However, AOM may not be without a wide range of potentially negative impacts on patient functioning and well-being. The Treatment Related Impact Measure (TRIM-Weight) is an obesity treatment-specific patient reported outcomes (PRO) measure designed to assess the key impacts of prescription anti-obesity medication. This paper will present the validation findings for the TRIM-Weight. Methods: The online validation battery survey was administered in four countries (the U.S., U.K., Australia, and Canada). Eligible subjects were over age eighteen, currently taking a prescription AOM and were currently or had been obese during their life. Validation analyses were conducted according to an a priori statistical analysis plan. Item-level psychometric and conceptual criteria were used to refine and reduce the preliminary item pool, and factor analysis was performed to identify structural domains. Reliability and validity testing was then performed and the minimally important difference (MID) explored. Results: Two hundred and eight subjects completed the survey. Twenty-one of the 43 items were dropped and a five-factor structure was achieved: Daily Life, Weight Management, Treatment Burden, Experience of Side Effects, and Psychological Health. A priori criteria for internal consistency and test-retest coefficients for the total score and all five subscales were met. All pre-specified hypotheses for convergent and known group validity were also met with the exception of the domain of Daily Life (proven in an ad hoc analysis) as well as the 1/2 standard deviation threshold for the MID. Conclusion: The development and validation of the TRIM-Weight has been conducted according to well-defined principles for the creation of a PRO measure. Based on the evidence to date, the TRIM-Weight can be considered a brief
Item Construction and Psychometric Models Appropriate for Constructed Responses

Science.gov (United States)

1991-08-01

which involve only one attribute per item. This is especially true when we are dealing with constructed-response items, we have to measure much more...
Item Response Theory Modeling of the Philadelphia Naming Test

Science.gov (United States)

Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D.

2015-01-01

Purpose: In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating…
Scale construction utilising the Rasch unidimensional measurement model: A measurement of adolescent attitudes towards abortion.

Science.gov (United States)

Hendriks, Jacqueline; Fyfe, Sue; Styles, Irene; Skinner, S Rachel; Merriman, Gareth

2012-01-01

Measurement scales seeking to quantify latent traits like attitudes, are often developed using traditional psychometric approaches. Application of the Rasch unidimensional measurement model may complement or replace these techniques, as the model can be used to construct scales and check their psychometric properties. If data fit the model, then a scale with invariant measurement properties, including interval-level scores, will have been developed. This paper highlights the unique properties of the Rasch model. Items developed to measure adolescent attitudes towards abortion are used to exemplify the process. Ten attitude and intention items relating to abortion were answered by 406 adolescents aged 12 to 19 years, as part of the "Teen Relationships Study". The sampling framework captured a range of sexual and pregnancy experiences. Items were assessed for fit to the Rasch model including checks for Differential Item Functioning (DIF) by gender, sexual experience or pregnancy experience. Rasch analysis of the original dataset initially demonstrated that some items did not fit the model. Rescoring of one item (B5) and removal of another (L31) resulted in fit, as shown by a non-significant item-trait interaction total chi-square and a mean log residual fit statistic for items of -0.05 (SD=1.43). No DIF existed for the revised scale. However, items did not distinguish as well amongst persons with the most intense attitudes as they did for other persons. A person separation index of 0.82 indicated good reliability. Application of the Rasch model produced a valid and reliable scale measuring adolescent attitudes towards abortion, with stable measurement properties. The Rasch process provided an extensive range of diagnostic information concerning item and person fit, enabling changes to be made to scale items. This example shows the value of the Rasch model in developing scales for both social science and health disciplines.
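The Rasch model discussed in the abstract above places persons and items on a single logit scale. In its simplest (dichotomous) form, the probability of endorsing an item depends only on the difference between the person's location and the item's location. The Python sketch below illustrates that form under stated assumptions: the study itself analysed attitude items that may have been polytomous, and the function name and numeric values here are hypothetical.

```python
import math


def rasch_probability(person_location, item_location):
    """Probability of endorsing a dichotomous item under the Rasch model.

    P(X = 1) = exp(theta - b) / (1 + exp(theta - b)), with both
    parameters expressed in logits.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))


# Hypothetical item locations (logits), from easy to hard to endorse.
item_locations = [-1.5, 0.0, 1.2]
for theta in (-1.0, 0.0, 2.0):
    row = [round(rasch_probability(theta, b), 2) for b in item_locations]
    print(theta, row)
```

Because only the difference between the two locations enters the formula, comparisons between persons do not depend on which items are used. That is the invariance property the abstract refers to, and it holds only when the data fit the model.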
Differential items functioning to assess aggressiveness in college students / Funcionamento diferencial de itens para avaliar a agressividade de universitários

Directory of Open Access Journals (Sweden)

Fermino Fernandes Sisto

2008-01-01

This study sought evidence of construct validity by analyzing differential item functioning related to aggressiveness. The participants were 445 college students of both genders, attending courses in Engineering, Computing and Psychology. The aggressiveness scale, composed of 81 items, was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were studied by means of the Rasch model. Twenty-eight items showed differential item functioning: 15 were characterized as typical for females and 13 for males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of gender.
Guide to good practices for the development of test items

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-01-01

While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

Calibration of context-specific survey items to assess youth physical activity behaviour.

Science.gov (United States)

Saint-Maurice, Pedro F; Welk, Gregory J; Bartee, R Todd; Heelan, Kate

2017-05-01

This study tests calibration models to re-scale context-specific physical activity (PA) items to accelerometer-derived PA. A total of 195 children in grades 4-12 wore an Actigraph monitor and completed the Physical Activity Questionnaire (PAQ) one week later. The relative time spent in moderate-to-vigorous PA (MVPA%) obtained from the Actigraph at recess, PE, lunch, after-school, evening and weekend was matched with a respective item score obtained from the PAQ. Item scores from 145 participants were calibrated against objective MVPA% using multiple linear regression with age and sex as additional predictors. Predicted minutes of MVPA for school, out-of-school and total week were tested in the remaining sample (n = 50) using equivalence testing. The results showed that PAQ β-weights ranged from 0.06 (lunch) to 4.94 (PE) MVPA% (P … PAQ and accelerometer MVPA at school and out-of-school ranged from -15.6 to +3.8 min and the PAQ was within 10-15% of accelerometer measured activity. This study demonstrated that context-specific items can be calibrated to predict minutes of MVPA in groups of youth during in- and out-of-school periods.
Identification and Development of Items Comprising Organizational Citizenship Behaviors Among Pharmacy Faculty.

Science.gov (United States)

Desselle, Shane P; Semsick, Gretchen R

2016-12-25

Objective. Identify behaviors that can compose a measure of organizational citizenship by pharmacy faculty. Methods. A four-round, modified Delphi procedure using open-ended questions (Round 1) was conducted with 13 panelists from pharmacy academia. The items generated were evaluated and refined for inclusion in subsequent rounds. A consensus was reached after completing four rounds. Results. The panel produced a set of 26 items indicative of extra-role behaviors by faculty colleagues considered to compose a measure of citizenship, which is an expressed manifestation of collegiality. Conclusions. The items generated require testing for validation and reliability in a large sample to create a measure of organizational citizenship. Even prior to doing so, the list of items can serve as a resource for mentorship of junior and senior faculty alike.
Identification and Development of Items Comprising Organizational Citizenship Behaviors Among Pharmacy Faculty

Science.gov (United States)

Semsick, Gretchen R.

2016-01-01

Objective. Identify behaviors that can compose a measure of organizational citizenship by pharmacy faculty. Methods. A four-round, modified Delphi procedure using open-ended questions (Round 1) was conducted with 13 panelists from pharmacy academia. The items generated were evaluated and refined for inclusion in subsequent rounds. A consensus was reached after completing four rounds. Results. The panel produced a set of 26 items indicative of extra-role behaviors by faculty colleagues considered to compose a measure of citizenship, which is an expressed manifestation of collegiality. Conclusions. The items generated require testing for validation and reliability in a large sample to create a measure of organizational citizenship. Even prior to doing so, the list of items can serve as a resource for mentorship of junior and senior faculty alike. PMID:28179717
Development and validation of a new instrument to measure health-related quality of life in patients with psoriatic arthritis: the VITACORA-19.

Science.gov (United States)

Torre-Alonso, Juan Carlos; Gratacós, Jordi; Rey-Rey, José Santos; Valdazo de Diego, Juan Pablo; Urriticoechea-Arana, Ana; Daudén, Esteban; Moreno, Mireia; Zarco-Montejo, Pedro; Collantes-Estévez, Eduardo; Fernández-López, Juan Antonio

2014-10-01

To develop/validate an instrument to measure health-related quality of life (HRQoL) in patients with psoriatic arthritis (PsA), for use in clinical studies. An item pool of 35 items was generated following standardized procedures. Item reduction was performed using clinimetric and psychometric approaches after administration to 66 patients with PsA. The resulting instrument, the VITACORA-19, consists of 19 items. Its content validity, internal consistency, test-retest reliability, known groups/convergent validity, and sensitivity to change were tested in a longitudinal and multicenter study conducted in 10 hospitals in Spain, with 323 patients who also completed the EuroQol 5-dimensional questionnaire (EQ-5D) and a health status transition item. There were 3 study groups: group A (n = 209, patients with PsA), group B (n = 71, patients with arthritis without psoriatic aspect, patients with arthrosis, and patients with dermatitis), and group C (n = 43, healthy controls). The questionnaire was considered easy/very easy to answer by 94.7% of the patients with PsA. The factorial analysis clearly identified only 1 factor. Cronbach's alpha coefficient and intraclass correlation coefficients exceeded 0.90. Statistically significant differences (p … measure HRQoL in patients with PsA, has good validity, reliability, and sensitivity to change.
Negative affect impairs associative memory but not item memory.

Science.gov (United States)

Bisby, James A; Burgess, Neil

2013-12-17

The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 demonstrated that item memory was facilitated by emotional affect, whereas memory for an associated context was reduced. In Experiment 2, arousal was manipulated independently of the memoranda, by a threat of shock, whereby encoding trials occurred under conditions of threat or safety. Memory for context was equally impaired by the presence of negative affect, whether induced by threat of shock or a negative item, relative to retrieval of the context of a neutral item in safety. In Experiment 3, participants were presented with neutral and negative items as paired associates, including all combinations of neutral and negative items. The results showed both above effects: compared to a neutral item, memory for the associate of a negative item (a second item here, context in Experiments 1 and 2) is impaired, whereas retrieval of the item itself is enhanced. Our findings suggest that negative affect impairs associative memory while recognition of a negative item is enhanced. They support dual-processing models in which negative affect or stress impairs hippocampal-dependent associative memory while the storage of negative sensory/perceptual representations is spared or even strengthened.
Statistical power as a function of Cronbach alpha of instrument questionnaire items.

Science.gov (United States)

Heo, Moonseong; Kim, Namhee; Faith, Myles S

2015-10-14

In countless clinical trials, outcome measurement relies on instrument questionnaire items, which often suffer from measurement error that in turn affects the statistical power of study designs. The Cronbach alpha or coefficient alpha, here denoted by C(α), can be used as a measure of internal consistency of parallel instrument items that are developed to measure a target unidimensional outcome construct. The scale score for the target construct is often represented by the sum of the item scores. However, power functions based on C(α) have been lacking for various study designs. We formulate a statistical model for parallel items to derive power functions as a function of C(α) under several study designs. To this end, we assume a fixed true score variance, as opposed to the usual assumption of a fixed total variance. That assumption is critical and practically relevant to show that smaller measurement errors are associated with higher inter-item correlations, and thus that greater C(α) is associated with greater statistical power. We compare the derived theoretical statistical power with empirical power obtained through Monte Carlo simulations for the following comparisons: one-sample comparison of pre- and post-treatment mean differences, two-sample comparison of pre-post mean differences between groups, and two-sample comparison of mean differences between groups. It is shown that C(α) is the same as a test-retest correlation of the scale scores of parallel items, which enables testing significance of C(α). Closed-form power functions and sample size determination formulas are derived in terms of C(α), for all of the aforementioned comparisons. Power functions are shown to be increasing functions of C(α), regardless of the comparison of interest. The derived power functions are well validated by simulation studies that show that the magnitudes of theoretical power are virtually identical to those of the empirical power. Regardless
Development of a questionnaire to assess patient satisfaction with allergen-specific immunotherapy in adults: item generation, item reduction, and preliminary validation

Directory of Open Access Journals (Sweden)

Justícia JL

2011-05-01

Jose Luis Justícia (1), Eva Baró (2), Victoria Cardona (3), Pedro Guardia (4), Pedro Ojeda (5), José Maria Olaguíbel (6), José Maria Vega (7), Carmen Vidal (8)

(1) Medical Department, Stallergenes Ibérica, Barcelona, Spain; (2) Health Outcomes Research Department, 3D Health Research, Barcelona, Spain; (3) Hospital Vall d'Hebron, Barcelona, Spain; (4) Hospital Virgen Macarena, Sevilla, Spain; (5) Clínica de Asma y Alergia Dres. Ojeda, Madrid, Spain; (6) Complejo Hospitalario de Navarra, Pamplona, Spain; (7) Hospital Regional Universitario Carlos Haya, Málaga, Spain; (8) Complejo Hospitalario Universitario de Santiago, Santiago de Compostela, Spain

Background: Allergen-specific immunotherapy (SIT) is a treatment capable of modifying the natural course of allergy, so ensuring good adherence to SIT is fundamental. Until now there has been no instrument specifically developed to measure patient satisfaction with SIT, although its assessment could help us to better understand and improve treatment adherence and effectiveness. The aim of this study was to develop an instrument to measure adult patient satisfaction with SIT. Methods: Items were generated from a literature review, focus groups with allergic adult patients undergoing SIT, and a meeting with experts. Potential items were administered to allergic patients undergoing SIT in an observational, cross-sectional, multicenter study. Item reduction was based on quantitative and qualitative criteria. A preliminary assessment of feasibility, reliability, and validity of the retained items was performed. Results: An initial pool of 70 items was administered to 257 patients undergoing SIT. Fifty-four items were eliminated, resulting in a provisional instrument with 16 items. Factor analysis yielded four factors that were identified as perceived efficacy, activities and environment, cost-benefit balance, and overall satisfaction, explaining 74.8% of variance. Ceiling and floor effects were negligible for overall score. Overall score was
Scenes for Social Information Processing in Adolescence: Item and factor analytic procedures for psychometric appraisal.

Science.gov (United States)

Vagos, Paula; Rijo, Daniel; Santos, Isabel M

2016-04-01

Relatively little is known about measures used to investigate the validity and applications of social information processing theory. The Scenes for Social Information Processing in Adolescence includes items built using a participatory approach to evaluate the attribution of intent, emotion intensity, response evaluation, and response decision steps of social information processing. We evaluated a sample of 802 Portuguese adolescents (61.5% female; mean age = 16.44 years old) using this instrument. Item analysis and exploratory and confirmatory factor analytic procedures were used for psychometric examination. Two measures for attribution of intent were produced, including hostile and neutral; along with 3 emotion measures, focused on negative emotional states; 8 response evaluation measures; and 4 response decision measures, including prosocial and impaired social behavior. All of these measures achieved good internal consistency values and fit indicators. Boys seemed to favor and choose overt and relational aggression behaviors more often; girls conveyed higher levels of neutral attribution, sadness, and assertiveness and passiveness. The Scenes for Social Information Processing in Adolescence achieved adequate psychometric results and seems a valuable alternative for evaluating social information processing, even if it is essential to continue investigation into its internal and external validity. (c) 2016 APA, all rights reserved.
A confirmative clinimetric analysis of the 36-item Family Assessment Device.

Science.gov (United States)

Timmerby, Nina; Cosci, Fiammetta; Watson, Maggie; Csillag, Claudio; Schmitt, Florence; Steck, Barbara; Bech, Per; Thastum, Mikael

2018-02-07

The Family Assessment Device (FAD) is a 60-item questionnaire widely used to evaluate self-reported family functioning. However, the factor structure as well as the number of items has been questioned. A shorter and more user-friendly version of the original FAD-scale, the 36-item FAD, has therefore previously been proposed, based on findings in a nonclinical population of adults. We aimed in this study to evaluate the brief 36-item version of the FAD in a clinical population. Data from a European multinational study, examining factors associated with levels of family functioning in adult cancer patients' families, were used. Both healthy and ill parents completed the 60-item version FAD. The psychometric analyses conducted were Principal Component Analysis and Mokken-analysis. A total of 564 participants were included. Based on the psychometric analysis we confirmed that the 36-item version of the FAD has robust psychometric properties and can be used in clinical populations. The present analysis confirmed that the 36-item version of the FAD (18 items assessing 'well-being' and 18 items assessing 'dysfunctional' family function) is a brief scale where the summed total score is a valid measure of the dimensions of family functioning. This shorter version of the FAD is, in accordance with the concept of 'measurement-based care', an easy to use scale that could be considered when the aim is to evaluate self-reported family functioning.
Relational humility: conceptualizing and measuring humility as a personality judgment.

Science.gov (United States)

Davis, Don E; Hook, Joshua N; Worthington, Everett L; Van Tongeren, Daryl R; Gartner, Aubrey L; Jennings, David J; Emmons, Robert A

2011-05-01

The study of humility has progressed slowly due to measurement problems. We describe a model of relational humility that conceptualizes humility as a personality judgment. In this set of 5 studies, we developed the 16-item Relational Humility Scale (RHS) and offered initial evidence for the theoretical model. In Study 1 (N = 300), we developed the RHS and its subscales--Global Humility, Superiority, and Accurate View of Self. In Study 2, we confirmed the factor structure of the scale in an independent sample (N = 196). In Study 3, we provided initial evidence supporting construct validity using an experimental design (N = 200). In Study 4 (N = 150), we provided additional evidence of construct validity by examining the relationships between humility and empathy, forgiveness, and other virtues. In Study 5 (N = 163), we adduced evidence of discriminant and incremental validity of the RHS compared with the Honesty-Humility subscale of the HEXACO-PI (Lee & Ashton, 2004).
Item response theory and structural equation modelling for ordinal data: Describing the relationship between KIDSCREEN and Life-H.

Science.gov (United States)

Titman, Andrew C; Lancaster, Gillian A; Colver, Allan F

2016-10-01

Both item response theory and structural equation models are useful in the analysis of ordered categorical responses from health assessment questionnaires. We highlight the advantages and disadvantages of the item response theory and structural equation modelling approaches to modelling ordinal data, from within a community health setting. Using data from the SPARCLE project focussing on children with cerebral palsy, this paper investigates the relationship between two ordinal rating scales, the KIDSCREEN, which measures quality-of-life, and Life-H, which measures participation. Practical issues relating to fitting models, such as non-positive definite observed or fitted correlation matrices, and approaches to assessing model fit are discussed. Item response theory models allow properties such as the conditional independence of particular domains of a measurement instrument to be assessed. When, as with the SPARCLE data, the latent traits are multidimensional, structural equation models generally provide a much more convenient modelling framework. © The Author(s) 2013.
Medial temporal lobe contributions to cued retrieval of items and contexts.

Science.gov (United States)

Hannula, Deborah E; Libby, Laura A; Yonelinas, Andrew P; Ranganath, Charan

2013-10-01

Several models have proposed that different regions of the medial temporal lobes contribute to different aspects of episodic memory. For instance, according to one view, the perirhinal cortex represents specific items, parahippocampal cortex represents information regarding the context in which these items were encountered, and the hippocampus represents item-context bindings. Here, we used event-related functional magnetic resonance imaging (fMRI) to test a specific prediction of this model, namely, that successful retrieval of items from context cues will elicit perirhinal recruitment and that successful retrieval of contexts from item cues will elicit parahippocampal cortex recruitment. Retrieval of the bound representation in either case was expected to elicit hippocampal engagement. To test these predictions, we had participants study several item-context pairs (i.e., pictures of objects and scenes, respectively), and then had them attempt to recall items from associated context cues and contexts from associated item cues during a scanned retrieval session. Results based on both univariate and multivariate analyses confirmed a role for hippocampus in content-general relational memory retrieval, and a role for parahippocampal cortex in successful retrieval of contexts from item cues. However, we also found that activity differences in perirhinal cortex were correlated with successful cued recall for both items and contexts. These findings provide partial support for the above predictions and are discussed with respect to several models of medial temporal lobe function. Copyright © 2013 Elsevier Ltd. All rights reserved.
Medial Temporal Lobe Contributions to Cued Retrieval of Items and Contexts

Science.gov (United States)

Hannula, Deborah E.; Libby, Laura A.; Yonelinas, Andrew P.; Ranganath, Charan

2013-01-01

Several models have proposed that different regions of the medial temporal lobes contribute to different aspects of episodic memory. For instance, according to one view, the perirhinal cortex represents specific items, parahippocampal cortex represents information regarding the context in which these items were encountered, and the hippocampus represents item-context bindings. Here, we used event-related functional magnetic resonance imaging (fMRI) to test a specific prediction of this model – namely, that successful retrieval of items from context cues will elicit perirhinal recruitment and that successful retrieval of contexts from item cues will elicit parahippocampal cortex recruitment. Retrieval of the bound representation in either case was expected to elicit hippocampal engagement. To test these predictions, we had participants study several item-context pairs (i.e., pictures of objects and scenes, respectively), and then had them attempt to recall items from associated context cues and contexts from associated item cues during a scanned retrieval session. Results based on both univariate and multivariate analyses confirmed a role for hippocampus in content-general relational memory retrieval, and a role for parahippocampal cortex in successful retrieval of contexts from item cues. However, we also found that activity differences in perirhinal cortex were correlated with successful cued recall for both items and contexts. These findings provide partial support for the above predictions and are discussed with respect to several models of medial temporal lobe function. PMID:23466350
Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation.

Science.gov (United States)

Liegl, Gregor; Wahl, Inka; Berghöfer, Anne; Nolte, Sandra; Pieh, Christoph; Rose, Matthias; Fischer, Felix

2016-03-01

To investigate the validity of a common depression metric in independent samples. We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple, for example, using a Web application (http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures. Copyright © 2016 Elsevier Inc. All rights reserved.
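The generalized partial credit model mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard form of the model, in which the probability of category k is proportional to the exponential of the running sum of a(θ − b_v) over the first k steps. The function names and all parameter values are hypothetical; they are not the common-metric parameters or the reestimated parameters from the study.

```python
import math

def gpcm_category_probs(theta, discrimination, thresholds):
    """Return P(X = k | theta) for k = 0..m under a generalized partial credit model.

    theta          -- latent trait value (e.g. depression severity)
    discrimination -- item slope a (hypothetical value in the demo below)
    thresholds     -- step parameters b_1..b_m (hypothetical values)
    """
    # Cumulative logits: 0 for category 0, then running sums of a * (theta - b_v).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + discrimination * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_item_score(theta, discrimination, thresholds):
    """Expected score on the item: sum over categories of k * P(X = k | theta)."""
    probs = gpcm_category_probs(theta, discrimination, thresholds)
    return sum(k * p for k, p in enumerate(probs))

if __name__ == "__main__":
    # A four-category item (scored 0 to 3); the parameters are made up for illustration.
    for theta in (-2.0, 0.0, 2.0):
        probs = gpcm_category_probs(theta, 1.5, [-0.5, 0.8, 1.9])
        print(theta, [round(p, 3) for p in probs])
```

With a single threshold the function reduces to the two-parameter logistic model, which is a quick way to sanity-check an implementation.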
Maintenance of item and order information in verbal working memory.

Science.gov (United States)

Camos, Valérie; Lagner, Prune; Loaiza, Vanessa M

2017-09-01

Although verbal recall of item and order information is well-researched in short-term memory paradigms, there is relatively little research concerning item and order recall from working memory. The following study examined whether manipulating the opportunity for attentional refreshing and articulatory rehearsal in a complex span task differently affected the recall of item- and order-specific information of the memoranda. Five experiments varied the opportunity for articulatory rehearsal and attentional refreshing in a complex span task, but the type of recall was manipulated between experiments (item and order, order only, and item only recall). The results showed that impairing attentional refreshing and articulatory rehearsal similarly affected recall regardless of whether the scoring procedure (Experiments 1 and 4) or recall requirements (Experiments 2, 3, and 5) reflected item- or order-specific recall. This implies that both mechanisms sustain the maintenance of item and order information, and suggests that the common cumulative functioning of these two mechanisms to maintain items could be at the root of order maintenance.
The Dysexecutive Questionnaire advanced: item and test score characteristics, 4-factor solution, and severity classification.

Science.gov (United States)

Bodenburg, Sebastian; Dopslaff, Nina

2008-01-01

The Dysexecutive Questionnaire (DEX; Behavioral assessment of the dysexecutive syndrome, 1996) is a standardized instrument to measure possible behavioral changes as a result of the dysexecutive syndrome. Although initially intended only as a qualitative instrument, the DEX has also been used increasingly to address quantitative problems. Until now there have been no fundamental statistical analyses of the questionnaire's testing quality. The present study is based on an unselected sample of 191 patients with acquired brain injury and reports on the data relating to the quality of the items, the reliability and the factorial structure of the DEX. Item 3 displayed too great an item difficulty, whereas item 11 was not sufficiently discriminating. The DEX's reliability in self-rating is r = 0.85. In addition to presenting the statistical values of the tests, a clinical severity classification of the overall scores of the 4 factors found and of the questionnaire as a whole is carried out on the basis of quartile standards.
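The item statistics named in this abstract (item difficulty, item discrimination, reliability) can be sketched with classical test theory formulas. This is an illustration only: it assumes difficulty is the mean item score divided by the maximum score, discrimination is the corrected item-total correlation, and reliability is Cronbach's alpha, which may not be the coefficients the authors used. The response data are invented.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def item_analysis(responses, max_score):
    """responses: one row per respondent, each row a list of item scores."""
    n_items = len(responses[0])
    results = []
    for i in range(n_items):
        column = [row[i] for row in responses]
        # Rest score: total score with the item itself left out ("corrected").
        rest = [sum(row) - row[i] for row in responses]
        results.append({
            "item": i + 1,
            "difficulty": mean(column) / max_score,
            "discrimination": pearson(column, rest),
        })
    return results

def cronbach_alpha(responses):
    """Cronbach's alpha from item variances and the variance of total scores."""
    n_items = len(responses[0])
    columns = [[row[i] for row in responses] for i in range(n_items)]
    item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - item_var / total_var)

if __name__ == "__main__":
    # Hypothetical ratings on a 0-4 scale: 5 respondents, 4 items.
    data = [[0, 1, 2, 1], [3, 4, 3, 4], [1, 1, 0, 2], [4, 3, 4, 3], [2, 2, 1, 2]]
    for row in item_analysis(data, max_score=4):
        print(row)
    print("alpha:", round(cronbach_alpha(data), 3))
```

An item whose corrected item-total correlation is close to zero contributes little to distinguishing respondents, which is the sense in which an item is "not sufficiently discriminating".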
Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

Science.gov (United States)

Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

2006-11-01

We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
General mixture item response models with different item response structures: Exposition with an application to Likert scales.

Science.gov (United States)

Tijmstra, Jesper; Bolsinova, Maria; Jeon, Minjeong

2018-01-10

This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person's responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category ("neither agree nor disagree") is taken to be qualitatively similar to the other categories, and is taken to provide information about the person's endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person's willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.
Scale construction utilising the Rasch unidimensional measurement model: A measurement of adolescent attitudes towards abortion

Directory of Open Access Journals (Sweden)

Jacqueline Hendriks

2012-05-01

Background: Measurement scales seeking to quantify latent traits like attitudes are often developed using traditional psychometric approaches. Application of the Rasch unidimensional measurement model may complement or replace these techniques, as the model can be used to construct scales and check their psychometric properties. If data fit the model, then a scale with invariant measurement properties, including interval-level scores, will have been developed. Aims: This paper highlights the unique properties of the Rasch model. Items developed to measure adolescent attitudes towards abortion are used to exemplify the process. Method: Ten attitude and intention items relating to abortion were answered by 406 adolescents aged 12 to 19 years, as part of the "Teen Relationships Study". The sampling framework captured a range of sexual and pregnancy experiences. Items were assessed for fit to the Rasch model, including checks for Differential Item Functioning (DIF) by gender, sexual experience or pregnancy experience. Results: Rasch analysis of the original dataset initially demonstrated that some items did not fit the model. Rescoring of one item (B5) and removal of another (L31) resulted in fit, as shown by a non-significant item-trait interaction total chi-square and a mean log residual fit statistic for items of -0.05 (SD=1.43). No DIF existed for the revised scale. However, items did not distinguish as well amongst persons with the most intense attitudes as they did for other persons. A person separation index of 0.82 indicated good reliability. Conclusion: Application of the Rasch model produced a valid and reliable scale measuring adolescent attitudes towards abortion, with stable measurement properties. The Rasch process provided an extensive range of diagnostic information concerning item and person fit, enabling changes to be made to scale items. This example shows the value of the Rasch model in developing scales for both social science and health disciplines.
QA in the procurement of items and services

International Nuclear Information System (INIS)

Wilhelm, H.

1980-01-01

Procurement of items and services is one of the important elements during the design and construction of Nuclear Power Plants. The purchaser has to establish and implement controls over the procurement process to ensure that the quality criteria, quality level and other quality requirements specified for the particular item or service are taken into account. The effect on safety of an error in service or the malfunction of an item is the most important factor to be considered in determining the extent of quality assurance efforts. A typical example of a procurement process will be demonstrated for safety related mechanical components. (orig./RW)

Applying Item Response Theory methods to design a learning progression-based science assessment

Science.gov (United States)

Chen, Jing

Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats such as the Constructed Response (CR) items, Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all
Item and response-category functioning of the Persian version of the KIDSCREEN-27: Rasch partial credit model

Directory of Open Access Journals (Sweden)

Jafari Peyman

2012-10-01

Background: The purpose of the study was to determine whether the Persian version of the KIDSCREEN-27 has the optimal number of response categories to measure health-related quality of life (HRQoL) in children and adolescents. Moreover, we aimed to determine whether all the items contributed adequately to their own domain. Findings: The Persian version of the KIDSCREEN-27 was completed by 1083 school children and 1070 of their parents. The Rasch partial credit model (PCM) was used to investigate item statistics and ordering of response categories. The PCM showed that no item was misfitting. The PCM also revealed that successive response categories for all items were located in the expected order, except for category 1 in self- and proxy-reports. Conclusions: Although Rasch analysis confirms that all the items belong to their own underlying construct, response categories should be reorganized and evaluated in further studies, especially in children with chronic conditions.
Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults

Directory of Open Access Journals (Sweden)

Shinichiro Tomitaka

2017-02-01

Background: Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods: Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The pattern of total score distribution and item responses were analyzed using graphical analysis and an exponential regression model. Results: The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while the “none of the time” response was not related to this exponential pattern. Discussion: The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
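An exponential pattern in a score distribution shows up as a straight line when the logarithm of the frequency is plotted against the score. The sketch below fits that line by ordinary least squares and reports the rate parameter. It is one simple way to do such a fit and is not necessarily the regression procedure used in the study; the function name and the frequency counts are hypothetical.

```python
import math

def fit_exponential(scores, counts):
    """Least-squares line through (score, ln count).

    Returns (rate, intercept) such that count is approximately
    exp(intercept) * exp(-rate * score). Zero counts are skipped
    because their logarithm is undefined.
    """
    pairs = [(s, math.log(c)) for s, c in zip(scores, counts) if c > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept

if __name__ == "__main__":
    # Invented frequencies for total scores 1..10; the lowest score is left out
    # because the abstract notes the pattern does not hold at the lower end.
    scores = list(range(1, 11))
    counts = [520, 390, 300, 215, 160, 125, 90, 70, 50, 38]
    rate, intercept = fit_exponential(scores, counts)
    print("rate:", round(rate, 3), "intercept:", round(intercept, 3))
```

Comparing the fitted rate across subsamples is one way to check the "similar rate parameters" described in the results.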
What Does a Verbal Test Measure? A New Approach to Understanding Sources of Item Difficulty.

Science.gov (United States)

Berk, Eric J. Vanden; Lohman, David F.; Cassata, Jennifer Coyne

Assessing the construct relevance of mental test results continues to present many challenges, and it has proven to be particularly difficult to assess the construct relevance of verbal items. This study was conducted to gain a better understanding of the conceptual sources of verbal item difficulty using a unique approach that integrates…
Evaluation of the Psychometric Properties of the Asian Adolescent Depression Scale and Construction of a Short Form: An Item Response Theory Analysis.

Science.gov (United States)

Lo, Barbara Chuen Yee; Zhao, Yue; Kwok, Alice Wai Yee; Chan, Wai; Chan, Calais Kin Yuen

2017-07-01

The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in the Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.
Evaluation of the Relative Validity and Test-Retest Reliability of a 15-Item Beverage Intake Questionnaire in Children and Adolescents.

Science.gov (United States)

Hill, Catelyn E; MacDougall, Carly R; Riebl, Shaun K; Savla, Jyoti; Hedrick, Valisa E; Davy, Brenda M

2017-11-01

-item BEVQ provides results that are similar relative to multiple 24HRs for determining habitual milk and total beverage intake in children, and water and SSB intake in adolescents. The 15-item BEVQ is a reliable indicator of habitual beverage intake in both children and adolescents. Future studies could explore whether adjustments to BEVQ beverage categories, portion size, and format could improve the tool's ability to measure beverage intake in young populations. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
Development of six PROMIS pediatrics proxy-report item banks.

Science.gov (United States)

Irwin, Debra E; Gross, Heather E; Stucky, Brian D; Thissen, David; DeWitt, Esi Morgan; Lai, Jin Shei; Amtmann, Dagmar; Khastou, Leyla; Varni, James W; DeWalt, Darren A

2012-02-22

Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily asthma, which was diagnosed or treated within 6
Infants’ Visual Recognition Memory for a Series of Categorically Related Items

Science.gov (United States)

Oakes, Lisa M.; Kovack-Lesh, Kristine A.

2013-01-01

Six-month-old infants' ("N" = 168) memory for individual items in a categorized list (e.g., images of dogs or cats) was examined to investigate the interactions between visual recognition memory, working memory, and categorization. In Experiments 1 and 2, infants were familiarized with six different cats or dogs, presented one at a time…
Perception that "everything requires a lot of effort": transcultural SCL-25 item validation.

Science.gov (United States)

Moreau, Nicolas; Hassan, Ghayda; Rousseau, Cécile; Chenguiti, Khalid

2009-09-01

This brief report illustrates how the migration context can affect specific item validity of mental health measures. The SCL-25 was administered to 432 recently settled immigrants (220 Haitian and 212 Arabs). We performed descriptive analyses, as well as Infit and Outfit statistics analyses using WINSTEPS Rasch Measurement Software based on Item Response Theory. The participants' comments about the item You feel everything requires a lot of effort in the SCL-25 were also qualitatively analyzed. Results revealed that the item You feel everything requires a lot of effort is an outlier and does not adjust in an expected and valid fashion with its cluster items, as it is over-endorsed by Haitian and Arab healthy participants. Our study thus shows that, in transcultural mental health research, the cultural and migratory contexts may interact and significantly influence the meaning of some symptom items and consequently, the validity of symptom scales.
Quantum partial search for uneven distribution of multiple target items

Science.gov (United States)

Zhang, Kun; Korepin, Vladimir

2018-06-01

The quantum partial search algorithm is an approximate search: it aims to find a target block (a block that contains target items) and runs a little faster than full Grover search. In this paper, we consider the quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different numbers of target items). The algorithm we describe can locate one of the target blocks. The efficiency of the algorithm is measured by the number of queries to the oracle. We optimize the algorithm in order to improve efficiency. Using a perturbation method, we find that the algorithm runs fastest when target items are evenly distributed in the database.
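For orientation, the full Grover search that this abstract uses as its baseline can be simulated classically on a small database. The sketch below is not the partial search algorithm of the paper. It only shows how the success probability of full Grover search depends on the number of oracle queries, using the textbook result that after k iterations the probability is sin²((2k+1)θ) with sin θ = √(M/N) for M targets among N items. All function names are hypothetical.

```python
import math

def grover_success_probability(n_items, n_targets, iterations):
    """Closed-form success probability after a given number of Grover iterations."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

def simulate_grover(n_items, targets, iterations):
    """State-vector simulation with real amplitudes; targets are distinct indices."""
    amp = [1 / math.sqrt(n_items)] * n_items
    for _ in range(iterations):
        # Oracle query: flip the sign of every target amplitude.
        for t in targets:
            amp[t] = -amp[t]
        # Diffusion step: reflect every amplitude about the mean amplitude.
        avg = sum(amp) / n_items
        amp = [2 * avg - a for a in amp]
    # Probability of measuring any one of the target items.
    return sum(amp[t] ** 2 for t in targets)

def optimal_iterations(n_items, n_targets):
    """Usual choice for the number of iterations: floor(pi / (4 * theta))."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.floor(math.pi / (4 * theta))

if __name__ == "__main__":
    n, targets = 64, [3, 17, 42]
    k = optimal_iterations(n, len(targets))
    print("iterations:", k)
    print("simulated:", simulate_grover(n, targets, k))
    print("formula:  ", grover_success_probability(n, len(targets), k))
```

Each loop pass costs one oracle query, so the iteration count is the efficiency measure the abstract refers to.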
A comparison of item response models for accuracy and speed of item responses with applications to adaptive testing.

Science.gov (United States)

van Rijn, Peter W; Ali, Usama S

2017-05-01

We compare three modelling frameworks for accuracy and speed of item responses in the context of adaptive testing. The first framework is based on modelling scores that result from a scoring rule that incorporates both accuracy and speed. The second framework is the hierarchical modelling approach developed by van der Linden (2007, Psychometrika, 72, 287) in which a regular item response model is specified for accuracy and a log-normal model for speed. The third framework is the diffusion framework in which the response is assumed to be the result of a Wiener process. Although the three frameworks differ in the relation between accuracy and speed, one commonality is that the marginal model for accuracy can be simplified to the two-parameter logistic model. We discuss both conditional and marginal estimation of model parameters. Models from all three frameworks were fitted to data from a mathematics and spelling test. Furthermore, we applied a linear and adaptive testing mode to the data off-line in order to determine differences between modelling frameworks. It was found that a model from the scoring rule framework outperformed a hierarchical model in terms of model-based reliability, but the results were mixed with respect to correlations with external measures. © 2017 The British Psychological Society.
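The two-parameter logistic model that the abstract names as the common marginal model for accuracy can be sketched briefly, together with its item information function. The item-selection rule shown here (pick the unused item with maximum information at the current ability estimate) is a common adaptive testing heuristic and is an assumption of this sketch, not a procedure stated in the abstract. The item bank and its parameters are hypothetical.

```python
import math

def p_correct_2pl(theta, a, b):
    """Probability of a correct response: 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_item(theta_estimate, item_bank, administered):
    """Return the name of the unused item that is most informative at theta_estimate.

    item_bank maps an item name to a (discrimination, difficulty) pair.
    """
    candidates = [name for name in item_bank if name not in administered]
    return max(candidates,
               key=lambda name: item_information(theta_estimate, *item_bank[name]))

if __name__ == "__main__":
    # Made-up item bank: name -> (a, b).
    bank = {"easy": (1.0, -2.0), "mid": (1.2, 0.0), "hard": (0.9, 2.0)}
    used = set()
    for theta_hat in (0.0, 1.5, -1.0):
        chosen = pick_next_item(theta_hat, bank, used)
        used.add(chosen)
        print(theta_hat, "->", chosen)
```

In a linear (fixed-form) test every respondent would instead receive the items in the same order, which is the contrast drawn in the abstract.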
Is a single item stress measure independently associated with subsequent severe injury: a prospective cohort study of 16,385 forest industry employees.

Science.gov (United States)

Salminen, Simo; Kouvonen, Anne; Koskinen, Aki; Joensuu, Matti; Väänänen, Ari

2014-06-02

A previous review showed that high stress increases the risk of occupational injury by three- to five-fold. However, most of the prior studies have relied on short follow-ups. In this prospective cohort study we examined the effect of stress on recorded hospitalised injuries in an 8-year follow-up. A total of 16,385 employees of a Finnish forest company responded to the questionnaire. Perceived stress was measured with a validated single-item measure, and analysed in relation to recorded hospitalised injuries from 1986 to 2008. We used Cox proportional hazard regression models to examine the prospective associations between work stress, injuries and confounding factors. Highly stressed participants were approximately 40% more likely to be hospitalised due to injury over the follow-up period than participants with low stress. This association remained significant after adjustment for age, gender, marital status, occupational status, educational level, and physical work environment. High stress is associated with an increased risk of severe injury.
Why item parcels are (almost) never appropriate: two wrongs do not make a right--camouflaging misspecification with item parcels in CFA models.

Science.gov (United States)

Marsh, Herbert W; Lüdtke, Oliver; Nagengast, Benjamin; Morin, Alexandre J S; Von Davier, Matthias

2013-09-01

The present investigation has a dual focus: to evaluate problematic practice in the use of item parcels and to suggest exploratory structural equation models (ESEMs) as a viable alternative to the traditional independent clusters confirmatory factor analysis (ICM-CFA) model (with no cross-loadings, subsidiary factors, or correlated uniquenesses). Typically, it is ill-advised to (a) use item parcels when ICM-CFA models do not fit the data, and (b) retain ICM-CFA models when items cross-load on multiple factors. However, the combined use of (a) and (b) is widespread and often provides such misleadingly good fit indexes that applied researchers might believe that misspecification problems are resolved--that 2 wrongs really do make a right. Taking a pragmatist perspective, in 4 studies we demonstrate with responses to the Rosenberg Self-Esteem Inventory (Rosenberg, 1965), Big Five personality factors, and simulated data that even small cross-loadings seriously distort relations among ICM-CFA constructs or even decisions on the number of factors; although obvious in item-level analyses, this is camouflaged by the use of parcels. ESEMs provide a viable alternative to ICM-CFAs and a test for the appropriateness of parcels. The use of parcels with an ICM-CFA model is most justifiable when the fit of both ICM-CFA and ESEM models is acceptable and equally good, and when substantively important interpretations are similar. However, if the ESEM model fits the data better than the ICM-CFA model, then the use of parcels with an ICM-CFA model typically is ill-advised--particularly in studies that are also interested in scale development, latent means, and measurement invariance.
Measuring Relational Reasoning

Science.gov (United States)

Alexander, Patricia A.; Dumas, Denis; Grossnickle, Emily M.; List, Alexandra; Firetto, Carla M.

2016-01-01

Relational reasoning is the foundational cognitive ability to discern meaningful patterns within an informational stream, but its reliable and valid measurement remains problematic. In this investigation, the measurement of relational reasoning unfolded in three stages. Stage 1 entailed the establishment of a research-based conceptualization of…
SHIPPING OF RADIOACTIVE ITEMS

CERN Multimedia

TIS/RP Group

2001-01-01

The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.
Improving the measurement of health-related quality of life in adolescent with idiopathic scoliosis: the SRS-7, a Rasch-developed short form of the SRS-22 questionnaire.

Science.gov (United States)

Caronni, Antonio; Zaina, Fabio; Negrini, Stefano

2014-04-01

The Scoliosis Research Society-22 (SRS-22) questionnaire was developed to evaluate health-related quality of life (HRQL) in adolescent idiopathic scoliosis (AIS) patients. Rasch analysis (RA) is a statistical procedure which turns questionnaire ordinal scores into interval measures. Measures from Rasch-compatible questionnaires can be used, similar to body temperature or blood pressure, to quantify disease severity progression and treatment efficacy. The purpose of the current work is to present an RA of the SRS-22 questionnaire and to develop an SRS-22 Rasch-approved short form. A total of 300 SRS-22 questionnaires were randomly collected from 2447 consecutive IS adolescents at their first evaluation (229 females; 13.9 ± 1.9 years; 26.9 ± 14.7 Cobb°) in a scoliosis outpatient clinic. RA showed both disordered thresholds and overall misfit of the SRS-22. Sixteen items were re-scored and two misfitting items (6 and 14) removed to obtain a Rasch-compatible questionnaire. Participants' HRQL measured too high with the rearranged questionnaire, indicating a severe SRS-22 ceiling effect. RA also highlighted SRS-22 multidimensionality, with pain/function not merging with self-image/mental health items. Item 3 showed differential item functioning (DIF) for both curve and hump amplitude. A 7-item questionnaire (SRS-7) was prepared by selecting single items from the original SRS-22. SRS-7 showed fit to the model, unidimensionality and no DIF. Compared with the SRS-22, the short form scale shows better targeting of the participants' population. RA shows that SRS-22 has poor clinimetric properties; moreover, when used with AIS at first evaluation, SRS-22 is affected by a severe ceiling effect. SRS-7, an SRS-22 7-item short form questionnaire, provides an HRQL interval measure better tailored to these participants. Copyright © 2014 Elsevier Ltd. All rights reserved.
An item response theory analysis of Harter's Self-Perception Profile for children or why strong clinical scales should be distrusted.

Science.gov (United States)

Egberink, Iris J L; Meijer, Rob R

2011-06-01

The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343) were compared. The authors found that most scales formed weak scales and that measurement precision was relatively low and only present for latent trait values indicating low self-perception. The subscales Physical Appearance and Global Self-Worth formed one strong scale. Children seem to interpret Global Self-Worth items as if they measure Physical Appearance. Furthermore, the authors found that strong Mokken scales (such as Global Self-Worth) consisted mostly of items that repeat the same item content. They conclude that researchers should be very careful in interpreting the total scores on the different Self-Perception Profile for Children scales. Finally, implications for further research are discussed.
Introducing the Body-QoL®: A New Patient-Reported Outcome Instrument for Measuring Body Satisfaction-Related Quality of Life in Aesthetic and Post-bariatric Body Contouring Patients.

Science.gov (United States)

Danilla, Stefan; Cuevas, Pedro; Aedo, Sócrates; Dominguez, Carlos; Jara, Rocío; Calderón, María E; Al-Himdani, Sarah; Rios, Marco A; Taladriz, Cristián; Rodriguez, Diego; Gonzalez, Rolando; Lazo, Ángel; Erazo, Cristián; Benitez, Susana; Andrades, Patricio; Sepúlveda, Sergio

2016-02-01

To develop a new patient-reported outcome instrument (PRO) to measure body-related satisfaction quality of life (QoL). Standard 3-phase PRO design was followed; in the first phase, a qualitative design was used in 45 patients to develop a conceptual framework and to create preliminary scale domains and items. In phase 2, large-scale population testing on 1340 subjects was performed to reduce items and domains. In phase 3, final testing of the developed instrument on 34 patients was performed. Statistics used include Factor, RASCH, and multivariate regression analysis. Psychometric properties measured were internal reliability, item-rest, item-test, and test-retest correlations. The PRO-developed instrument is composed of four domains (satisfaction with the abdomen, sex life, self-esteem and social life, and physical symptoms) and 20 items in total. The score can range from 20 (worst) to 100 (best). Responsiveness was 100 %, internal reliability 93.3 %, and test-retest concordance 97.7 %. Body image-related QoL was superior in men than women (p […] instrument (p […] measured reliably with the Body-QoL instrument. It can be used to quantify the improvement in cosmetic and post-bariatric patients including non- or minimally invasive procedures, suction assisted lipectomy, abdominoplasty, lipoabdominoplasty, and lower body lift and to give an evidence-based approach to standard practice.
Measuring single constructs by single items: Constructing an even shorter version of the "Short Five" personality inventory.

Directory of Open Access Journals (Sweden)

Kenn Konstabel

The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item "Short Five" (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the
Binary classification of items of interest in a repeatable process

Science.gov (United States)

Abell, Jeffrey A.; Spicer, John Patrick; Wincek, Michael Anthony; Wang, Hui; Chakraborty, Debejyo

2014-06-24

A system includes host and learning machines in electrical communication with sensors positioned with respect to an item of interest, e.g., a weld, and memory. The host executes instructions from memory to predict a binary quality status of the item. The learning machine receives signals from the sensor(s), identifies candidate features, and extracts features from the candidates that are more predictive of the binary quality status relative to other candidate features. The learning machine maps the extracted features to a dimensional space that includes most of the items from a passing binary class and excludes all or most of the items from a failing binary class. The host also compares the received signals for a subsequent item of interest to the dimensional space to thereby predict, in real time, the binary quality status of the subsequent item of interest.

Dissociation between source and item memory in Parkinson's disease

Institute of Scientific and Technical Information of China (English)

Hu Panpan; Li Youhai; Ma Huijuan; Xi Chunhua; Chen Xianwen; Wang Kai

2014-01-01

Background Episodic memory includes information about item memory and source memory. Many studies support the hypothesis that these two memory systems are implemented by different brain structures. The aim of this study was to investigate the characteristics of item memory and source memory processing in patients with Parkinson's disease (PD), and to further verify the hypothesis of a dual-process model of source and item memory. Methods We established a neuropsychological battery to measure the performance of item memory and source memory. A total of 35 PD individuals and 35 matched healthy controls (HC) were administered the battery. The item memory task consists of the learning and recognition of high-frequency national Chinese characters; the source memory task consists of the learning and recognition of three modes (character, picture, and image) of objects. Results Compared with the controls, the idiopathic PD patients had impaired source memory (PD vs. HC: 0.65±0.06 vs. 0.72±0.09, P=0.001), but were not impaired in item memory (PD vs. HC: 0.65±0.07 vs. 0.67±0.08, P=0.240). Conclusions The present experiment provides evidence for dissociation between item and source memory in PD patients, thereby strengthening the claim that item and source memory rely on different brain structures. PD patients show poor source memory, in which dopamine plays a critical role.
Negative affectivity and social inhibition in cardiovascular disease: evaluating type-D personality and its assessment using item response theory.

Science.gov (United States)

Emons, Wilco H M; Meijer, Rob R; Denollet, Johan

2007-07-01

Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI), referred to as type-D personality, are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The objectives of this study were (a) to evaluate the relative contribution of individual items to the measurement precision at the cutoff to distinguish type-D from non-type-D personality and (b) to investigate the comparability of NA, SI, and type-D constructs across the general population and clinical populations. Data from representative samples including 1316 respondents from the general population, 427 respondents diagnosed with coronary heart disease, and 732 persons suffering from hypertension were analyzed using the graded response IRT model. In Study 1, the information functions obtained in the IRT analysis showed that (a) all items had highest measurement precision around the cutoff and (b) items are most informative at the higher end of the scale. In Study 2, the IRT analysis showed that measurements were fairly comparable across the general population and clinical populations. The DS14 adequately measures NA and SI, with highest reliability in the trait range around the cutoff. The DS14 is a valid instrument to assess and compare type-D personality across clinical groups.
Random selection of items. Selection of n1 samples among N items composing a stratum

International Nuclear Information System (INIS)

Jaech, J.L.; Lemaire, R.J.

1987-02-01

STR-224 provides generalized procedures to determine required sample sizes, for instance in the course of a Physical Inventory Verification at Bulk Handling Facilities. The present report describes procedures to generate random numbers and select groups of items to be verified in a given stratum through each of the measurement methods involved in the verification. (author). 3 refs
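As an illustration of the kind of procedure this record describes, the following is a minimal sketch of drawing n1 items without replacement from a stratum of N items. It is not the procedure from the report itself: the function name `select_items`, the 1-based item numbering, and the use of Python's standard pseudo-random generator are all assumptions made for the example.

```python
import random


def select_items(stratum_size, sample_size, seed=None):
    """Return sample_size distinct item numbers drawn from 1..stratum_size.

    Hypothetical helper: items are assumed to be numbered 1..N within
    the stratum, and every subset of size n1 is equally likely.
    """
    if not 0 <= sample_size <= stratum_size:
        raise ValueError("sample_size must be between 0 and stratum_size")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # random.sample draws without replacement.
    chosen = rng.sample(range(1, stratum_size + 1), sample_size)
    return sorted(chosen)


if __name__ == "__main__":
    # Example: verify 8 items out of a stratum of 120.
    print(select_items(120, 8, seed=2024))
```

Recording the seed alongside the selected item numbers allows the selection to be reproduced later, which may matter when a verification has to be audited.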
Evolution of a Test Item

Science.gov (United States)

Spaan, Mary

2007-01-01

This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…
Criterion-related validity of the foot health status questionnaire regarding strength and plantar pressure measurements in elderly people.

Science.gov (United States)

Cuesta-Vargas, Antonio I; Galan-Mercant, Alejandro; Martín-Borras, Maria Carmen; González-Sánchez, Manuel

2012-12-01

Criterion-related validity of a self-administered questionnaire listed as gold standard requires objective testing. The aim of this study was to analyze the Foot Health Status Questionnaire (FHSQ) using functional variable measures (dynamic plantar pressure and foot strength). A total of 22 elderly healthy participants (13 women and 9 men) were screened by interview and physical examination for foot or gait abnormalities. Foot strength, footprint pressure, and foot health status were measured. All the items of the FHSQ show significant correlation with functional variables, but general foot health shows the highest correlation with the 4 physical variables related to plantar pressure (R2 = 0.741), followed by foot pain (R2 = 0.652). A set of different, directly measured physical variables related to foot strength and plantar pressure significantly correlate with the FHSQ dimensions. Cross-sectional trial.
Measuring health-related quality of life: psychometric evaluation of the Tunisian version of the SF-12 health survey.

Science.gov (United States)

Younsi, Moheddine; Chakroun, Mohamed

2014-09-01

The 12-item short-form health survey (SF-12) was developed as a shorter alternative to the SF-36 for use in large-scale studies as an applicable instrument for measuring health-related quality of life. The main purpose of this study was to evaluate the psychometric properties of the Tunisian version of the SF-12. A stratified representative sample (N = 3,582) of the general Tunisian population aged 18 years and over was interviewed. SF-12 summary scores were derived using the standard US algorithm. Factor analysis was used to confirm the hypothesized component structure of the SF-12 items. Reliability was estimated using internal consistency, and construct validity was investigated with "known groups" validity testing and via convergent and divergent validity. SF-12 summary scores distinguished well, and in the expected manner, between groups of respondents on the basis of gender, age, education and socioeconomic status, thus providing evidence of construct validity. Mean scores in the total sample were 50.11 (SD 8.53) for the physical component summary (PCS) score and 47.96 (SD 9.82) for the mental component summary (MCS) score. The results showed satisfactory internal consistency and acceptable convergent validity for both summary scores. Cronbach's α coefficient for PCS-12 and MCS-12 was 0.73 and 0.72, respectively. Known groups comparison showed that the SF-12 discriminated well between groups of respondents on the basis of gender, age, education and socioeconomic status. In addition, no floor or ceiling effects at baseline were observed. The PCA confirmed the two-factor structure of the SF-12 items. Items belonging to the physical component correlated more strongly with the PCS-12 than those with the MCS-12. Similarly, items belonging to the mental component correlated more strongly with the MCS-12 than those with the PCS-12. The findings suggest that the SF-12 appears to be a valid and reliable measure that can be used for measuring of population health
Surveillance indicators for potential reduced exposure products (PREPs): developing survey items to measure awareness

Directory of Open Access Journals (Sweden)

McNeill Ann

2009-10-01

Abstract Background Over the past decade, tobacco companies have introduced cigarettes and smokeless tobacco products (known as Potential Reduced Exposure Products, PREPs) with purportedly lower levels of some toxins than conventional cigarettes and smokeless products. It is essential that public health agencies monitor awareness, interest, use, and perceptions of these products so that their impact on population health can be detected at the earliest stages. Methods This paper reviews and critiques existing strategies for measuring awareness of PREPs from 16 published and unpublished studies. From these measures, we developed new surveillance items and subjected them to two rounds of cognitive testing, a common and accepted method for evaluating questionnaire wording. Results Our review suggests that high levels of awareness of PREPs reported in some studies are likely to be inaccurate. Two likely sources of inaccuracy in awareness measures were identified: 1) the tendency of respondents to misclassify "no additive" and "natural" cigarettes as PREPs and 2) the tendency of respondents to mistakenly report awareness as a result of confusion between PREPs brands and similarly named familiar products, for example, Eclipse chewing gum and Accord automobiles. Conclusion After evaluating new measures with cognitive interviews, we conclude that as of winter 2006, awareness of reduced exposure products among U.S. smokers was likely to be between 1% and 8%, with the higher estimates for some products occurring in test markets. Recommended measurement strategies for future surveys are presented.
Surveillance indicators for potential reduced exposure products (PREPs): developing survey items to measure awareness

Science.gov (United States)

Bogen, Karen; Biener, Lois; Garrett, Catherine A; Allen, Jane; Cummings, K Michael; Hartman, Anne; Marcus, Stephen; McNeill, Ann; O'Connor, Richard J; Parascandola, Mark; Pederson, Linda

2009-01-01

Background Over the past decade, tobacco companies have introduced cigarettes and smokeless tobacco products (known as Potential Reduced Exposure Products, PREPs) with purportedly lower levels of some toxins than conventional cigarettes and smokeless products. It is essential that public health agencies monitor awareness, interest, use, and perceptions of these products so that their impact on population health can be detected at the earliest stages. Methods This paper reviews and critiques existing strategies for measuring awareness of PREPs from 16 published and unpublished studies. From these measures, we developed new surveillance items and subjected them to two rounds of cognitive testing, a common and accepted method for evaluating questionnaire wording. Results Our review suggests that high levels of awareness of PREPs reported in some studies are likely to be inaccurate. Two likely sources of inaccuracy in awareness measures were identified: 1) the tendency of respondents to misclassify "no additive" and "natural" cigarettes as PREPs and 2) the tendency of respondents to mistakenly report awareness as a result of confusion between PREPs brands and similarly named familiar products, for example, Eclipse chewing gum and Accord automobiles. Conclusion After evaluating new measures with cognitive interviews, we conclude that as of winter 2006, awareness of reduced exposure products among U.S. smokers was likely to be between 1% and 8%, with the higher estimates for some products occurring in test markets. Recommended measurement strategies for future surveys are presented. PMID:19840394
Evaluation of the Multiple Sclerosis Walking Scale-12 (MSWS-12) in a Dutch sample: Application of item response theory.

Science.gov (United States)

Mokkink, Lidwine Brigitta; Galindo-Garre, Francisca; Uitdehaag, Bernard Mj

2016-12-01

The Multiple Sclerosis Walking Scale-12 (MSWS-12) measures walking ability from the patients' perspective. We examined the quality of the MSWS-12 using an item response theory model, the graded response model (GRM). A total of 625 unique Dutch multiple sclerosis (MS) patients were included. After testing for unidimensionality, monotonicity, and absence of local dependence, a GRM was fit and item characteristics were assessed. Differential item functioning (DIF) for the variables gender, age, duration of MS, type of MS and severity of MS, reliability, total test information, and standard error of the trait level (θ) were investigated. Confirmatory factor analysis showed a unidimensional structure of the 12 items of the scale, explaining 88% of the variance. Item 2 did not fit into the GRM model. Reliability was 0.93. Items 8 and 9 (of the 11- and 12-item versions, respectively) showed DIF on the variable severity, based on the Expanded Disability Status Scale (EDSS). However, the EDSS is strongly related to the content of both items. Our results confirm the good quality of the MSWS-12. The trait level (θ) scores and item parameters of both the 12- and 11-item versions were highly comparable, although we do not suggest changing the content of the MSWS-12. © The Author(s), 2016.
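For readers unfamiliar with the graded response model mentioned in this record, the sketch below shows how category probabilities are commonly computed in Samejima's formulation: each cumulative probability P(X ≥ k | θ) follows a two-parameter logistic curve, and the probability of a single category is the difference between adjacent cumulative probabilities. The function name and the parameter values in the example are hypothetical and are not taken from the MSWS-12 study.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under a graded response model.

    theta          -- latent trait level of the respondent
    discrimination -- item slope a (assumed > 0)
    thresholds     -- ordered thresholds b_1 < ... < b_{K-1}

    Returns a list of K probabilities, one per response category.
    """
    if discrimination <= 0:
        raise ValueError("discrimination must be positive")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # Cumulative probabilities P(X >= k); the first is 1 and the last is 0
    # by definition.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # Probability of category k = P(X >= k) - P(X >= k + 1).
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Illustrative item with five response categories (made-up parameters).
    for p in grm_category_probs(theta=0.0, discrimination=1.5,
                                thresholds=[-1.0, 0.0, 1.0, 2.0]):
        print(round(p, 3))
```

Because the thresholds are ordered and the slope is positive, the cumulative probabilities decrease monotonically, so every category probability is non-negative and the probabilities sum to one.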
An evaluation of the brief symptom inventory-18 using item response theory: which items are most strongly related to psychological distress?

NARCIS (Netherlands)

Meijer, R.R.; de Vries, Rivka M.; van Bruggen, Vincent

2011-01-01

The psychometric structure of the Brief Symptom Inventory–18 (BSI-18; Derogatis, 2001) was investigated using Mokken scaling and parametric item response theory. Data of 487 outpatients, 266 students, and 207 prisoners were analyzed. Results of the Mokken analysis indicated that the BSI-18 formed a
Using Reversed MFCC and IT-EM for Automatic Speaker Verification

Directory of Open Access Journals (Sweden)

Sheeraz Memon

2012-01-01

This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Coefficients) and IT-EM (Information Theoretic Expectation Maximization). To perform speaker verification, feature extraction using the Mel scale has been widely applied and has established better results. The IMFCC is based on the inverse Mel scale. The IMFCC effectively captures information available at the high-frequency formants, which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at input level is proposed. GMMs (Gaussian Mixture Models) based on EM (Expectation Maximization) have been widely used for classification in text-independent verification. However, EM suffers from convergence issues. In this paper we use our proposed IT-EM, which has faster convergence, to train speaker models. IT-EM uses information theory principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the weights, means and covariances, like EM. However, the IT-EM process is not performed on feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM process at once reduces the divergence measure between PDE estimates of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and IT-EM are tested for the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC has better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector size is also much smaller than the delta MFCC, thus reducing the computational burden as well. The IT-EM method also showed faster convergence than the conventional EM method, and thus it leads to higher speaker recognition scores.
Exploring differential item functioning in the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)

Directory of Open Access Journals (Sweden)

Pollard Beth

2012-12-01

Abstract Background The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or especially age effects, are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the
Psychometric validation and reliability analysis of a Spanish version of the patient satisfaction with cancer-related care measure: a patient navigation research program study.

Science.gov (United States)

Jean-Pierre, Pascal; Fiscella, Kevin; Winters, Paul C; Paskett, Electra; Wells, Kristen; Battaglia, Tracy

2012-09-01

Patient satisfaction (PS), a key measure of quality of cancer care, is a core study outcome of the multi-site National Cancer Institute-funded Patient Navigation Research Program. Despite large numbers of underserved monolingual Spanish speakers (MSS) residing in the USA, there is no validated Spanish measure of PS that spans the whole spectrum of cancer-related care. The present study reports on the validation of the Patient Satisfaction with Cancer Care (PSCC) measure for Spanish (PSCC-Sp) speakers receiving diagnostic and therapeutic cancer-related care. Original PSCC items were professionally translated and back translated to ensure cultural appropriateness, meaningfulness, and equivalence. Then, the resulting 18-item PSCC-Sp measure was administered to 285 MSS. We evaluated latent structure and internal consistency of the PSCC-Sp using principal components analysis (PCA) and Cronbach coefficient alpha (α). We used correlation analyses to demonstrate divergence and convergence of the PSCC-Sp with a Spanish version of the Patient Satisfaction with Interpersonal Relationship with Navigator (PSN-I-Sp) measure and patients' demographics. The PCA revealed a coherent set of items that explains 47% of the variance in PS. Reliability assessment demonstrated that the PSCC-Sp had high internal consistency (α = 0.92). The PSCC-Sp demonstrated good face validity and convergent and divergent validities as indicated by moderate correlations with the PSN-I-Sp (p = 0.003) and nonsignificant correlations with marital status and household income (all p(s) > 0.05). The PSCC-Sp is a valid and reliable measure of PS and should be tested in other MSS populations.
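Several records in this listing report internal consistency as Cronbach's coefficient alpha. As a reminder of what that statistic computes, here is a minimal sketch using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the toy response matrix are assumptions for illustration only and do not reproduce any data from the studies listed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score rows.

    responses -- list of equal-length rows, one row per respondent,
                 one column per item (hypothetical input layout).
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != n_items for row in responses):
        raise ValueError("all rows must have the same number of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses])
                      for i in range(n_items)]
    # Sample variance of the total (sum) score across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Toy data: 5 respondents answering 4 items on a 1-5 scale.
    toy = [
        [4, 5, 4, 5],
        [2, 3, 2, 2],
        [5, 5, 4, 4],
        [1, 2, 1, 2],
        [3, 3, 4, 3],
    ]
    print(round(cronbach_alpha(toy), 3))
```

The sketch uses sample variances (n − 1 denominator) for both the items and the total score; using population variances throughout gives the same alpha, since the ratio is unchanged.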
Measuring health-related quality of life in high-grade glioma patients at the end of life using a proxy-reported retrospective questionnaire

NARCIS (Netherlands)

Sizoo, E.M.; Dirven, L.; Reijneveld, J.C.; Postma, T.J.; Heimans, J.J.; Deliens, L.; Pasman, H.R.W.; Taphoorn, M.J.B.

2014-01-01

To develop, validate, and report on the use of a retrospective proxy-reported questionnaire measuring health-related quality of life (HRQoL) in the end-of-life (EOL) phase of high-grade glioma (HGG) patients. Items relevant for the defined construct were selected using existing questionnaires,
Detection of advance item knowledge using response times in computer adaptive testing

NARCIS (Netherlands)

Meijer, R.R.; Sotaridona, Leonardo

2006-01-01

We propose a new method for detecting item preknowledge in a CAT based on an estimate of “effective response time” for each item. Effective response time is defined as the time required for an individual examinee to answer an item correctly. An unusually short response time relative to the expected
Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

Science.gov (United States)

Yang, Ji Seung; Hansen, Mark; Cai, Li

2012-01-01

Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…
Item bias in self-reported functional ability among 75-year-old men and women in three Nordic localities

DEFF Research Database (Denmark)

Avlund, K; Era, P; Davidsen, M

1996-01-01

The purpose of this article is to analyse item bias in a measure of self-reported functional ability among 75-year-old people in three Nordic localities. The present item bias analysis examines whether the construction of a functional ability index from several variables results in bias in relation to geographical locality and gender. Information about self-reported functional ability was gathered from surveys on 75-year-old men and women in Glostrup (Denmark), Göteborg (Sweden) and Jyväskylä (Finland). The data were collected by structured home interviews about mobility and Physical activities of daily living (PADL) in relation to tiredness, reduced speed and dependency and combined into three tiredness-scales, three reduced speed-scales and two dependency-scales. The analysis revealed item bias regarding geographical locality in seven out of eight of the functional ability scales, but nearly no bias...
Developing Item Response Theory-Based Short Forms to Measure the Social Impact of Burn Injuries.

Science.gov (United States)

Marino, Molly E; Dore, Emily C; Ni, Pengsheng; Ryan, Colleen M; Schneider, Jeffrey C; Acton, Amy; Jette, Alan M; Kazis, Lewis E

2018-03-01

To develop self-reported short forms for the Life Impact Burn Recovery Evaluation (LIBRE) Profile. Short forms based on the item parameters of discrimination and average difficulty. A support network for burn survivors, peer support networks, social media, and mailings. Burn survivors (N=601) older than 18 years. Not applicable. The LIBRE Profile. Ten-item short forms were developed to cover the 6 LIBRE Profile scales: Relationships with Family & Friends, Social Interactions, Social Activities, Work & Employment, Romantic Relationships, and Sexual Relationships. Ceiling effects were ≤15% for all scales; floor effects were item bank, computerized adaptive test, and short forms are all scored along the same metric, and therefore scores are comparable regardless of the mode of administration. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Dimensions of insight in schizophrenia: Exploratory factor analysis of items from multiple self- and interviewer-rated measures of insight.

Science.gov (United States)

Konsztowicz, Susanna; Schmitz, Norbert; Lepage, Martin

2018-03-10

Insight in schizophrenia is regarded as a multidimensional construct that comprises aspects such as awareness of the disorder and recognition of the need for treatment. The proposed number of underlying dimensions of insight is variable in the literature. In an effort to identify a range of existing dimensions of insight, we conducted a factor analysis on combined items from multiple measures of insight. We recruited 165 participants with enduring schizophrenia (treated for >3 years). Exploratory factor analysis was conducted on itemized scores from two interviewer-rated measures of insight: the Schedule for the Assessment of Insight-Expanded and the abbreviated Scale to assess Unawareness of Mental Disorder; and two self-report measures: the Birchwood Insight Scale and the Beck Cognitive Insight Scale. A five-factor solution was selected as the best-fitting model, with the following dimensions of insight: 1) awareness of illness and the need for treatment; 2) awareness and attribution of symptoms and consequences; 3) self-certainty; 4) self-reflectiveness for objectivity and fallibility; and 5) self-reflectiveness for errors in reasoning and openness to feedback. Insight in schizophrenia is a multidimensional construct comprised of distinct clinical and cognitive domains of awareness. Multiple measures of insight, both clinician- and self-rated, are needed to capture all of the existing dimensions of insight. Future exploration of associations between the various dimensions and their potential determinants will facilitate the development of clinically useful models of insight and effective interventions to improve outcome. Copyright © 2018 Elsevier B.V. All rights reserved.

Development of six PROMIS pediatrics proxy-report item banks

Directory of Open Access Journals (Sweden)

Irwin Debra E

2012-02-01

Abstract Background Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. Methods The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Results Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily
Affect, Behavior, Cognition, and Desire in the Big Five: An Analysis of Item Content and Structure

Science.gov (United States)

Wilt, Joshua; Revelle, William

2015-01-01

Personality psychology is concerned with affect (A), behavior (B), cognition (C) and desire (D), and personality traits have been defined conceptually as abstractions used to either explain or summarize coherent ABC (and sometimes D) patterns over time and space. However, this conceptual definition of traits has not been reflected in their operationalization, possibly resulting in theoretical and practical limitations to current trait inventories. Thus, the goal of this project was to determine the affective, behavioral, cognitive and desire (ABCD) components of Big-Five personality traits. The first study assessed the ABCD content of items measuring Big-Five traits in order to determine the ABCD composition of traits and identify items measuring relatively high amounts of only one ABCD content. The second study examined the correlational structure of scales constructed from items assessing ABCD content via a large, web-based study. An assessment of Big-Five traits that delineates ABCD components of each trait is presented, and the discussion focuses on how this assessment builds upon current approaches of assessing personality. PMID:26279606
Psychometrics of a Child Report Measure of Maternal Support following Disclosure of Sexual Abuse.

Science.gov (United States)

Smith, Daniel W; Sawyer, Genelle K; Heck, Nicholas C; Zajac, Kristyn; Solomon, David; Self-Brown, Shannon; Danielson, Carla K; Ralston, M Elizabeth

2017-04-01

The study examined a new child report measure of maternal support following child sexual abuse. One hundred and forty-six mother-child dyads presenting for a forensic evaluation completed assessments including standardized measures of adjustment. Child participants also responded to 32 items considered for inclusion in a new measure, the Maternal Support Questionnaire-Child Report (MSQ-CR). Exploratory factor analysis of the Maternal Support Questionnaire-Child Report resulted in a three factor, 20-item solution: Emotional Support (9 items), Skeptical Preoccupation (5 items), and Protection/Retaliation (6 items). Each factor demonstrated adequate internal consistency. Construct and concurrent validity of the new measure were supported in comparison to other trauma-specific measures. The Maternal Support Questionnaire-Child Report demonstrated sound psychometric properties. Future research is needed to determine whether the Maternal Support Questionnaire-Child Report provides a more sensitive approximation of maternal support following disclosure of sexual abuse, relative to measures of global parent-child relations and to contextualize discrepancies between mother and child ratings of maternal support.
Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

Science.gov (United States)

Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

2017-06-15

Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.
Relatives' view on collaboration with nurses in acute wards: development and testing of a new measure

DEFF Research Database (Denmark)

Lindhardt, Tove; Nyberg, Per; Hallberg, Ingalill Rahm

2008-01-01

BACKGROUND: Collaboration between relatives and nurses in acute care settings is sparsely investigated, and that mostly from nurses' point of view. Feasible and valid instruments are needed for assessing collaboration, its prerequisites and outcome. OBJECTIVES: To develop and test an instrument to assess, from the relatives' perspective, collaboration between relatives of frail elderly patients and nurses in acute hospital wards, as well as prerequisites for, and outcome of, collaboration. DESIGN: Instrument development and psychometric testing. SETTING: Acute medical and geriatric wards ... item-to-total correlation and item-to-item correlation. Systematic internal dropout was investigated. RESULTS: A five-factor solution labelled "influence on decisions", "quality of contact with nurses", "trust and its prerequisites", "achieved information level" and "influence on discharge" showed...
Are great apes able to reason from multi-item samples to populations of food items?

Science.gov (United States)

Eckert, Johanna; Rakoczy, Hannes; Call, Josep

2017-10-01

Inductive learning from limited observations is a cognitive capacity of fundamental importance. In humans, it is underwritten by our intuitive statistics, the ability to draw systematic inferences from populations to randomly drawn samples and vice versa. According to recent research in cognitive development, human intuitive statistics develops early in infancy. Recent work in comparative psychology has produced first evidence for analogous cognitive capacities in great apes who flexibly drew inferences from populations to samples. In the present study, we investigated whether great apes (Pongo abelii, Pan troglodytes, Pan paniscus, Gorilla gorilla) also draw inductive inferences in the opposite direction, from samples to populations. In two experiments, apes saw an experimenter randomly drawing one multi-item sample from each of two populations of food items. The populations differed in their proportion of preferred to neutral items (24:6 vs. 6:24) but apes saw only the distribution of food items in the samples that reflected the distribution of the respective populations (e.g., 4:1 vs. 1:4). Based on this observation they were then allowed to choose between the two populations. Results show that apes seemed to make inferences from samples to populations and thus chose the population from which the more favorable (4:1) sample was drawn in Experiment 1. In this experiment, the more attractive sample not only contained proportionally but also absolutely more preferred food items than the less attractive sample. Experiment 2, however, revealed that when absolute and relative frequencies were disentangled, apes performed at chance level. Whether these limitations in apes' performance reflect true limits of cognitive competence or merely performance limitations due to accessory task demands is still an open question. © 2017 Wiley Periodicals, Inc.
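To make the sample-to-population inference in this record concrete, here is a minimal sketch of the likelihood comparison an ideal observer could make. It assumes the five-item sample is drawn without replacement (hypergeometric sampling) and uses the 24:6 and 6:24 populations and the 4:1 sample quoted in the abstract; the function name `sample_likelihood` is hypothetical and the calculation is not part of the study itself.

```python
from math import comb


def sample_likelihood(pref_in_pop, neutral_in_pop, pref_in_sample, neutral_in_sample):
    """Hypergeometric probability of a sample drawn without replacement."""
    pop_size = pref_in_pop + neutral_in_pop
    sample_size = pref_in_sample + neutral_in_sample
    ways = comb(pref_in_pop, pref_in_sample) * comb(neutral_in_pop, neutral_in_sample)
    return ways / comb(pop_size, sample_size)


# Probability of seeing a 4:1 sample under each population from the abstract.
p_rich = sample_likelihood(24, 6, 4, 1)   # population with 24 preferred, 6 neutral
p_poor = sample_likelihood(6, 24, 4, 1)   # population with 6 preferred, 24 neutral

# Likelihood ratio: how much more strongly the 4:1 sample points to the
# 24:6 population than to the 6:24 population.
likelihood_ratio = p_rich / p_poor
```

Under these assumptions the 4:1 sample is far more probable under the 24:6 population, which is the direction of inference the apes appeared to follow in Experiment 1.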
32 CFR 507.17 - Procurement and wear of heraldic items.

Science.gov (United States)

2010-07-01

... controlled heraldic items, when authorized by local procurement procedures, may forward a sample insignia to... 32 National Defense 3 2010-07-01 2010-07-01 true Procurement and wear of heraldic items. 507.17... AUTHORITIES AND PUBLIC RELATIONS MANUFACTURE AND SALE OF DECORATIONS, MEDALS, BADGES, INSIGNIA, COMMERCIAL USE...
Item Difficulty in the Evaluation of Computer-Based Instruction: An Example from Neuroanatomy

Science.gov (United States)

Chariker, Julia H.; Naaz, Farah; Pani, John R.

2012-01-01

This article reports large item effects in a study of computer-based learning of neuroanatomy. Outcome measures of the efficiency of learning, transfer of learning, and generalization of knowledge diverged by a wide margin across test items, with certain sets of items emerging as particularly difficult to master. In addition, the outcomes of…
The Effects of Emotional Intelligence (EI) Items Education on Job Related Stress in Physicians and Nurses who Work in Intensive Care Units

Directory of Open Access Journals (Sweden)

Kh Nooryan

2011-12-01

Background & Aim: Intensive care units (ICUs) are recognized as stressful environments. The objective of this study was to determine the effects of education on emotional intelligence (EI) items on job-related stress in physicians and nurses who work in intensive care units at hospitals of Yerevan, Armenia. Methods: An interventional study design was implemented with 106 registered hospital physicians and nurses, who were widely distributed throughout the hospitals. The case group was taught 15 EI items. For data collection, the 20-question Berger situational (overt) anxiety questionnaire, the 20-item personality (covert) anxiety questionnaire, and the Bar-On emotional intelligence questionnaire with 133 questions were used. Descriptive statistics, chi-square (χ²) tests, and t-tests were used to analyze data. Results: The average situational anxiety score of the case group was 46.59 before the intervention and decreased to 39.95 after training in the items of emotional intelligence. The average situational anxiety score of the control group was 44.32 before the intervention and increased to 44.76 after examination. There was a statistically significant difference between the case and control groups after education in the items of emotional intelligence (p=0.001). Conclusion: Results of the current study showed that physicians and nurses experience a high level of stress. The ability to deal effectively with emotional intelligence and emotional information in the workplace assists employees in coping with occupational stress and should be developed in stress management training.
The Self-Perception and Relationships Tool (S-PRT): A novel approach to the measurement of subjective health-related quality of life

Directory of Open Access Journals (Sweden)

Wishart Paul M

2004-07-01

Abstract Background The Self-Perception and Relationships Tool (S-PRT) is intended to be a clinically responsive and holistic assessment of patients' experience of illness and subjective Health Related Quality of Life (HRQL). Methods A diversity of patients were involved in two phases of this study. Patient samples included individuals involved with renal, cardiology, psychiatric, cancer, chronic pelvic pain, and sleep services. In Phase I, five patient focus groups generated 128 perceptual rating scales. These scales described important characteristics of illness-related experience within six life domains (i.e., Physical, Mental-Emotional, Interpersonal Receptiveness, Interpersonal Contribution, Transpersonal Receptiveness and Transpersonal Orientation). Item reduction was accomplished using Importance Q-sort and Importance Checklist methodologies with 150 patients across the participating services. In Phase II, a refined item pool (88 items) was administered along with measures of health status (SF-36) and spiritual beliefs (Spiritual Involvements and Beliefs Scale – SIBS) to 160 patients; of these, 136 patients returned complete response sets. Results Factor analysis of S-PRT results produced a surprisingly clean five-factor solution (eigenvalues > 2.0) explaining 73.5% of the pooled variance. Items with weaker or split loadings were removed, leaving 36 items to form the final S-PRT rating scales: Intrapersonal Well-being (physical, mental & emotional items), Interpersonal Receptivity, Interpersonal Contribution, Transpersonal Receptivity and Transpersonal Orientation (eigenvalues > 5.4) explaining 83.5% of the pooled variance. The internal consistency (Cronbach's Alpha) of these scales was very high (0.82–0.97). Good convergent correlations (0.40 to 0.67) were observed between the S-PRT scales and the Mental Health scales of the SF-36. Correlations between the S-PRT Intrapersonal Well-being scale and three of SF-36 Physical Health scales were moderate
Validity and Reliability of the 8-Item Work Limitations Questionnaire.

Science.gov (United States)

Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

2017-12-01

Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
41 CFR 101-28.306-6 - Sensitive items.

Science.gov (United States)

2010-07-01

... Regulations System FEDERAL PROPERTY MANAGEMENT REGULATIONS SUPPLY AND PROCUREMENT 28-STORAGE AND DISTRIBUTION... accountable item of personal property. Each customer activity shall take all appropriate measures necessary to... Government use. ...
An Evaluation of the Brief Symptom Inventory-18 Using Item Response Theory: Which Items Are Most Strongly Related to Psychological Distress?

Science.gov (United States)

Meijer, Rob R.; de Vries, Rivka M.; van Bruggen, Vincent

2011-01-01

The psychometric structure of the Brief Symptom Inventory-18 (BSI-18; Derogatis, 2001) was investigated using Mokken scaling and parametric item response theory. Data of 487 outpatients, 266 students, and 207 prisoners were analyzed. Results of the Mokken analysis indicated that the BSI-18 formed a strong Mokken scale for outpatients and…
Three Modeling Applications to Promote Automatic Item Generation for Examinations in Dentistry.

Science.gov (United States)

Lai, Hollis; Gierl, Mark J; Byrne, B Ellen; Spielman, Andrew I; Waldschmidt, David M

2016-03-01

Test items created for dentistry examinations are often individually written by content experts. This approach to item development is expensive because it requires the time and effort of many content experts but yields relatively few items. The aim of this study was to describe and illustrate how items can be generated using a systematic approach. Automatic item generation (AIG) is an alternative method that allows a small number of content experts to produce large numbers of items by integrating their domain expertise with computer technology. This article describes and illustrates how three modeling approaches to item content (item cloning, cognitive modeling, and image-anchored modeling) can be used to generate large numbers of multiple-choice test items for examinations in dentistry. Test items can be generated by combining the expertise of two content specialists with technology supported by AIG. A total of 5,467 new items were created during this study. From substitution of item content, to modeling appropriate responses based upon a cognitive model of correct responses, to generating items linked to specific graphical findings, AIG has the potential for meeting increasing demands for test items. Further, the methods described in this study can be generalized and applied to many other item types. Future research applications for AIG in dental education are discussed.
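The simplest of the three approaches named in this record, item cloning by substitution of item content, can be sketched roughly as follows. The stem template, slot names, and slot values are invented for illustration and are not taken from the study; `clone_items` is a hypothetical helper, not the authors' software.

```python
from itertools import product


def clone_items(stem_template, slots):
    """Generate item stems by substituting every combination of slot values."""
    names = list(slots)
    value_lists = [slots[name] for name in names]
    return [
        stem_template.format(**dict(zip(names, combo)))
        for combo in product(*value_lists)
    ]


# Hypothetical parent item and substitutable content (illustrative only).
stem = ("A {age}-year-old patient presents with {symptom}. "
        "What is the most likely diagnosis?")
slots = {
    "age": [25, 40, 65],
    "symptom": ["gingival bleeding", "tooth sensitivity"],
}

generated = clone_items(stem, slots)   # 3 ages x 2 symptoms = 6 stems
```

A real generator would also need to vary the answer options and key consistently with each substitution; this sketch only shows how a small template multiplies into many stems.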
Identifying Country-Specific Cultures of Physics Education: A differential item functioning approach

Science.gov (United States)

Mesic, Vanes

2012-11-01

In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide specific diagnostic information which is necessary for systematic comparisons and improvements of educational systems. Useful information could be obtained by exploring the differences in national profiles of student achievement between low-achieving and high-achieving countries. In this study, we aimed to identify the relative weaknesses and strengths of eighth graders' physics achievement in Bosnia and Herzegovina in comparison to the achievement of their peers from Slovenia. For this purpose, we ran a secondary analysis of Trends in International Mathematics and Science Study (TIMSS) 2007 data. The student sample consisted of 4,220 students from Bosnia and Herzegovina and 4,043 students from Slovenia. After analysing the cognitive demands of TIMSS 2007 physics items, the correspondent differential item functioning (DIF)/differential group functioning contrasts were estimated. Approximately 40% of items exhibited large DIF contrasts, indicating significant differences between cultures of physics education in Bosnia and Herzegovina and Slovenia. The relative strength of students from Bosnia and Herzegovina showed to be mainly associated with the topic area 'Electricity and magnetism'. Classes of items which required the knowledge of experimental method, counterintuitive thinking, proportional reasoning and/or the use of complex knowledge structures proved to be differentially easier for students from Slovenia. In the light of the presented results, the common practice of ranking countries with respect to universally established cognitive categories seems to be potentially misleading.
Relative validity of a tool to measure food acculturation in children of Mexican descent.

Science.gov (United States)

Vera-Becerra, Luz Elvia; Lopez, Martha L; Kaiser, Lucia L

2016-02-01

The purpose of this study was to examine relative validity of a food frequency questionnaire (FFQ) to measure food acculturation in young Mexican-origin children. In 2006, Spanish-speaking staff interviewed mothers in a community-based sample of households from Ventura, California (US) (n = 95) and Guanajuato, Mexico (MX) (n = 200). Data included two 24-h dietary recalls (24-DR); a 30-item FFQ; and anthropometry of the children. To measure construct, convergent, and discriminant validity, data analyses included factor analysis, Spearman correlations, and t-tests, respectively. Factor analysis revealed two constructs: 1) a US food pattern including hamburgers, pizza, hot dogs, fried chicken, juice, cereal, pastries, lower fat milk, quesadillas, and American cheese and 2) a MX food pattern including tortillas, fried beans, rice/noodles, whole milk, and pan dulce (sweet bread). Out of 22 food items that could be compared across the FFQ and mean 24-DRs, 17 were significantly, though weakly, correlated (highest r = 0.62, for whole milk). The mean US food pattern score was significantly higher, and the MX food pattern score, lower in US children than in MX children (p < 0.0001). After adjusting for child's age and gender; mother's education; and household size, the US food pattern score was positively related to body mass index (BMI) z-scores (beta coefficient: +0.29, p = 0.004), whereas the MX food pattern score was negatively related to BMI z-scores (beta coefficient: -0.28, p = 0.002). This tool may be useful to evaluate nutrition education interventions to prevent childhood obesity on both sides of the border. Copyright © 2015 Elsevier Ltd. All rights reserved.
24 CFR 200.926a - Residential building code comparison items.

Science.gov (United States)

2010-04-01

... 24 Housing and Urban Development 2 2010-04-01 2010-04-01 false Residential building code comparison items. 200.926a Section 200.926a Housing and Urban Development Regulations Relating to Housing and... § 200.926a Residential building code comparison items. HUD will review each local and State code...
Discussion on monitoring items of radionuclides in influents from nuclear power plants

International Nuclear Information System (INIS)

Zhang Yanxia; Li Jin; Liu Jiacheng; Han Shanbiao; Yu Zhengwei

2014-01-01

This paper compares and analyzes the radionuclide monitoring items for effluents from nuclear power plants from three perspectives: the general international requirements for atomic energy, the routine radionuclide measurement items at China's nuclear power plants, and experimental research results on low-level radionuclides in effluents. It then summarizes the necessary and recommended items for radionuclide monitoring of nuclear power plant effluents, which can serve as a reference for effluent radioactivity monitoring at nuclear power plants and for supervision by regulatory departments. (authors)
Selecting Items for Criterion-Referenced Tests.

Science.gov (United States)

Mellenbergh, Gideon J.; van der Linden, Wim J.

1982-01-01

Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)
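The first of the three methods named in this record, classical item difficulty and item-test correlation, can be illustrated with a minimal sketch. It assumes dichotomous (0/1) responses, takes difficulty as the proportion correct, and uses the uncorrected Pearson correlation between each item and the total score; the helper names and toy data are hypothetical, and the latent trait and decision-theoretic methods are not covered.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_statistics(responses):
    """Classical difficulty (proportion correct) and item-test correlation."""
    totals = [sum(row) for row in responses]
    stats = []
    for index, column in enumerate(zip(*responses)):
        stats.append({
            "item": index,
            "difficulty": mean(column),              # proportion answering correctly
            "item_test_r": pearson(column, totals),  # uncorrected item-total correlation
        })
    return stats


# Hypothetical toy data: 4 examinees (rows) by 3 dichotomous items (columns).
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
toy_stats = item_statistics(toy_responses)
```

An item selection rule could then keep items whose difficulty and item-test correlation fall in a chosen range; the thresholds would depend on the purpose of the test.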
Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

Science.gov (United States)

Cher Wong, Cheow

2015-01-01

Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

Measurement of self-evaluative motives: a shopping scenario.

Science.gov (United States)

Wajda, Theresa A; Kolbe, Richard; Hu, Michael Y; Cui, Annie Peng

2008-08-01

To develop measures of consumers' self-evaluative motives of Self-verification, Self-enhancement, and Self-improvement within the context of a mall shopping environment, an initial set of 49 items was generated by conducting three focus-group sessions. These items were subsequently converted into shopping-dependent motive statements. 250 undergraduate college students responded on a 7-point scale to each statement as these related to the acquisition of recent personal shopping goods. An exploratory factor analysis yielded five factors, accounting for 57.7% of the variance, three of which corresponded to the Self-verification motive (five items), Self-enhancement motive (three items), and Self-improvement motive (six items). These 14 items, along with 9 reconstructed items, yielded 23 items retained and subjected to additional testing. In a final round of data collection, 169 college students provided data for exploratory factor analysis. 11 items were used in confirmatory factor analysis. Analysis indicated that the 11-item scale adequately captured measures of the three self-evaluative motives. However, further data reduction produced a 9-item scale with marked improvement in statistical fit over the 11-item scale.
MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

Science.gov (United States)

Wang, Wen-Chung; Shih, Ching-Lin

2010-01-01

Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…
A measure of early physical functioning (EPF) post-stroke.

Science.gov (United States)

Finch, Lois E; Higgins, Johanne; Wood-Dauphinee, Sharon; Mayo, Nancy E

2008-07-01

To develop a comprehensive measure of Early Physical Functioning (EPF) post-stroke quantified through Rasch analysis and conceptualized using the International Classification of Functioning Disability and Health (ICF). An observational cohort study. A cohort of 262 subjects (mean age 71.6 (standard deviation 12.5) years) hospitalized post-acute stroke. Functional assessments were made within 3 days of stroke with items from valid and reliable indices commonly utilized to evaluate stroke survivors. Information on important variables was also collected. Principal component and Rasch analysis confirmed the factor structure and dimensionality of the measure. Rasch analysis combined items across ICF components to develop the measure. Items were deleted iteratively; those retained fit the model and were related to the construct. Reliability and validity were assessed. A 38-item unidimensional measure of the EPF met all Rasch model requirements. The item difficulty matched the person ability (mean person measure: -0.31; standard error 0.37 logits), and reliability of the person-item-hierarchy was excellent at 0.97. Initial validity was adequate. The 38-item EPF measure was developed. It expands the range of assessment post acute stroke; it covers a broad spectrum of difficulty with good initial psychometric properties that, once revalidated, can assist in planning and evaluating early interventions.
Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

Science.gov (United States)

Scheuneman, Janice Dowd; Gerritz, Kalle

1990-01-01

Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)
A Development of Group Decision Support System for Strategic Item Classification using Analytic Hierarchy Process

International Nuclear Information System (INIS)

Yoon, Sung Ho; Tae, Jae Woong; Yang, Seung Hyo; Shin, Dong Hoon

2016-01-01

Korea has carried out export controls on nuclear items that reflect the Nuclear Suppliers Group (NSG) guidelines (Notice on Trade of Strategic Item of Foreign Trade Act) since joining the NSG in 1995. Nuclear export control starts with classifications that determine whether export items are relevant to nuclear proliferation or not according to NSG guidelines. However, because the guidelines define nuclear items qualitatively, reaching a consensus on classification takes considerable time and effort. The aim of this study is to provide an analysis of an experts' group decision support system (GDSS) based on an analytic hierarchy process (AHP) for the classification of strategic items. The results of this study demonstrated that a GDSS based on an AHP proved effective, systematically providing relative weights among the planning variables and objectives. By using an AHP we can quantify the subjective judgements of reviewers. An order of priority is derived from a numerical value. The verbal and fuzzy measurement of an AHP enables us to reach a consensus among reviewers in a GDSS. An AHP sets common weight factors, which give the priority of each attribute and represent the views of the entire group. This brings consistency to decision-making, which is important for classification.
A Development of Group Decision Support System for Strategic Item Classification using Analytic Hierarchy Process

Energy Technology Data Exchange (ETDEWEB)

Yoon, Sung Ho; Tae, Jae Woong; Yang, Seung Hyo; Shin, Dong Hoon [Korea Institute of Nuclear Nonproliferation and Control, Daejeon (Korea, Republic of)

2016-05-15

Korea has carried out export controls on nuclear items that reflect the Nuclear Suppliers Group (NSG) guidelines (Notice on Trade of Strategic Item of Foreign Trade Act) since joining the NSG in 1995. Nuclear export control starts with classifications that determine whether export items are relevant to nuclear proliferation or not according to NSG guidelines. However, because the guidelines define nuclear items qualitatively, reaching a consensus on classification takes considerable time and effort. The aim of this study is to provide an analysis of an experts' group decision support system (GDSS) based on an analytic hierarchy process (AHP) for the classification of strategic items. The results of this study demonstrated that a GDSS based on an AHP proved effective, systematically providing relative weights among the planning variables and objectives. By using an AHP we can quantify the subjective judgements of reviewers. An order of priority is derived from a numerical value. The verbal and fuzzy measurement of an AHP enables us to reach a consensus among reviewers in a GDSS. An AHP sets common weight factors, which give the priority of each attribute and represent the views of the entire group. This brings consistency to decision-making, which is important for classification.
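The AHP step described in the abstract above, turning reviewers' pairwise judgements into relative weights, can be illustrated with a minimal sketch. This is not the authors' GDSS: the function names and the example comparison matrix are hypothetical, the weights are approximated by power iteration on the pairwise comparison matrix, and the consistency check uses Saaty's commonly tabulated random index values.

```python
# Saaty's random consistency index for matrix sizes 3..5 (commonly tabulated values).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration; the result is normalized to sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights

def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1).
    Matrices smaller than 3x3 are always consistent, so CR is 0 for them."""
    n = len(matrix)
    if n < 3:
        return 0.0
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# Hypothetical judgements for three classification criteria:
# criterion 0 is judged 2x as important as criterion 1 and 6x as important as criterion 2.
comparisons = [
    [1.0, 2.0, 6.0],
    [0.5, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(comparisons)
ratio = consistency_ratio(comparisons, weights)
```

Because the example matrix is perfectly consistent, the weights come out as 0.6, 0.3, and 0.1 and the consistency ratio is zero. With real reviewer judgements the ratio is usually positive, and a value below about 0.1 is the conventional threshold for acceptable consistency.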
Item Response Data Analysis Using Stata Item Response Theory Package

Science.gov (United States)

Yang, Ji Seung; Zheng, Xiaying

2018-01-01

The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…
Working memory in schizophrenia: behavioral and neural evidence for reduced susceptibility to item-specific proactive interference.

Science.gov (United States)

Kaller, Christoph P; Loosli, Sandra V; Rahm, Benjamin; Gössel, Astrid; Schieting, Stephan; Hornig, Tobias; Hennig, Jürgen; Tebartz van Elst, Ludger; Weiller, Cornelius; Katzev, Michael

2014-09-15

Susceptibility to item-specific proactive interference (PI) contributes to interindividual differences in working memory (WM) capacity and complex cognition relying on WM. Although WM deficits are a well-recognized impairment in schizophrenia, the underlying pathophysiological effects on specific WM control functions, such as the ability to resist item-specific PI, remain unknown. Moreover, opposing hypotheses on increased versus reduced PI susceptibility in schizophrenia are both justifiable by the extant literature. To provide first insights into the behavioral and neural correlates of PI-related WM control in schizophrenia, a functional magnetic resonance imaging experiment was conducted in a sample of 20 patients and 20 well-matched control subjects. Demands on item-specific PI were experimentally manipulated in a recent-probes task (three runs, 64 trials each) requiring subjects to encode and maintain a set of four target items per trial. Compared with healthy control subjects, schizophrenia patients showed a significantly reduced PI susceptibility in both accuracy and latency measures. Notably, reduced PI susceptibility in schizophrenia was not associated with overall WM impairments and thus constituted an independent phenomenon. In addition, PI-related activations in inferior frontal gyrus and anterior insula, typically assumed to support PI resistance, were reduced in schizophrenia, thus ruling out increased neural efforts as a potential cause of the patients' reduced PI susceptibility. The present study provides first evidence for a diminished vulnerability of schizophrenia patients to item-specific PI, which is presumably a consequence of the patients' more efficient clearing of previously relevant WM traces and the accordingly reduced likelihood for item-specific PI to occur. Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Combining item and bulk material loss-detection uncertainties

International Nuclear Information System (INIS)

Eggers, R.F.

1982-01-01

Loss detection requirements, such as five formula kilograms with 99% probability of detection, which apply to the sum of losses from material in both item and bulk form, constitute a special problem for the nuclear material statistician. Requirements of this type are included in the Material Control and Accounting Reform Amendments described in the Advance Notice of Proposed Rule Making (Federal Register, 46(175):45144-46151). Attribute test sampling of items is the method used to detect gross defects in the inventory of items in a given control unit. Attribute sampling plans are designed to detect a loss of a specified goal quantity of material with a given probability. In contrast to the methods and statistical models used for item loss detection, bulk material loss detection requires all the material entering and leaving a control unit to be measured and the calculation of a loss estimator that will be tested against an appropriate alarm threshold. The alarm threshold is determined from an estimate of the error inherent in the components of the loss estimator. In this paper a simple graphical method of evaluating the combined capabilities of bulk material loss detection methods and item attribute testing procedures will be described. Quantitative results will be given for several cases, indicating how a decrease in the precision of the item loss detection method tends to force an increase in the precision of the bulk loss detection procedure in order to meet the overall detection requirement. 4 figures
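The attribute sampling idea in the abstract above can be sketched as a sample-size calculation. This is a minimal illustration, not the paper's method: it assumes the goal quantity has already been converted into a number of defective items, that items are drawn at random without replacement, and that any sampled defect is detected with certainty. The function names are hypothetical.

```python
from math import comb

def detection_probability(population, defects, sample_size):
    """Probability that a random sample drawn without replacement
    contains at least one defective item (hypergeometric model)."""
    # comb(a, b) is 0 when b > a, so the miss probability drops to 0
    # once the sample is too large to avoid every defective item.
    miss = comb(population - defects, sample_size) / comb(population, sample_size)
    return 1.0 - miss

def attribute_sample_size(population, defects, target_probability):
    """Smallest sample size whose detection probability meets the target."""
    for sample_size in range(1, population + 1):
        if detection_probability(population, defects, sample_size) >= target_probability:
            return sample_size
    return population
```

For example, with 10 items of which 1 is defective, the detection probability for a sample of n items is n/10, so `attribute_sample_size(10, 1, 0.99)` returns 10: only a full inventory check reaches 99%.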
The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

Science.gov (United States)

Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

2017-07-01

The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good, as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
Effects of Reducing the Cognitive Load of Mathematics Test Items on Student Performance

Directory of Open Access Journals (Sweden)

Susan C. Gillmor

2015-01-01

This study explores a new item-writing framework for improving the validity of math assessment items. The authors transfer insights from Cognitive Load Theory (CLT), traditionally used in instructional design, to educational measurement. Fifteen multiple-choice math assessment items were modified using research-based strategies for reducing extraneous cognitive load. An experimental design with 222 middle-school students tested the effects of the reduced cognitive load items on student performance and anxiety. Significant findings confirm the main research hypothesis that reducing the cognitive load of math assessment items improves student performance. Three load-reducing item modifications are identified as particularly effective for reducing item difficulty: signalling important information, aesthetic item organization, and removing extraneous content. Load reduction was not shown to impact student anxiety. Implications for classroom assessment and future research are discussed.
Item Banking with Embedded Standards

Science.gov (United States)

MacCann, Robert G.; Stanley, Gordon

2009-01-01

An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…
Teachers' Checklist on Reading-Related Behavioral Characteristics of Chinese Primary Students: A Rasch Measurement Model Analysis

Science.gov (United States)

Chan, David W.; Ho, Connie Suk-han; Chung, Kevin K. H.; Tsang, Suk-man; Lee, Suk-han

2010-01-01

Data of item responses to the Hong Kong Specific Learning Difficulties Behaviour Checklist from 673 Chinese primary grade students were analyzed using the dichotomous Rasch measurement model. Rasch scaling suggested that the data fit the model adequately with a latent dimension of global dyslexic dysfunctioning. Estimates of item attributes and…
Measuring single constructs by single items: Constructing an even shorter version of the “Short Five” personality inventory

Science.gov (United States)

Konstabel, Kenn; Lönnqvist, Jan-Erik; Leikas, Sointu; García Velázquez, Regina; Qin, Hiaying; Verkasalo, Markku; Walkowitz, Gari

2017-01-01

The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item “Short Five” (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the full version
Development of abbreviated eight-item form of the Penn Verbal Reasoning Test.

Science.gov (United States)

Bilker, Warren B; Wierzbicki, Michael R; Brensinger, Colleen M; Gur, Raquel E; Gur, Ruben C

2014-12-01

The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance. © The Author(s) 2014.
Development of Abbreviated Eight-Item Form of the Penn Verbal Reasoning Test

Science.gov (United States)

Bilker, Warren B.; Wierzbicki, Michael R.; Brensinger, Colleen M.; Gur, Raquel E.; Gur, Ruben C.

2014-01-01

The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance. PMID:24577310
Assessing cross-cultural item bias in questionnaires : Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

NARCIS (Netherlands)

Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

2001-01-01

A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one,
Mental health of a police force: estimating prevalence of work-related depression in Australia without a direct national measure.

Science.gov (United States)

Lawson, Katrina J; Rodwell, John J; Noblet, Andrew J

2012-06-01

The risk of work-related depression in Australia was estimated based on a survey of 631 police officers. Psychological wellbeing and psychological distress items were mapped onto a measure of depression to identify optimal cutoff points. Based on a sample of police officers, Australian workers, in general, are at risk of depression when general psychological wellbeing is considerably compromised. Large-scale estimation of work-related depression in the broader population of employed persons in Australia is reasonable. The relatively high prevalence of depression among police officers emphasizes the need to examine prevalence rates of depression among Australian employees.
Recommended core items to assess e-cigarette use in population-based surveys.

Science.gov (United States)

Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

2018-05-01

A consistent approach using standardised items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behaviour, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid without further item development. Reliable and valid items will strengthen the emerging science and inform knowledge synthesis for policy-making. Building on informal discussions at a series of international meetings of 65 experts from 15 countries, the authors provide recommendations for assessing e-cigarette use behaviour, relative perceived harm, device type, presence of nicotine, flavours and reasons for use. We recommend items assessing eight core constructs: e-cigarette ever use, frequency of use and former daily use; relative perceived harm; device type; primary flavour preference; presence of nicotine; and primary reason for use. These items should be standardised or minimally adapted for the policy context and target population. Researchers should be prepared to update items as e-cigarette device characteristics change. A minimum set of e-cigarette items is proposed to encourage consensus around items to allow for cross-survey and cross-jurisdictional comparisons of e-cigarette use behaviour. These proposed items are a starting point. We recognise room for continued improvement, and welcome input from e-cigarette users and scientific colleagues. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Brief Sensation Seeking Scale: Latent structure of 8-item and 4-item versions in Peruvian adolescents.

Science.gov (United States)

Merino-Soto, Cesar; Salas Blas, Edwin

2018-01-01

This research sought to validate two brief sensation seeking scales with Peruvian adolescents: the eight-item scale (BSSS8; Hoyle, Stephenson, Palmgreen, Lorch, & Donohew, 2002) and the four-item scale (BSSS4; Stephenson, Hoyle, Slater, & Palmgreen, 2003). Questionnaires were administered to 618 voluntary participants, with an average age of 13.6 years, from different high school grades in state and private schools in a district in the south of Lima. The internal structure of both short versions was analyzed using three models: a) unidimensional (M1), b) oblique or related dimensions (M2), and c) the bifactor model (M3). Results show that both instruments have a single dimension which best represents the variability of the items; a fact that can be explained both by the complexity of the concept and by the small number of items representing each factor, which is more noticeable in the BSSS4. Reliability is within levels found by previous studies: alpha coefficient: .745 in BSSS8 and .643 in BSSS4; omega coefficient: .747 in BSSS8 and .651 in BSSS4. These are considered suitable for the type of instruments studied. Based on the correlation between the two instruments, it was found that there are satisfactory levels of equivalence between the BSSS8 and BSSS4. However, it is recommended that the BSSS4 be used mainly for research and for the purpose of describing populations.
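The alpha coefficient reported in the abstract above can be computed from raw item scores as follows. This is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores); the function name and the response data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) for items and total scores."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses from five people to a four-item scale scored 1 to 5.
scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 5],
]
alpha = cronbach_alpha(scores)
```

Alpha equals 1 when every item gives the same score for each respondent and falls as the items become less correlated, which is one reason shorter scales such as a four-item form tend to show lower values than their longer counterparts.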

Development and evaluation of an instrument to measure health-related quality of life in Cuban breast cancer patients receiving radiotherapy.

Science.gov (United States)

Lugo, Josefina; Nápoles, Misleidy; Pérez, Inés; Ordaz, Niurka; Luzardo, Mario; Fernández, Leticia

2014-01-01

INTRODUCTION Although modern technology has extended the survival of breast cancer patients, treatment's adverse effects impact their health-related quality of life. Currently, no instrument exists capable of identifying the range of problems affecting breast cancer patients receiving radiotherapy in Cuba's socioeconomic and cultural context. OBJECTIVES Construct and validate an instrument to measure the effects of breast cancer and radiotherapy on health-related quality of life in Cuban patients. METHODS The study was conducted at the Oncology and Radiobiology Institute, Havana, Cuba, from January 2010 through December 2011. Inclusion criteria were: adult female, histological diagnosis of breast cancer, treated with ambulatory radiotherapy, and written informed consent; patients unable to communicate orally or in writing, or who had neurologic or psychiatric conditions were excluded. Development phase: focus groups guided by a list of questions were carried out with 50 women. The patients reported 61 problems affecting their health-related quality-of-life. A nominal group (six oncologists and two nurses) identified the same problems. A syntactic analysis of the information was performed to create items for study and measurement scales. Content validity was determined by a nominal group of seven experts using professional judgment. Another 20 patients were selected to evaluate face validity. Validation phase: the instrument was applied to 230 patients at three different points: before radiotherapy, at the end of radiotherapy and four weeks after radiotherapy was concluded. Reliability, construct validity, discriminant validity, predictive validity, interpretability and response burden were evaluated. RESULTS The final instrument developed had 33 items distributed in 4 domains: physical functioning, psychological functioning, social and family relationships, and physical and emotional adverse effects of disease and treatment. There were two discrete items: perceived
Local context effects during emotional item directed forgetting in younger and older adults.

Science.gov (United States)

Gallant, Sara N; Dyson, Benjamin J; Yang, Lixia

2017-09-01

This paper explored the differential sensitivity young and older adults exhibit to the local context of items entering memory. We examined trial-to-trial performance during an item directed forgetting task for positive, negative, and neutral (or baseline) words each cued as either to-be-remembered (TBR) or to-be-forgotten (TBF). This allowed us to focus on how variations in emotional valence (independent of arousal) and instruction (TBR vs. TBF) of the previous item (trial n-1) impacted memory for the current item (trial n) during encoding. Different from research showing impairing effects of emotional arousal, both age groups showed a memorial boost for stimuli when preceded by items high in positive or negative valence relative to those preceded by neutral items. This advantage was particularly prominent for neutral trial n items that followed emotional items suggesting that, regardless of age, neutral memories may be strengthened by a local context that is high in valence. A trending age difference also emerged with older adults showing greater sensitivity when encoding instructions changed between trial n-1 and n. Results are discussed in light of age-related theories of cognitive and emotional processing, highlighting the need to consider the dynamic, moment-to-moment fluctuations of these systems.
Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

Science.gov (United States)

Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

2018-04-01

The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory factor analysis was completed to examine dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items on the ESS, and differential item functioning (DIF) analyses were used to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and to a lesser extent Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses between German and English were very similar, indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.
Assessment of Sexual Desire for Clinical Trials of Women With Hypoactive Sexual Desire Disorder: Measures, Desire-Related Behavior, and Assessment of Clinical Significance.

Science.gov (United States)

Pyke, Robert E; Clayton, Anita H

2018-01-19

The Female Sexual Function Index-desire subscale is the standard measure for clinical trials of hypoactive sexual desire disorder (HSDD), but lacks items assessing sexually related behaviors and attitudes toward partner. Counting satisfying sexual events is criticized, but sexual behavior remains important. Mean treatment differences cannot define clinical significance; responder and remitter analyses help. We reviewed measures on sexual desire and sexual behavior relevant to HSDD, and how to assess clinical significance. We conducted a literature review of measures of sexual desire comparing expert-proposed criteria for dysfunctional desire, expert-developed scales, and scales from patient input. Commonly recognized symptoms of HSDD were identified. Results of HSDD trials and scale validation studies were evaluated to extract responder and remitter values. The utility of distribution-based measures of responders and remitters was assessed. Symptom relevance was evaluated as the proportion of symptom sets that included the item; responder and remitter cut points were determined by distribution-based methods. Twelve validated rating scales (5 scales primarily derived from expert recommendations and 7 scales initially from patient input) and 5 sets of diagnostic criteria for conditions like HSDD were compared. Content varied highly between scales despite compliance with U.S. Food and Drug Administration recommendations for patient-reported outcomes. This disunity favors an expert-recommended scale such as the Elements of Desire Questionnaire with each of the common items, plus a measure of frequency of sexual activity, e.g., an item in the Patient Reported Outcomes Measurement Information System. Registrational drug trials, but not psychological treatment trials, usually give responder/remitter analyses, using dichotomized global impressions or anchor-based definitions. Distribution-based methods are more uniformly applicable to define responder and remitter status. The
Assessing nicotine dependence in adolescent E-cigarette users: The 4-item Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for electronic cigarettes.

Science.gov (United States)

Morean, Meghan E; Krishnan-Sarin, Suchitra; S O'Malley, Stephanie

2018-04-26

Adolescent e-cigarette use (i.e., "vaping") likely confers risk for developing nicotine dependence. However, there have been no studies assessing e-cigarette nicotine dependence in youth. We evaluated the psychometric properties of the 4-item Patient-Reported Outcomes Measurement Information System Nicotine Dependence Item Bank for E-cigarettes (PROMIS-E) for assessing youth e-cigarette nicotine dependence and examined risk factors for experiencing stronger dependence symptoms. In 2017, 520 adolescent past-month e-cigarette users completed the PROMIS-E during a school-based survey (50.5% female, 84.8% White, 16.22[1.19] years old). Adolescents also reported on sex, grade, race, age at e-cigarette use onset, vaping frequency, nicotine e-liquid use, and past-month cigarette smoking. Analyses included conducting confirmatory factor analysis and examining the internal consistency of the PROMIS-E. Bivariate correlations and independent-samples t-tests were used to examine unadjusted relationships between e-cigarette nicotine dependence and the proposed risk factors. Regression models were run in which all potential risk factors were entered as simultaneous predictors of PROMIS-E scores. The single-factor structure of the PROMIS-E was confirmed and evidenced good internal consistency. Across models, larger PROMIS-E scores were associated with being in a higher grade, initiating e-cigarette use at an earlier age, vaping more frequently, using nicotine e-liquid (and higher nicotine concentrations), and smoking cigarettes. Adolescent e-cigarette users reported experiencing nicotine dependence, which was assessed using the psychometrically sound PROMIS-E. Experiencing stronger nicotine dependence symptoms was associated with characteristics that previously have been shown to confer risk for frequent vaping and tobacco cigarette dependence. Copyright © 2018 Elsevier B.V. All rights reserved.
Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment

Directory of Open Access Journals (Sweden)

KLAUS D. KUBINGER

2007-12-01

Multiple choice response formats are problematical as an item is often scored as solved simply because the test-taker is a lucky guesser. Instead of applying pertinent IRT models which take guessing effects into account, a pragmatic approach of re-conceptualizing multiple choice response formats to reduce the chance of lucky guessing is considered. This paper compares the free response format with two different multiple choice formats. A common multiple choice format with a single correct response option and five distractors (“1 of 6”) is used, as well as a multiple choice format with five response options, of which any number of the five is correct and the item is only scored as mastered if all the correct response options and none of the wrong ones are marked (“x of 5”). An experiment was designed, using pairs of items with exactly the same content but different response formats. 173 test-takers were randomly assigned to two test booklets of 150 items altogether. Rasch model analyses adduced a fitting item pool, after the deletion of 39 items. The resulting item difficulty parameters were used for the comparison of the different formats. The multiple choice format “1 of 6” differs significantly from “x of 5”, with a relative effect of 1.63, while the multiple choice format “x of 5” does not significantly differ from the free response format. Therefore, the lower degree of difficulty of items with the “1 of 6” multiple choice format is an indicator of relevant guessing effects. In contrast the “x of 5” multiple choice format can be seen as an appropriate substitute for free response format.
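As a rough illustration of why the two formats differ in exposure to lucky guessing, the sketch below computes the chance that a blind guess is scored as correct under each format. It assumes a uniformly random guess, and for "x of 5" it assumes that every marking pattern over the five options is equally likely; neither assumption comes from the abstract, and the function names are hypothetical.

```python
from fractions import Fraction


def guess_prob_single_choice(n_options: int) -> Fraction:
    """Chance that a blind guess picks the single correct option."""
    return Fraction(1, n_options)


def guess_prob_multiple_select(n_options: int, allow_empty: bool = True) -> Fraction:
    """Chance that a random marking pattern matches the answer key exactly.

    Each option is either marked or not, giving 2**n_options patterns.
    Set allow_empty=False if at least one option must be marked
    (an assumption; the abstract does not say which rule applied).
    """
    patterns = 2 ** n_options
    if not allow_empty:
        patterns -= 1
    return Fraction(1, patterns)


p_1of6 = guess_prob_single_choice(6)      # "1 of 6" format
p_xof5 = guess_prob_multiple_select(5)    # "x of 5" format
print(p_1of6, p_xof5)
```

Under these assumptions a blind guess succeeds several times more often in the "1 of 6" format, which is in line with the direction of the difficulty difference the abstract reports.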
Interpreting Mini-Mental State Examination Performance in Highly Proficient Bilingual Spanish-English and Asian Indian-English Speakers: Demographic Adjustments, Item Analyses, and Supplemental Measures.

Science.gov (United States)

Milman, Lisa H; Faroqi-Shah, Yasmeen; Corcoran, Chris D; Damele, Deanna M

2018-04-17

Performance on the Mini-Mental State Examination (MMSE), among the most widely used global screens of adult cognitive status, is affected by demographic variables including age, education, and ethnicity. This study extends prior research by examining the specific effects of bilingualism on MMSE performance. Sixty independent community-dwelling monolingual and bilingual adults were recruited from eastern and western regions of the United States in this cross-sectional group study. Independent sample t tests were used to compare 2 bilingual groups (Spanish-English and Asian Indian-English) with matched monolingual speakers on the MMSE, demographically adjusted MMSE scores, MMSE item scores, and a nonverbal cognitive measure. Regression analyses were also performed to determine whether language proficiency predicted MMSE performance in both groups of bilingual speakers. Group differences were evident on the MMSE, on demographically adjusted MMSE scores, and on a small subset of individual MMSE items. Scores on a standardized screen of language proficiency predicted a significant proportion of the variance in the MMSE scores of both bilingual groups. Bilingual speakers demonstrated distinct performance profiles on the MMSE. Results suggest that supplementing the MMSE with a language screen, administering a nonverbal measure, and/or evaluating item-based patterns of performance may assist with test interpretation for this population.
Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

Science.gov (United States)

Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

2012-09-01

The complexity of health information often exceeds patients' skills to understand and use it. To develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were estimated to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: PLiteracy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.
Procurement Engineering Process for Commercial Grade Item Dedication

International Nuclear Information System (INIS)

Park, Jong-Hyuck; Park, Jong-Eun; Kwak, Tack-Hun; Yoo, Keun-Bae; Lee, Sang-Guk; Hong, Sung-Yull

2006-01-01

The Procurement Engineering Process for commercial grade item dedication plays an increasingly important role in the operation management of Korean nuclear power plants. The purpose of the Procurement Engineering Process is the provision and assurance of a high quality and quantity of spare, replacement, retrofit and new parts and equipment while maximizing plant availability, minimizing downtime due to parts unavailability and providing reasonable overall program and inventory cost. In this paper, we review the overall requirements, responsibilities and the process for demonstrating with reasonable assurance that an item procured for potential nuclear safety related services or other essential plant service is adequate for its application. This paper does not cover the details of technical evaluation, selecting critical characteristics, selecting acceptance methods, performing failure modes and effects analysis, performing source surveillance, performing quality surveys, performing special tests and inspections, and the other aspects of effective Procurement Engineering and Commercial Grade Item Dedication. The main contribution of this paper is to provide an overview of Procurement Engineering Process for commercial grade item
Infant Movement Motivation Questionnaire: development of a measure evaluating infant characteristics relating to motor development in the first year of life.

Science.gov (United States)

Doralp, Samantha; Bartlett, Doreen

2014-08-01

This paper highlights the development and testing of the Infant Movement Motivation Questionnaire (IMMQ), an instrument designed to evaluate qualities of infant characteristics that relate specifically to early motor development. The measurement development process included three phases: item generation, pilot testing and evaluation of acceptability and feasibility for parents and exploratory factor analysis. The resultant 27-item questionnaire is designed for completion by parents and contains four factors including Activity, Exploration, Motivation and Adaptability. Overall, the internal consistency of the IMMQ is 0.89 (Cronbach's alpha), with test-retest reliability measured at 0.92 (ICC, with 95% CI 0.83-0.96). Further work could be done to strengthen the individual factors; however it is adequate for use in its full form. The IMMQ can be used for clinical or research purposes, as well as an educational tool for parents. Copyright © 2014 Elsevier Inc. All rights reserved.
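For readers unfamiliar with the internal-consistency statistic quoted above, here is a minimal sketch of how Cronbach's alpha is computed from a respondents-by-items score table. The toy data are invented for illustration and have nothing to do with the IMMQ sample; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_columns = list(zip(*scores))                      # one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])     # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented toy data: 5 respondents, 3 items scored 0-3.
toy = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 1],
    [2, 2, 3],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

The sample variance is used for both the items and the total score; using the population variance throughout would give the same alpha, since the correction factor cancels in the ratio.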
A note on monotonicity of item response functions for ordered polytomous item response theory models.

Science.gov (United States)

Kang, Hyeon-Ah; Su, Ya-Hui; Chang, Hua-Hua

2018-03-08

A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales. © 2018 The British Psychological Society.
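The monotonicity claim can be checked numerically for one of the listed models. The sketch below evaluates the expected item score under the generalized partial credit model on a grid of θ values and confirms that it increases at every step. It is a numerical illustration only, not the proof given in the note; the discrimination and threshold values are arbitrary choices, and the function names are hypothetical.

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Category probabilities under the generalized partial credit model.

    Category k has logit sum_{v=1..k} a * (theta - b_v); category 0 has logit 0.
    """
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)                        # subtract the max for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, a, thresholds):
    """Expected item score, i.e. the item's contribution to the true score."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, thresholds)))


# Arbitrary illustrative parameters: discrimination a > 0, three thresholds.
grid = [x / 10 for x in range(-40, 41)]
scores = [expected_score(t, a=1.2, thresholds=[-1.0, 0.3, 1.5]) for t in grid]
is_increasing = all(s1 < s2 for s1, s2 in zip(scores, scores[1:]))
print(is_increasing)
```

For this model the derivative of the expected score with respect to θ equals the discrimination times the conditional variance of the item score, which is positive whenever the discrimination is positive. Summing such item-level curves over a test gives a strictly increasing test characteristic curve.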
FIM-Minimum Data Set Motor Item Bank: Short Forms Development and Precision Comparison in Veterans.

Science.gov (United States)

Li, Chih-Ying; Romero, Sergio; Simpson, Annie N; Bonilha, Heather S; Simpson, Kit N; Hong, Ickpyo; Velozo, Craig A

2018-03-01

To improve the practical use of the short forms (SFs) developed from the item bank, we compared the measurement precision of the 4- and 8-item SFs generated from a motor item bank composed of the FIM and the Minimum Data Set (MDS). The FIM-MDS motor item bank allowed scores generated from different instruments to be co-calibrated. The 4- and 8-item SFs were developed based on Rasch analysis procedures. This article compared person strata, ceiling/floor effects, and test SE plots for each administration form and examined 95% confidence interval error bands of anchored person measures with the corresponding SFs. We used 0.3 SE as a criterion to reflect a reliability level of .90. Veterans' inpatient rehabilitation facilities and community living centers. Veterans (N=2500) who had both FIM and the MDS data within 6 days during 2008 through 2010. Not applicable. Four- and 8-item SFs of FIM, MDS, and FIM-MDS motor item bank. Six SFs were generated with 4 and 8 items across a range of difficulty levels from the FIM-MDS motor item bank. The three 8-item SFs all had higher correlations with the item bank (r=.82-.95), higher person strata, and less test error than the corresponding 4-item SFs (r=.80-.90). The three 4-item SFs did not meet the criteria of SE bank composed of existing instruments across the continuum of care in veterans. We also found that the number of items, not test specificity, determines the precision of the instrument. Copyright © 2017 American Congress of Rehabilitation Medicine. All rights reserved.
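The link between the 0.3 SE criterion and a reliability of about .90 follows from the classical relation SE = SD × sqrt(1 − reliability). The sketch below assumes person measures scaled to a standard deviation of 1; that scaling is an assumption made here for illustration and is not stated in the abstract.

```python
import math


def reliability_from_se(se, sd=1.0):
    """Reliability implied by a standard error, given the score SD."""
    return 1.0 - (se / sd) ** 2


def se_from_reliability(reliability, sd=1.0):
    """Standard error implied by a reliability, given the score SD."""
    return sd * math.sqrt(1.0 - reliability)


r = reliability_from_se(0.3)          # reliability implied by SE = 0.3 when SD = 1
se_90 = se_from_reliability(0.90)     # SE needed for reliability .90 when SD = 1
print(round(r, 2), round(se_90, 3))
```

With a different spread of person measures the same SE would imply a different reliability, so the .90 figure should be read as tied to the scaling used in the study.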
Use of commercial grade item dedication to reduce procurement costs

International Nuclear Information System (INIS)

Rosch, F.

1995-01-01

In the mid-1980s, the Nuclear Regulatory Commission (NRC) began inspecting utility practices of procuring and dedicating commercial grade items intended for plant safety-related applications. As a result of the industry efforts to address NRC concerns, nuclear utilities have enhanced existing programs and procedures for dedication of commercial grade items. Though these programs were originally enhanced to meet NRC concerns, utilities have discovered that the dedication of commercial grade items can also reduce overall procurement costs. This paper discusses the enhancement of utility dedication programs and demonstrates how utilities have used them to reduce procurement costs.
Adult Attachment Ratings (AAR): an item response theory analysis.

Science.gov (United States)

Pilkonis, Paul A; Kim, Yookyung; Yu, Lan; Morse, Jennifer Q

2014-01-01

The Adult Attachment Ratings (AAR) include 3 scales for anxious, ambivalent attachment (excessive dependency, interpersonal ambivalence, and compulsive care-giving), 3 for avoidant attachment (rigid self-control, defensive separation, and emotional detachment), and 1 for secure attachment. The scales include items (ranging from 6-16 in their original form) scored by raters using a 3-point format (0 = absent, 1 = present, and 2 = strongly present) and summed to produce a total score. Item response theory (IRT) analyses were conducted with data from 414 participants recruited from psychiatric outpatient, medical, and community settings to identify the most informative items from each scale. The IRT results allowed us to shorten the scales to 5-item versions that are more precise and easier to rate because of their brevity. In general, the effective range of measurement for the scales was 0 to +2 SDs for each of the attachment constructs; that is, from average to high levels of attachment problems. Evidence for convergent and discriminant validity of the scales was investigated by comparing them with the Experiences of Close Relationships-Revised (ECR-R) scale and the Kobak Attachment Q-sort. The best consensus among self-reports on the ECR-R, informant ratings on the ECR-R, and expert judgments on the Q-sort and the AAR emerged for anxious, ambivalent attachment. Given the good psychometric characteristics of the scale for secure attachment, however, this measure alone might provide a simple alternative to more elaborate procedures for some measurement purposes. Conversion tables are provided for the 7 scales to facilitate transformation from raw scores to IRT-calibrated (theta) scores.
Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

Science.gov (United States)

Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

2017-11-01

The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim to enhance measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes with about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.
Development of an item bank and computer adaptive test for role functioning

DEFF Research Database (Denmark)

Anatchkova, Milena D; Rose, Matthias; Ware, John E

2012-01-01

Role functioning (RF) is a key component of health and well-being and an important outcome in health research. The aim of this study was to develop an item bank to measure impact of health on role functioning.
A Concealed Information Test with multimodal measurement.

Science.gov (United States)

Ambach, Wolfgang; Bursch, Stephanie; Stark, Rudolf; Vaitl, Dieter

2010-03-01

A Concealed Information Test (CIT) investigates differential physiological responses to deed-related (probe) vs. irrelevant items. The present study focused on the detection of concealed information using simultaneous recordings of autonomic and brain electrical measures. As a secondary issue, verbal and pictorial presentations were compared with respect to their influence on the recorded measures. Thirty-one participants underwent a mock-crime scenario with a combined verbal and pictorial presentation of nine items. The subsequent CIT, designed with respect to event-related potential (ERP) measurement, used a 3-3.5s interstimulus interval. The item presentation modality, i.e. pictures or written words, was varied between subjects; no response was required from the participants. In addition to electroencephalogram (EEG), electrodermal activity (EDA), electrocardiogram (ECG), respiratory activity, and finger plethysmogram were recorded. A significant probe-vs.-irrelevant effect was found for each of the measures. Compared to sole ERP measurement, the combination of ERP and EDA yielded incremental information for detecting concealed information. EDA per se, however, did not reach the predictive value known from studies primarily designed for peripheral physiological measurement. Presentation modality influenced the detection accuracy of neither autonomic nor EEG measures; this underpins the equivalence of verbal and pictorial item presentation in a CIT, regardless of the physiological measures recorded. Future studies should further clarify whether the incremental validity observed in the present study reflects a differential sensitivity of ERP and EDA to different sub-processes in a CIT. Copyright 2009 Elsevier B.V. All rights reserved.
The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

Science.gov (United States)

Siskind, Theresa G.; Anderson, Lorin W.

The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…
Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients

DEFF Research Database (Denmark)

Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J. B.

2017-01-01

on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). METHODS: In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients...... model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study...... sample sizes with about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. CONCLUSION: A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient...
Item-level informant discrepancies across obese-overweight children and their parents on the PedsQL™ 4.0 instrument: an iterative hybrid ordinal logistic regression.

Science.gov (United States)

Jafari, Peyman; Allahyari, Elahe; Salarzadeh, Mina; Bagheri, Zahra

2016-01-01

Child obesity has become a major health concern worldwide. In order to provide successful intervention strategies, it is necessary to understand how obese-overweight children and their parents perceive obesity and its consequences on child's health-related quality of life (HRQoL). This study aimed to assess measurement equivalence of the PedsQL™ 4.0 across obese-overweight children and their parents. The items in the PedsQL™ 4.0 were analysed for differential item functioning (DIF) across obese-overweight children and their parents using an iterative hybrid ordinal logistic regression/item response theory approach. The sample included 647 overweight-obese children and their parents, who completed child and parent reports of the PedsQL™ 4.0, respectively. Overall, 17 out of 23 (74%) items were flagged with DIF across two groups: eight items exhibited uniform DIF and nine items non-uniform DIF. In addition, parents of obese children rated the child's HRQoL significantly lower than their children in all domains of the PedsQL™ 4.0, and this finding did not change whether or not items with uniform DIF were included. Although obese-overweight children and their parents interpret items of the PedsQL™ 4.0 in a conceptually different manner, removing or retaining DIF items in the subscales had no significant effects on group differences. Accordingly, it appears that observed differences in HRQoL scores across child and parent reports are a true difference and not a reflection of measurement artefact.

Validity of the Neuromuscular Recovery Scale: a measurement model approach.

Science.gov (United States)

Velozo, Craig; Moorhouse, Michael; Ardolino, Elizabeth; Lorenz, Doug; Suter, Sarah; Basso, D Michele; Behrman, Andrea L

2015-08-01

To determine how well the Neuromuscular Recovery Scale (NRS) items fit the Rasch, 1-parameter, partial-credit measurement model. Confirmatory factor analysis (CFA) and principal components analysis (PCA) of residuals were used to determine dimensionality. The Rasch, 1-parameter, partial-credit rating scale model was used to determine rating scale structure, person/item fit, point-measure item correlations, item discrimination, and measurement precision. Seven NeuroRecovery Network clinical sites. Outpatients (N=188) with spinal cord injury. Not applicable. NRS. While the NRS met 1 of 3 CFA criteria, the PCA revealed that the Rasch measurement dimension explained 76.9% of the variance. Ten of 11 items and 91% of the patients fit the Rasch model, with 9 of 11 items showing high discrimination. Sixty-nine percent of the ratings met criteria. The items showed a logical item-difficulty order, with Stand retraining as the easiest item and Walking as the most challenging item. The NRS showed no ceiling or floor effects and separated the sample into almost 5 statistically distinct strata; individuals with an American Spinal Injury Association Impairment Scale (AIS) D classification showed the most ability, and those with an AIS A classification showed the least ability. Items not meeting the rating scale criteria appear to be related to the low frequency counts. The NRS met many of the Rasch model criteria for construct validity. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Measuring perceptions related to e-cigarettes: Important principles and next steps to enhance study validity.

Science.gov (United States)

Gibson, Laura A; Creamer, MeLisa R; Breland, Alison B; Giachello, Aida Luz; Kaufman, Annette; Kong, Grace; Pechacek, Terry F; Pepper, Jessica K; Soule, Eric K; Halpern-Felsher, Bonnie

2018-04-01

Measuring perceptions associated with e-cigarette use can provide valuable information to help explain why youth and adults initiate and continue to use e-cigarettes. However, given the complexity of e-cigarette devices and their continuing evolution, measures of perceptions of this product have varied greatly. Our goal, as members of the working group on e-cigarette measurement within the Tobacco Centers of Regulatory Science (TCORS) network, is to provide guidance to researchers developing surveys concerning e-cigarette perceptions. We surveyed the 14 TCORS sites and received and reviewed 371 e-cigarette perception items from seven sites. We categorized the items based on types of perceptions asked, and identified measurement approaches that could enhance data validity and approaches that researchers may consider avoiding. The committee provides suggestions in four areas: (1) perceptions of benefits, (2) harm perceptions, (3) addiction perceptions, and (4) perceptions of social norms. Across these 4 areas, the most appropriate way to assess e-cigarette perceptions depends largely on study aims. The type and number of items used to examine e-cigarette perceptions will also vary depending on respondents' e-cigarette experience (i.e., user vs. non-user), level of experience (e.g., experimental vs. established), type of e-cigarette device (e.g., cig-a-like, mod), and age. Continuous formative work is critical to adequately capture perceptions in response to the rapidly changing e-cigarette landscape. Most important, it is imperative to consider the unique perceptual aspects of e-cigarettes, building on the conventional cigarette literature as appropriate, but not relying on existing conventional cigarette perception items without adjustment. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dissociative effects of orthographic distinctiveness in pure and mixed lists: an item-order account.

Science.gov (United States)

McDaniel, Mark A; Cahill, Michael; Bugg, Julie M; Meadow, Nathaniel G

2011-10-01

We apply the item-order theory of list composition effects in free recall to the orthographic distinctiveness effect. The item-order account assumes that orthographically distinct items advantage item-specific encoding in both mixed and pure lists, but at the expense of exploiting relational information present in the list. Experiment 1 replicated the typical free recall advantage of orthographically distinct items in mixed lists and the elimination of that advantage in pure lists. Supporting the item-order account, recognition performances indicated that orthographically distinct items received greater item-specific encoding than did orthographically common items in mixed and pure lists (Experiments 1 and 2). Furthermore, order memory (input-output correspondence and sequential contiguity effects) was evident in recall of pure unstructured common lists, but not in recall of unstructured distinct lists (Experiment 1). These combined patterns, although not anticipated by prevailing views, are consistent with an item-order account.
Generalizability theory and item response theory

NARCIS (Netherlands)

Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

2012-01-01

Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a
A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

Science.gov (United States)

Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

2018-02-23

The validity of the CAGE using item response theory (IRT) has not yet been examined in the older adult population. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA), dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed the overall scalability H index was 0.459, indicating a medium performing instrument. All items were found to be homogenous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and the ones with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for 2-item across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and the precision of the cut-off scores in the older adult population.
Intentional forgetting reduces color-naming interference: evidence from item-method directed forgetting.

Science.gov (United States)

Lee, Yuh-Shiow; Lee, Huang-Mou; Fawcett, Jonathan M

2013-01-01

In an item-method-directed forgetting task, Chinese words were presented individually, each followed by an instruction to remember or forget. Colored probe items were presented following each memory instruction requiring a speeded color-naming response. Half of the probe items were novel and unrelated to the preceding study item, whereas the remaining half of the probe items were a repetition of the preceding study item. Repeated probe items were either identical to the preceding study item (E1, E2), a phonetic reproduction of the preceding study item (E3), or perceptually matched to the preceding study item (E4). Color-naming interference was calculated by subtracting color-naming reaction times made in response to a string of meaningless symbols from that of the novel and repeated conditions. Across all experiments, participants recalled more to-be-remembered (TBR) than to-be-forgotten (TBF) study words. More importantly, Experiments 1 and 2 found that color-naming interference was reduced for repeated TBF words relative to repeated TBR words. Experiments 3 and 4 further found that this effect occurred at the perceptual rather than semantic level. These findings suggest that participants may bias processing resources away from the perceptual representation of to-be-forgotten information.
Dissociating the neural correlates of intra-item and inter-item working-memory binding.

Directory of Open Access Journals (Sweden)

Carinne Piekema

BACKGROUND: Integration of information streams into a unitary representation is an important task of our cognitive system. Within working memory, the medial temporal lobe (MTL) has been conceptually linked to the maintenance of bound representations. In a previous fMRI study, we have shown that the MTL is indeed more active during working-memory maintenance of spatial associations as compared to non-spatial associations or single items. There are two explanations for this result: either the mere presence of the spatial component activates the MTL, or the MTL is recruited to bind associations between neurally non-overlapping representations. METHODOLOGY/PRINCIPAL FINDINGS: The current fMRI study investigates this issue further by directly comparing intrinsic intra-item binding (object/colour), extrinsic intra-item binding (object/location), and inter-item binding (object/object). The three binding conditions resulted in differential activation of brain regions. Specifically, we show that the MTL is important for establishing extrinsic intra-item associations and inter-item associations, in line with the notion that binding of information processed in different brain regions depends on the MTL. CONCLUSIONS/SIGNIFICANCE: Our findings indicate that different forms of working-memory binding rely on specific neural structures. In addition, these results extend previous reports indicating that the MTL is implicated in working-memory maintenance, challenging the classic distinction between short-term and long-term memory systems.
Generalizability theory and item response theory

OpenAIRE

Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

2012-01-01

Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...
Using item response theory to address vulnerabilities in FFQ.

Science.gov (United States)

Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

2017-09-01

The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.
A Methodological Study of Order Effects in Reporting Relational Aggression Experiences.

Science.gov (United States)

Serico, Jennifer M; NeMoyer, Amanda; Goldstein, Naomi E S; Houck, Mark; Leff, Stephen S

2018-03-01

Unlike the overt nature of physical aggression, which lends itself to simpler and more direct methods of investigation, the often-masked nature of relational aggression has led to difficulties and debate regarding the most effective tools of study. Given concerns with the accuracy of third-party relational aggression reports, especially as individuals age, self-report measures may be particularly useful when assessing experiences with relational aggression. However, it is important to recognize validity concerns-in particular, the potential effects of item order presentation-associated with self-report of relational aggression perpetration and victimization. To investigate this issue, surveys were administered and completed by 179 young adults randomly assigned to one of four survey conditions reflecting manipulation of item order. Survey conditions included presentation of (a) perpetration items only, (b) victimization items only, (c) perpetration items followed by victimization items, and (d) victimization items followed by perpetration items. Results revealed that participants reported perpetrating relational aggression significantly more often when asked only about perpetration or when asked about perpetration before victimization, compared with participants who were asked about victimization before perpetration. Item order manipulation did not result in significant differences in self-reported victimization experiences. Results of this study indicate a need for greater consideration of item order when conducting research using self-report data and the importance of additional investigation into which form of item presentation elicits the most accurate self-report information.
Beyond the Shadow of a Trait: Understanding Discounting through Item-Level Analysis of Personality Scales

Science.gov (United States)

Charlton, Shawn R.; Gossett, Bradley D.; Charlton, Veda A.

2011-01-01

Temporal discounting, the loss in perceived value associated with delayed outcomes, correlates with a number of personality measures, suggesting that an item-level analysis of trait measures might provide a more detailed understanding of discounting. The current report details two studies that investigate the utility of such an item-level…
Development and validation of a 6-item working alliance questionnaire for repeated administrations during psychotherapy.

Science.gov (United States)

Falkenström, Fredrik; Hatcher, Robert L; Skjulsvik, Tommy; Larsson, Mattias Holmqvist; Holmqvist, Rolf

2015-03-01

Recently, researchers have started to measure the working alliance repeatedly across sessions of psychotherapy, relating the working alliance to symptom change session by session. Responding to questionnaires after each session can become tedious, leading to careless responses and/or increasing levels of missing data. Therefore, assessment with the briefest possible instrument is desirable. Because previous research on the Working Alliance Inventory has found the separation of the Goal and Task factors problematic, the present study examined the psychometric properties of a 2-factor, 6-item working alliance measure, adapted from the Working Alliance Inventory, in 3 patient samples (ns = 1,095, 235, and 234). Results showed that a bifactor model fit the data well across the 3 samples, and the factor structure was stable across 10 sessions of primary care counseling/psychotherapy. Although the bifactor model with 1 general and 2 specific factors outperformed the 1-factor model in terms of model fit, dimensionality analyses based on the bifactor model results indicated that in practice the instrument is best treated as unidimensional. Results support the use of composite scores of all 6 items. The instrument was validated by replicating previous findings of session-by-session prediction of symptom reduction using the Autoregressive Latent Trajectory model. The 6-item working alliance scale, called the Session Alliance Inventory, is a promising alternative for researchers in search of a brief alliance measure to administer after every session. 2015 APA, all rights reserved.
Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

Science.gov (United States)

Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

2016-01-01

In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Item Effects in Recognition Memory for Words

Science.gov (United States)

Freeman, Emily; Heathcote, Andrew; Chalmers, Kerry; Hockley, William

2010-01-01

We investigate the effects of word characteristics on episodic recognition memory using analyses that avoid Clark's (1973) "language-as-a-fixed-effect" fallacy. Our results demonstrate the importance of modeling word variability and show that episodic memory for words is strongly affected by item noise (Criss & Shiffrin, 2004), as measured by the…
Emotional vitality in caregivers: application of Rasch Measurement Theory with secondary data to development and test a new measure.

Science.gov (United States)

Barbic, Skye P; Bartlett, Susan J; Mayo, Nancy E

2015-07-01

To describe the practical steps in identifying items and evaluating scoring strategies for a new measure of emotional vitality in informal caregivers of individuals who have experienced a significant health event. The psychometric properties of responses to selected items from validated health-related quality of life and other psychosocial questionnaires administered four times over a one-year period were evaluated using Rasch Measurement Theory. Community. A total of 409 individuals providing informal care at home to older adults who had experienced a recent stroke. Rasch Measurement Theory was used to test the ordering of response option thresholds, fit, spread of the item locations, residual correlations, person separation index, and stability across time. Based on a theoretical framework developed in earlier work, we identified 22 candidate items from a pool of relevant psychosocial measures available. Of these, additional evaluation resulted in 19 items that could be used to assess the five core domains. The overall model fit was reasonable (χ² = 202.26, df = 117, p = 0.06), stable across time, with borderline evidence of multidimensionality (10%). Items and people covered a continuum ranging from -3.7 to +2.7 logits, reflecting coverage of the measurement continuum, with a person separation index of 0.85. Mean fit of caregivers was lower than expected (-1.31 ±1.10 logits). Established methods from the Rasch Measurement Theory were applied to develop a prototype measure of emotional vitality that is acceptable, reliable, and can be used to obtain an interval level score for use in future research and clinical settings. © The Author(s) 2014.
Item response theory analysis of the Pain Self-Efficacy Questionnaire.

Science.gov (United States)

Costa, Daniel S J; Asghari, Ali; Nicholas, Michael K

2017-01-01

The Pain Self-Efficacy Questionnaire (PSEQ) is a 10-item instrument designed to assess the extent to which a person in pain believes s/he is able to accomplish various activities despite their pain. There is strong evidence for the validity and reliability of both the full-length PSEQ and a 2-item version. The purpose of this study is to further examine the properties of the PSEQ using an item response theory (IRT) approach. We used the two-parameter graded response model to examine the category probability curves, and location and discrimination parameters of the 10 PSEQ items. In item response theory, responses to a set of items are assumed to be probabilistically determined by a latent (unobserved) variable. In the graded-response model specifically, item response threshold (the value of the latent variable for which adjacent response categories are equally likely) and discrimination parameters are estimated for each item. Participants were 1511 mixed, chronic pain patients attending for initial assessment at a tertiary pain management centre. All items except item 7 ('I can cope with my pain without medication') performed well in IRT analysis, and the category probability curves suggested that participants used the 7-point response scale consistently. Items 6 ('I can still do many of the things I enjoy doing, such as hobbies or leisure activity, despite pain'), 8 ('I can still accomplish most of my goals in life, despite the pain') and 9 ('I can live a normal lifestyle, despite the pain') captured higher levels of the latent variable with greater precision. The results from this IRT analysis add to the body of evidence based on classical test theory illustrating the strong psychometric properties of the PSEQ. Despite the relatively poor performance of Item 7, its clinical utility warrants its retention in the questionnaire. The strong psychometric properties of the PSEQ support its use as an effective tool for assessing self-efficacy in people with pain
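The graded response model described in the abstract above can be sketched in a few lines. In the standard Samejima formulation, each threshold marks the trait level at which the probability of responding in a given category or higher is 0.5, and category probabilities are differences of adjacent cumulative curves. The function name `grm_category_probs` and the parameter values are hypothetical; the six thresholds merely mimic a 7-point response scale and are not estimates from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    thresholds must be in ascending order; K - 1 thresholds give K categories.
    """
    # P(X >= k) for k = 0..K, with the boundary values 1 and 0 added at the ends.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # P(X = k) is the drop between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical item with six thresholds (seven response categories).
probs = grm_category_probs(theta=0.5, a=2.0,
                           thresholds=[-2.0, -1.2, -0.4, 0.4, 1.2, 2.0])
print([round(p, 3) for p in probs])
```

Plotting these probabilities over a grid of `theta` values yields the category probability curves that the abstract refers to.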
Development and validation of a ten-item questionnaire with explanatory illustrations to assess upper extremity disorders: favorable effect of illustrations in the item reduction process.

Science.gov (United States)

Kurimoto, Shigeru; Suzuki, Mikako; Yamamoto, Michiro; Okui, Nobuyuki; Imaeda, Toshihiko; Hirata, Hitoshi

2011-11-01

The purpose of this study is to develop a short and valid measure for upper extremity disorders and to assess the effect of attached illustrations in item reduction of a self-administered disability questionnaire while retaining psychometric properties. A validated questionnaire used to assess upper extremity disorders, the Hand20, was reduced to ten items using two item-reduction techniques. The psychometric properties of the abbreviated form, the Hand10, were evaluated on an independent sample that was used for the shortening process. Validity, reliability, and responsiveness of the Hand10 were retained in the item reduction process. It was possible that the use of explanatory illustrations attached to the Hand10 helped with its reproducibility. The illustrations for the Hand10 promoted text comprehension and motivation to answer the items. These changes resulted in high acceptability; more than 99.3% of patients, including 98.5% of elderly patients, could complete the Hand10 properly. The illustrations had favorable effects on the item reduction process and made it possible to retain precision of the instrument. The Hand10 is a reliable and valid instrument for individual-level applications with the advantage of being compact and broadly applicable, even in elderly individuals.
The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings.

Science.gov (United States)

Grigg, Kaine; Manderson, Lenore

2016-03-17

Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed. Here, we examine RACES' psychometric properties, including the latent structure, utilising Item Response Theory (IRT). Unidimensional and Multidimensional Rating Scale Model (RSM) Rasch analyses were utilised with 296 Victorian primary school students and 182 adolescents and 220 adults from the Australian community. RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults. RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.
The randomly renewed general item and the randomly inspected item with exponential life distribution

International Nuclear Information System (INIS)

Schneeweiss, W.G.

1979-01-01

For a randomly renewed item the probability distributions of the time to failure and of the duration of down time and the expectations of these random variables are determined. Moreover, it is shown that the same theory applies to randomly checked items with exponential probability distribution of life such as electronic items. The case of periodic renewals is treated as an example.
Prevalence of item level negative symptoms in first episode psychosis diagnoses.

LENUS (Irish Health Repository)

Lyne, John

2012-03-01

The relevance of negative symptoms across the diagnostic spectrum of the psychoses remains uncertain. The purpose of this study was to report on prevalence of item and subscale level negative symptoms across the first episode psychosis (FEP) diagnostic spectrum in an epidemiological sample, and to ascertain whether items and subscales were more prevalent in a schizophrenia spectrum diagnoses group compared to an 'all other psychotic diagnoses' group. We measured negative symptoms in 330 patients presenting with FEP using the Scale for Assessment of Negative Symptoms (SANS), and ascertained diagnosis using the Structured Clinical Interview for DSM IV. Prevalence of SANS items and subscales were tabulated across all psychotic diagnoses, and logistic regression analysis determined which items and subscales were predictive of schizophrenia spectrum diagnoses. SANS items were most prevalent in schizophrenia spectrum conditions but frequently presented in other FEP diagnoses, particularly substance induced psychotic disorder and Major Depressive Disorder. Brief psychotic disorder and bipolar disorders had low levels of negative symptoms. SANS items and subscales which significantly predicted schizophrenia spectrum diagnoses were also frequently present in some of the other psychotic diagnoses. Conclusions: SANS items have high prevalence in FEP, and while commonest in schizophrenia spectrum conditions are not restricted to this diagnostic subgroup.

Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index.

Science.gov (United States)

Roelen, Corné A M; van Rhenen, Willem; Groothoff, Johan W; van der Klink, Jac J L; Twisk, Jos W R; Heymans, Martijn W

2014-07-01

Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. This prospective cohort study comprised 11 537 male construction workers, who completed the WAI at baseline and reported DP after a mean 2.3 years of follow-up. WAS and WAI were calibrated for DP risk predictions with the Hosmer-Lemeshow (H-L) test and their ability to discriminate between high- and low-risk construction workers was investigated with the area under the receiver operating characteristic curve (AUC). At follow-up, 336 (3%) construction workers reported DP. Both WAS [odds ratio (OR) 0.72, 95% confidence interval (95% CI) 0.66-0.78] and WAI (OR 0.57, 95% CI 0.52-0.63) scores were associated with DP at follow-up. The WAS showed miscalibration (H-L model χ² = 10.60; df = 3; P = 0.01) and poorly discriminated between high- and low-risk construction workers (AUC 0.67, 95% CI 0.64-0.70). In contrast, calibration (H-L model χ² = 8.20; df = 8; P = 0.41) and discrimination (AUC 0.78, 95% CI 0.75-0.80) were both adequate for the WAI. Although associated with the risk of future DP, the single-item WAS poorly identified male construction workers at risk of DP. We recommend using the multi-item WAI to screen for risk of DP in occupational health practice.
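The discrimination statistic reported above, the AUC, has a simple interpretation: it is the probability that a randomly chosen worker who later reports DP was assigned a higher predicted risk than a randomly chosen worker who does not, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auc_pairwise` and the risk values are hypothetical and are not data from the study.

```python
def auc_pairwise(risk_cases, risk_noncases):
    """AUC as the share of (case, non-case) pairs ranked correctly.

    Ties contribute 0.5. Quadratic in the sample sizes, so suited to
    small illustrative inputs only.
    """
    wins = 0.0
    for case in risk_cases:
        for noncase in risk_noncases:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(risk_cases) * len(risk_noncases))


# Hypothetical predicted DP risks for three cases and four non-cases.
print(auc_pairwise([0.9, 0.6, 0.4], [0.5, 0.3, 0.2, 0.1]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.67 and 0.78 are read.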
Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

Directory of Open Access Journals (Sweden)

Zahra Sharafi

2017-01-01

Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention, but very few studies evaluated the effect of hierarchical structure of data on DIF detection for polytomously scored items. Methods. The ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.
Scoring best-worst data in unbalanced many-item designs, with applications to crowdsourcing semantic judgments.

Science.gov (United States)

Hollis, Geoff

2018-04-01

Best-worst scaling is a judgment format in which participants are presented with a set of items and have to choose the superior and inferior items in the set. Best-worst scaling generates a large quantity of information per judgment because each judgment allows for inferences about the rank value of all unjudged items. This property of best-worst scaling makes it a promising judgment format for research in psychology and natural language processing concerned with estimating the semantic properties of tens of thousands of words. A variety of different scoring algorithms have been devised in the previous literature on best-worst scaling. However, due to problems of computational efficiency, these scoring algorithms cannot be applied efficiently to cases in which thousands of items need to be scored. New algorithms are presented here for converting responses from best-worst scaling into item scores for thousands of items (many-item scoring problems). These scoring algorithms are validated through simulation and empirical experiments, and considerations related to noise, the underlying distribution of true values, and trial design are identified that can affect the relative quality of the derived item scores. The newly introduced scoring algorithms consistently outperformed scoring algorithms used in the previous literature on scoring many-item best-worst data.
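For orientation, the simplest way to score best-worst data is the counting approach: for each item, the number of times it was chosen as best minus the number of times it was chosen as worst, divided by the number of trials in which it appeared. The sketch below implements only that baseline; it is not one of the new algorithms the abstract introduces, which are not specified there. The function name `best_worst_counts` and the trial data are hypothetical.

```python
from collections import Counter


def best_worst_counts(trials):
    """Counting score per item: (times best - times worst) / times shown.

    trials is an iterable of (items_shown, best_item, worst_item) tuples.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, best_item, worst_item in trials:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Hypothetical trials, each showing four of the five items a..e.
trials = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "e"), "a", "e"),
    (("b", "c", "d", "e"), "b", "d"),
]
print(best_worst_counts(trials))
```

Scores fall between -1 (always chosen as worst) and +1 (always chosen as best). Dividing by the number of appearances matters in unbalanced designs, where items are not shown equally often.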
17 CFR 260.7a-16 - Inclusion of items, differentiation between items and answers, omission of instructions.

Science.gov (United States)

2010-04-01

... 17 Commodity and Securities Exchanges 3 2010-04-01 2010-04-01 false Inclusion of items, differentiation between items and answers, omission of instructions. 260.7a-16 Section 260.7a-16 Commodity and... INDENTURE ACT OF 1939 Formal Requirements § 260.7a-16 Inclusion of items, differentiation between items and...
Developing an African youth psychosocial assessment: an application of item response theory.

Science.gov (United States)

Betancourt, Theresa S; Yang, Frances; Bolton, Paul; Normand, Sharon-Lise

2014-06-01

This study aimed to refine a dimensional scale for measuring psychosocial adjustment in African youth using item response theory (IRT). A 60-item scale derived from qualitative data was administered to 667 war-affected adolescents (55% female). Exploratory factor analysis (EFA) determined the dimensionality of items based on goodness-of-fit indices. Items with loadings less than 0.4 were dropped. Confirmatory factor analysis (CFA) was used to confirm the scale's dimensionality found under the EFA. Item discrimination and difficulty were estimated using a graded response model for each subscale using weighted least squares means and variances. Predictive validity was examined through correlations between IRT scores (θ) for each subscale and ratings of functional impairment. All models were assessed using goodness-of-fit and comparative fit indices. Fisher's Information curves examined item precision at different underlying ranges of each trait. Original scale items were optimized and reconfigured into an empirically-robust 41-item scale, the African Youth Psychosocial Assessment (AYPA). Refined subscales assess internalizing and externalizing problems, prosocial attitudes/behaviors and somatic complaints without medical cause. The AYPA is a refined dimensional assessment of emotional and behavioral problems in African youth with good psychometric properties. Validation studies in other cultures are recommended. Copyright © 2014 John Wiley & Sons, Ltd.
Linguistic Simplification of Mathematics Items: Effects for Language Minority Students in Germany

Science.gov (United States)

Haag, Nicole; Heppt, Birgit; Roppelt, Alexander; Stanat, Petra

2015-01-01

In large-scale assessment studies, language minority students typically obtain lower test scores in mathematics than native speakers. Although this performance difference was related to the linguistic complexity of test items in some studies, other studies did not find linguistically demanding math items to be disproportionally more difficult for…
Investigating the Construct Measured by Banked Gap-Fill Items: Evidence from Eye-Tracking

Science.gov (United States)

McCray, Gareth; Brunfaut, Tineke

2018-01-01

This study investigates test-takers' processing while completing banked gap-fill tasks, designed to test reading proficiency, in order to test theoretically based expectations about the variation in cognitive processes of test-takers across levels of performance. Twenty-eight test-takers' eye traces on 24 banked gap-fill items (on six tasks) were…
The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.

Science.gov (United States)

Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D

2016-12-01

The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87% of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5, p …), and Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12 .
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

Science.gov (United States)

Sahin, Alper; Anil, Duygu

2017-01-01

This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
Commercial grade item (CGI) dedication of MDR relays for nuclear safety related applications

Science.gov (United States)

Das, Ranjit K.; Julka, Anil; Modi, Govind

1994-08-01

MDR relays manufactured by Potter & Brumfield (P&B) have been used in various safety related applications in commercial nuclear power plants. These include emergency safety features (ESF) actuation systems, emergency core cooling systems (ECCS) actuation, and reactor protection systems. The MDR relays manufactured prior to May 1990 showed signs of generic failure due to corrosion and outgassing of coil varnish. P&B has made design changes to correct these problems in relays manufactured after May 1990. However, P&B does not manufacture the relays under any 10CFR50 Appendix B quality assurance (QA) program. They manufacture the relays under their commercial QA program and supply these as commercial grade items. This necessitates CGI Dedication of these relays for use in nuclear-safety-related applications. This paper presents a CGI dedication program that has been used to dedicate the MDR relays manufactured after May 1990. The program is in compliance with current Nuclear Regulatory Commission (NRC) and Electric Power Research Institute (EPRI) guidelines and applicable industry standards; it specifies the critical characteristics of the relays, provides the tests and analysis required to verify the critical characteristics and the acceptance criteria for the test results, performs source verification to qualify P&B for its control of the critical characteristics, and provides documentation. The program provides reasonable assurance that the new MDR relays will perform their intended safety functions.
Estimating reliability coefficients with heterogeneous item weightings using Stata: A factor based approach

NARCIS (Netherlands)

Boermans, M.A.; Kattenberg, M.A.C.

2011-01-01

We show how to estimate a Cronbach's alpha reliability coefficient in Stata after running a principal component or factor analysis. Alpha evaluates to what extent items measure the same underlying content when the items are combined into a scale or used for a latent variable. Stata allows for testing
Approximation Preserving Reductions among Item Pricing Problems

Science.gov (United States)

Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

When a store sells items to customers, the store wishes to determine the prices of the items to maximize its profit. Intuitively, if the store sells the items with low (resp. high) prices, the customers buy more (resp. less) items, which provides less profit to the store. So it would be hard for the store to decide the prices of items. Assume that the store has a set V of n items and there is a set E of m customers who wish to buy those items, and also assume that each item i ∈ V has the production cost d_i and each customer e_j ∈ E has the valuation v_j on the bundle e_j ⊆ V of items. When the store sells an item i ∈ V at the price r_i, the profit for the item i is p_i = r_i - d_i. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. In most of the previous works, the item pricing problem was considered under the assumption that p_i ≥ 0 for each i ∈ V; however, Balcan, et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of “loss-leader,” and showed that the seller can get more total profit in the case that p_i < 0 is allowed than in the case that p_i < 0 is not allowed. In this paper, we derive approximation preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratio.
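The profit definition in this abstract can be illustrated with a minimal Python sketch. It assumes a purchase rule the abstract does not state: a customer buys the whole bundle exactly when the summed prices do not exceed the customer's valuation. The function name `total_profit` and the two-item instance are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Tuple

Customer = Tuple[FrozenSet[str], float]  # (bundle of item ids, valuation)


def total_profit(prices: Dict[str, float],
                 costs: Dict[str, float],
                 customers: List[Customer]) -> float:
    """Sum of per-item profits (price - cost) over all bundles that are sold.

    Assumption (not from the abstract): a customer buys the whole bundle
    if and only if its total price is at most the customer's valuation.
    """
    profit = 0.0
    for bundle, valuation in customers:
        bundle_price = sum(prices[i] for i in bundle)
        if bundle_price <= valuation:
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit


# Hypothetical toy instance: two items, two customers.
costs = {"a": 1.0, "b": 2.0}
customers = [(frozenset({"a"}), 3.0), (frozenset({"a", "b"}), 4.0)]

at_or_above_cost = total_profit({"a": 2.0, "b": 2.0}, costs, customers)
with_loss_leader = total_profit({"a": 3.0, "b": 1.0}, costs, customers)
print(at_or_above_cost, with_loss_leader)
```

In this toy instance, pricing item `b` below its cost lets the store charge more for item `a` while still selling both bundles, which is the kind of effect the loss-leader notion refers to.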
Which dimensions of disability does the HIV Disability Questionnaire (HDQ) measure? A factor analysis.

Science.gov (United States)

O'Brien, Kelly K; Bayoumi, Ahmed M; Stratford, Paul; Solomon, Patricia

2015-01-01

To assess the dimensions of disability measured by the HIV Disability Questionnaire (HDQ), a newly developed 72-item self-administered questionnaire that describes the presence, severity and episodic nature of disability experienced by people living with HIV. We recruited adults living with HIV from hospital clinics, AIDS service organizations and a specialty hospital and administered the HDQ followed by a demographic questionnaire. We conducted an exploratory factor analysis using disability severity scores to determine the domains of disability in the HDQ. We used the following steps: (a) ensured correlations between items were >0.30 and 1.5 to determine the number of factors to retain; and d) used oblique rotation to simplify the factor loading matrix. We assigned items to factors based on factor loadings of >0.30. Of the 361 participants, 80% were men and 77% reported living with at least two concurrent health conditions in addition to HIV. The exploratory factor analysis suggested retaining six factors. Items related to symptoms and impairments loaded on three factors (physical [20 items], cognitive [3 items], and mental and emotional health [11 items]) and items related to worrying about the future, daily activities, and personal relationships loaded on three additional factors (uncertainty [14 items], difficulties with day-to-day activities [9 items], social inclusion [12 items]). The HDQ has six domains: physical symptoms and impairments; cognitive symptoms and impairments; mental and emotional health symptoms and impairments; uncertainty; difficulties with day-to-day activities and challenges to social inclusion. These domains establish the scoring structure for the dimensions of disability measured by the HDQ. Implications for Rehabilitation As individuals live longer and age with HIV, they may be living with the health-related consequences of HIV and concurrent health conditions, a concept that may be termed disability. Measuring disability is important
Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

Science.gov (United States)

Sinharay, Sandip

2017-09-01

Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.
Brain activity is related to individual differences in the number of items stored in auditory short-term memory for pitch: evidence from magnetoencephalography.

Science.gov (United States)

Grimault, Stephan; Nolden, Sophie; Lefebvre, Christine; Vachon, François; Hyde, Krista; Peretz, Isabelle; Zatorre, Robert; Robitaille, Nicolas; Jolicoeur, Pierre

2014-07-01

We used magnetoencephalography (MEG) to examine brain activity related to the maintenance of non-verbal pitch information in auditory short-term memory (ASTM). We focused on brain activity that increased with the number of items effectively held in memory by the participants during the retention interval of an auditory memory task. We used very simple acoustic materials (i.e., pure tones that varied in pitch) that minimized activation from non-ASTM related systems. MEG revealed neural activity in frontal, temporal, and parietal cortices that increased with a greater number of items effectively held in memory by the participants during the maintenance of pitch representations in ASTM. The present results reinforce the functional role of frontal and temporal cortices in the retention of pitch information in ASTM. This is the first MEG study to provide both fine spatial localization and temporal resolution on the neural mechanisms of non-verbal ASTM for pitch in relation to individual differences in the capacity of ASTM. This research contributes to a comprehensive understanding of the mechanisms mediating the representation and maintenance of basic non-verbal auditory features in the human brain. Copyright © 2014 Elsevier Inc. All rights reserved.
Factoring handedness data: I. Item analysis.

Science.gov (United States)

Messinger, H B; Messinger, M I

1995-12-01

Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for a handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere, based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.
Use of indicator items to monitor marine debris on a New Jersey beach from 1991 to 1996

Science.gov (United States)

Ribic, C.A.

1998-01-01

The US National Marine Debris Monitoring Program is using indicator items from beach surveys to identify whether amounts of marine debris are changing over time. Indicator items were selected through expert opinion and assumed to reflect the trend of all debris. We used monthly data from a 1991-1996 study of debris on a New Jersey beach to determine if indicator and non-indicator items showed similar trends. Total indicator debris levels did not change; this was true regardless of probable source. Non-indicator debris increased about 40% annually. Plastic non-indicator items increased regardless of whether items were whole items, cigarette filters, or pieces. Of the whole items, almost 50% were plastic lids, cups, and utensils, and about 25% were drug-related paraphernalia, tobacco-related products, plastic stirrers, pull rings, and fireworks. When indicator items are used in a monitoring programme to reflect total debris patterns, concordance of trends in indicator and non-indicator debris should be checked.
Item Modeling Concept Based on Multimedia Authoring

Directory of Open Access Journals (Sweden)

Janez Stergar

2008-09-01

This paper introduces a modern item design framework for computer-based assessment built on the Flash authoring environment. Question design is discussed, with emphasis on the multimedia authoring environment used for item modeling. Item type templates are a structured means of collecting and storing item information that can be used to improve the efficiency and security of the innovative item design process. Templates can modernize item design and enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The introduced item design template is based on a taxonomy of innovative items, which have great potential for expanding the content areas and construct coverage of an assessment. The presented item design approach is based on two GUIs: one for question design based on the implemented item design templates and one for user interaction tracking/retrieval. The concept of user interfaces based on Flash technology is discussed, as is the implementation of the item design forms with multimedia authoring. An innovative method for user interaction storage/retrieval, based on PHP extending Flash capabilities in the proposed framework, is also introduced.
A strategy for optimizing item-pool management

NARCIS (Netherlands)

Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

2006-01-01

Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item
An Investigation of "Cloze" Items in the Measurement of Achievement in Foreign Languages.

Science.gov (United States)

Carroll, John B.; And Others

This study investigates the feasibility of using cloze procedure test items (in which a student supplies a word, letter, or phrase to fill a gap in a continuous text) for the written College Board foreign language achievement tests. An introduction which defines the problem, traces its history, and presents the overall design of the study is…

Item Construction Using Reflective, Formative, or Rasch Measurement Models: Implications for Group Work

Science.gov (United States)

Peterson, Christina Hamme; Gischlar, Karen L.; Peterson, N. Andrew

2017-01-01

Measures that accurately capture the phenomenon are critical to research and practice in group work. The vast majority of group-related measures were developed using the reflective measurement model rooted in classical test theory (CTT). Depending on the construct definition and the measure's purpose, the reflective model may not always be the…
Analysis of the differences between the accounting and tax treatment for items of property, plant and equipment: The Peruvian case

Directory of Open Access Journals (Sweden)

Oscar Alfredo Díaz Becerra

2012-12-01

This research analyzes differences in the measurement and recognition of items of property, plant and equipment. It focuses on the differences between the treatment established in the accounting standards and that established in the tax regulations on Corporate Income Tax, for the Peruvian case. The work reviews the related accounting standards and the standards established in the Peruvian Income Tax Law and its regulations. On that basis, we identify the main differences arising from the application of both sets of standards to items of property, plant and equipment. Finally, we present the main conclusions drawn from this research.
Effect of Processing on Postprandial Glycemic Response and Consumer Acceptability of Lentil-Containing Food Items.

Science.gov (United States)

Ramdath, D Dan; Wolever, Thomas M S; Siow, Yaw Chris; Ryland, Donna; Hawke, Aileen; Taylor, Carla; Zahradka, Peter; Aliani, Michel

2018-05-11

The consumption of pulses is associated with many health benefits. This study assessed post-prandial blood glucose response (PPBG) and the acceptability of food items containing green lentils. In human trials we: (i) defined processing methods (boiling, pureeing, freezing, roasting, spray-drying) that preserve the PPBG-lowering feature of lentils; (ii) used an appropriate processing method to prepare lentil food items, and compared the PPBG and relative glycemic responses (RGR) of lentil and control foods; and (iii) conducted consumer acceptability of the lentil foods. Eight food items were formulated from either whole lentil puree (test) or instant potato (control). In separate PPBG studies, participants consumed fixed amounts of available carbohydrates from test foods, control foods, or a white bread standard. Finger prick blood samples were obtained at 0, 15, 30, 45, 60, 90, and 120 min after the first bite, analyzed for glucose, and used to calculate incremental area under the blood glucose response curve and RGR; glycemic index (GI) was measured only for processed lentils. Mean GI (± standard error of the mean) of processed lentils ranged from 25 ± 3 (boiled) to 66 ± 6 (spray-dried); the GI of spray-dried lentils was significantly ( p roasted lentil. Overall, lentil-based food items all elicited significantly lower RGR compared to potato-based items (40 ± 3 vs. 73 ± 3%; p chicken, chicken pot pie, and lemony parsley soup had the highest overall acceptability corresponding to "like slightly" to "like moderately". Processing influenced the PPBG of lentils, but food items formulated from lentil puree significantly attenuated PPBG. Formulation was associated with significant differences in sensory attributes.
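The incremental area under the blood glucose response curve mentioned in this abstract can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: it assumes the first sample is the fasting baseline, clips readings below baseline to zero, and applies the trapezoidal rule. Expressing the relative response as a percentage of the reference food's area is likewise an assumption, and the function names and glucose values are hypothetical.

```python
from typing import Sequence


def incremental_auc(times_min: Sequence[float],
                    glucose: Sequence[float]) -> float:
    """Trapezoidal area of glucose increments above the first (baseline) value.

    Simplification (an assumption): values below baseline are clipped to
    zero before integrating, which only approximates methods that handle
    baseline crossings geometrically.
    """
    baseline = glucose[0]
    increments = [max(g - baseline, 0.0) for g in glucose]
    area = 0.0
    for k in range(1, len(times_min)):
        dt = times_min[k] - times_min[k - 1]
        area += 0.5 * (increments[k] + increments[k - 1]) * dt
    return area


def relative_response(test_auc: float, reference_auc: float) -> float:
    """Test-food area as a percentage of the reference-food area (assumed definition)."""
    return 100.0 * test_auc / reference_auc


# Sampling times from the abstract; glucose values (mmol/L) are invented.
times = [0, 15, 30, 45, 60, 90, 120]
test_food = [5.0, 6.0, 7.0, 6.0, 5.0, 5.0, 5.0]
reference = [5.0, 7.0, 9.0, 7.0, 5.0, 5.0, 5.0]

print(incremental_auc(times, test_food))
print(relative_response(incremental_auc(times, test_food),
                        incremental_auc(times, reference)))
```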
Selection of material balance areas and item control areas

International Nuclear Information System (INIS)

1975-04-01

Section 70.58, ''Fundamental Nuclear Material Controls,'' of 10 CFR Part 70, ''Special Nuclear Material,'' requires certain licensees authorized to possess more than one effective kilogram of special nuclear material to establish Material Balance Areas (MBAs) or Item Control Areas (ICAs) for the physical and administrative control of nuclear materials. This section requires that: (1) each MBA be an identifiable physical area such that the quantity of nuclear material being moved into or out of the MBA is represented by a measured value; (2) the number of MBAs be sufficient to localize nuclear material losses or thefts and identify the mechanisms; (3) the custody of all nuclear material within an MBA or ICA be the responsibility of a single designated individual; and (4) ICAs be established according to the same criteria as MBAs except that control into and out of such areas would be by item identity and count for previously determined special nuclear material quantities, the validity of which must be ensured by tamper-safing unless the items are sealed sources. This guide describes bases acceptable to the NRC staff for the selection of material balance areas and item control areas. (U.S.)
Investigation of initial contamination for disposal medical infusion items and determination of sterilization dose

International Nuclear Information System (INIS)

Hu Jinhui; Xu Ziyan; Sun Naifeng; Yan Aoshuang; Gao Wei; Wang Binglin

1993-01-01

Statistical analyses of the initial contamination of 624 disposable medical infusion items are made. The normal distribution of the initial contamination, the relation between the initial contamination of the inner and outer walls of disposable medical infusion items, and the changes of initial contamination before irradiation are shown. The sterilization dose for disposable infusion items is determined as 17.2 kGy using bioburden information. The SAL (sterility assurance level) is 10^-6. The SIP (device sample item proportion) is 1 and the average initial contamination is 7 CFU/item.
Optimizing the Use of Response Times for Item Selection in Computerized Adaptive Testing

Science.gov (United States)

Choe, Edison M.; Kern, Justin L.; Chang, Hua-Hua

2018-01-01

Despite common operationalization, measurement efficiency of computerized adaptive testing should not only be assessed in terms of the number of items administered but also the time it takes to complete the test. To this end, a recent study introduced a novel item selection criterion that maximizes Fisher information per unit of expected response…
Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Science.gov (United States)

Lee, Yi-Hsuan; Zhang, Jinming

2017-01-01

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Application of item response theory to achieve cross-cultural comparability of occupational stress measurement

NARCIS (Netherlands)

Tsutsumi, A.; Iwata, N.; Watanabe, N.; Jonge, de J.; Pikhart, H.; Férnandez-López, J.A.; Xu, Liying; Peter, R.; Knutsson, A.; Niedhammer, I.; Kawakami, N.; Siegrist, J.

2009-01-01

Our objective was to examine cross-cultural comparability of standard scales of the Effort-Reward Imbalance occupational stress scales by item response theory (IRT) analyses. Data were from 20,256 Japanese employees, 1464 Dutch nurses and nurses' aides, 2128 representative employees from
Summarizing activity limitations in children with chronic illnesses living in the community: a measurement study of scales using supplemented interRAI items

Directory of Open Access Journals (Sweden)

Phillips Charles D

2012-01-01

Background: To test the validity and reliability of scales intended to measure activity limitations faced by children with chronic illnesses living in the community. The scales were based on information provided by caregivers to service program personnel almost exclusively trained as social workers. The items used to measure activity limitations were interRAI items supplemented so that they were more applicable to activity limitations in children with chronic illnesses. In addition, these analyses may shed light on the possibility of gathering functional information that can span the life course as well as spanning different care settings. Methods: Analyses included testing the internal consistency, predictive, concurrent, discriminant and construct validity of two activity limitation scales. The scales were developed using assessment data gathered in the United States of America (USA) from over 2,700 assessments of children aged 4 to 20 receiving Medicaid Early and Periodic Screening, Diagnostic and Treatment (EPSDT) services, specifically Personal Care Services to assist children in overcoming activity limitations. The Medicaid program in the USA pays for health care services provided to children in low-income households. Data were collected in a single, large state in the southwestern USA in late 2008 and early 2009. A similar sample of children was assessed in 2010, and the analyses were replicated using this sample. Results: The two scales exhibited excellent internal consistency. Evidence on the concurrent, predictive, discriminant, and construct validity of the proposed scales was strong. Quite importantly, scale scores were not correlated with (confounded with) a child's developmental stage or age. The results for these scales and items were consistent across the two independent samples. Conclusions: Unpaid caregivers, usually parents, can provide assessors lacking either medical or nursing training with reliable and valid information
Summarizing activity limitations in children with chronic illnesses living in the community: a measurement study of scales using supplemented interRAI items.

Science.gov (United States)

Phillips, Charles D; Patnaik, Ashweeta; Moudouni, Darcy K; Naiser, Emily; Dyer, James A; Hawes, Catherine; Fournier, Constance J; Miller, Thomas R; Elliott, Timothy R

2012-01-23

To test the validity and reliability of scales intended to measure activity limitations faced by children with chronic illnesses living in the community. The scales were based on information provided by caregivers to service program personnel almost exclusively trained as social workers. The items used to measure activity limitations were interRAI items supplemented so that they were more applicable to activity limitations in children with chronic illnesses. In addition, these analyses may shed light on the possibility of gathering functional information that can span the life course as well as spanning different care settings. Analyses included testing the internal consistency, predictive, concurrent, discriminant and construct validity of two activity limitation scales. The scales were developed using assessment data gathered in the United States of America (USA) from over 2,700 assessments of children aged 4 to 20 receiving Medicaid Early and Periodic Screening, Diagnostic and Treatment (EPSDT) services, specifically Personal Care Services to assist children in overcoming activity limitations. The Medicaid program in the USA pays for health care services provided to children in low-income households. Data were collected in a single, large state in the southwestern USA in late 2008 and early 2009. A similar sample of children was assessed in 2010, and the analyses were replicated using this sample. The two scales exhibited excellent internal consistency. Evidence on the concurrent, predictive, discriminant, and construct validity of the proposed scales was strong. Quite importantly, scale scores were not correlated with (confounded with) a child's developmental stage or age. The results for these scales and items were consistent across the two independent samples. Unpaid caregivers, usually parents, can provide assessors lacking either medical or nursing training with reliable and valid information on the activity limitations of children. One can summarize these
Concreteness effects in short-term memory: a test of the item-order hypothesis.

Science.gov (United States)

Roche, Jaclynn; Tolan, G Anne; Tehan, Gerald

2011-12-01

The following experiments explore word length and concreteness effects in short-term memory within an item-order processing framework. This framework asserts order memory is better for those items that are relatively easy to process at the item level. However, words that are difficult to process benefit at the item level for increased attention/resources being applied. The prediction of the model is that differential item and order processing can be detected in episodic tasks that differ in the degree to which item or order memory are required by the task. The item-order account has been applied to the word length effect such that there is a short word advantage in serial recall but a long word advantage in item recognition. The current experiment considered the possibility that concreteness effects might be explained within the same framework. In two experiments, word length (Experiment 1) and concreteness (Experiment 2) are examined using forward serial recall, backward serial recall, and item recognition. These results for word length replicate previous studies showing the dissociation in item and order tasks. The same was not true for the concreteness effect. In all three tasks concrete words were better remembered than abstract words. The concreteness effect cannot be explained in terms of an item-order trade off. PsycINFO Database Record (c) 2011 APA, all rights reserved.
SF-36 total score as a single measure of health-related quality of life: Scoping review.

Science.gov (United States)

Lins, Liliane; Carvalho, Fernando Martins

2016-01-01

According to the 36-Item Short Form Health Survey questionnaire developers, a global measure of health-related quality of life such as the "SF-36 Total/Global/Overall Score" cannot be generated from the questionnaire. However, studies keep on reporting such a measure. This study aimed to evaluate the frequency and to describe some characteristics of articles reporting the SF-36 Total/Global/Overall Score in the scientific literature. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses method was adapted to a scoping review. We performed searches in PubMed, Web of Science, SCOPUS, BVS, and Cochrane Library databases for articles using such scores. We found 172 articles published between 1997 and 2015; 110 (64.0%) of them were published from 2010 onwards; 30.0% appeared in journals with Impact Factor 3.00 or greater. Overall, 129 (75.0%) out of the 172 studies did not specify the method for calculating the "SF-36 Total Score"; 13 studies did not specify their methods but referred to the SF-36 developers' studies or others; and 30 articles used different strategies for calculating such a score, the most frequent being arithmetic averaging of the eight SF-36 domain scores. We concluded that the "SF-36 Total/Global/Overall Score" has been increasingly reported in the scientific literature. Researchers should be aware of this procedure and of its possible impacts upon human health.
Safety classification of items in Tianwan Nuclear Power Plant

International Nuclear Information System (INIS)

Sun Yongbin

2005-01-01

The principles of integrality, moderation and equilibrium should be considered in the safety classification of items in a nuclear power plant. The basic way to classify items for safety is to classify the safety function based on the effect on safety of damage to the outside enclosure of the items (parts). Tianwan Nuclear Power Plant adopts the Russian VVER-1000/428 type reactor, and its safety classification mainly refers to Russian guidelines and standards. The safety classification of the electric equipment refers to the IEEE-308(80) standard, including 1E and Non-1E classification. The safety classification of the instrumentation and control equipment refers to the GB/T 15474-1995 standard, including safety 1E, safety-related SR and NC non-safety classification. The safety classification of Tianwan Nuclear Power Plant has to be approved by NNSA and satisfy Chinese Nuclear Safety Guidelines. (authors)
Inter-item associations for the Brazilian version of the Deese/Roediger-McDermott paradigm

Directory of Open Access Journals (Sweden)

Luciano Grüdtner Buratto

2013-01-01

The emotional content of words can affect both true and false memory performance. One hypothesis suggests that the effects of emotion on memory stem from the semantic cohesion of these words. Emotional words are better remembered because they are more inter-related than neutral words (semantic cohesion hypothesis). Although support for this assumption has been found in tasks that measure true memory, less is known about how the structure of lexical knowledge affects emotional false memories. This is partially due to the scarcity of norms that capture the pre-existing knowledge structure of verbal materials commonly used to investigate emotional false memories, such as the Deese/Roediger-McDermott word lists. In this study, we present inter-item association norms for the 44 lists of the Brazilian version of the DRM paradigm. Free-association responses were collected from a sample of 1,042 undergraduates and were used to estimate the level of connectivity among the words present in the DRM lists. Connectivity measures were then used to test the semantic cohesion hypothesis. No significant correlations were found between the emotional measures (valence and arousal) and the connectivity measures. The results do not give support to the semantic cohesion hypothesis and suggest that, for the Brazilian version of DRM lists, inter-item association and emotionality can be independently manipulated.
Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy.

Directory of Open Access Journals (Sweden)

Martine H P Crins

The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient and their use in practice is considered feasible with little administration time, offering standardized and routine patient monitoring. Before an item bank can be used as a CAT, the psychometric properties of the item bank have to be examined. Therefore, the objective was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy. Cross-sectional study. 805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items). Unidimensionality was examined by Confirmatory Factor Analysis, and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed. The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, first factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameters range: -4.28 to 2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI. The psychometric properties of the DF-PROMIS-PF item bank are sufficient. The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of
Two items remembered as precisely as one: how integral features can improve visual working memory.

Science.gov (United States)

Bae, Gi Yeul; Flombaum, Jonathan I

2013-10-01

In the ongoing debate about the efficacy of visual working memory for more than three items, a consensus has emerged that memory precision declines as memory load increases from one to three. Many studies have reported that memory precision seems to be worse for two items than for one. We argue that memory for two items appears less precise than that for one only because two items present observers with a correspondence challenge that does not arise when only one item is stored--the need to relate observations to their corresponding memory representations. In three experiments, we prevented correspondence errors in two-item trials by varying sample items along task-irrelevant but integral (as opposed to separable) dimensions. (Initial experiments with a classic sorting paradigm identified integral feature relationships.) In three memory experiments, our manipulation produced equally precise representations of two items and of one item.
Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

Science.gov (United States)

Arce-Ferrer, Alvaro J.; Bulut, Okan

2017-01-01

This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…
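As a rough illustration of the two ingredients named in this abstract, the sketch below shows a three-parameter logistic (3PL) item response function and a linear scale transformation estimated from common items. The mean/sigma method used here is only one well-known way to obtain the transformation coefficients and is an assumption of this sketch, not necessarily the procedure the study used. The function names (`p_3pl`, `mean_sigma`, `rescale_item`) are hypothetical, and the logistic metric without the 1.7 scaling constant is assumed.

```python
import math
from statistics import mean, pstdev


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    Assumes the logistic metric (no 1.7 scaling constant).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def mean_sigma(b_old, b_new):
    """Mean/sigma coefficients (A, B) that map the new-form scale onto
    the old-form scale, using difficulty estimates of the common items.

    b_old and b_new hold the estimates of the same items, in the same order.
    """
    A = pstdev(b_old) / pstdev(b_new)
    B = mean(b_old) - A * mean(b_new)
    return A, B


def rescale_item(a, b, c, A, B):
    """Express new-form item parameters on the old-form scale."""
    return a / A, A * b + B, c


# Hypothetical common-item difficulty estimates from two calibrations.
A, B = mean_sigma(b_old=[-1.5, 0.5, 2.5], b_new=[-1.0, 0.0, 1.0])
a2, b2, c2 = rescale_item(1.2, 0.3, 0.2, A, B)
```

Because ability transforms as A·θ + B, the response probability of a rescaled item is unchanged, which is what makes the two calibrations comparable. A common item whose parameters still differ markedly after the transformation would be a candidate for item parameter drift.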
MMPI-2 Item Endorsements in Dissociative Identity Disorder vs. Simulators.

Science.gov (United States)

Brand, Bethany L; Chasson, Gregory S; Palermo, Cori A; Donato, Frank M; Rhodes, Kyle P; Voorhees, Emily F

2016-03-01

Elevated scores on some MMPI-2 (Minnesota Multiphasic Personality Inventory-2) validity scales are common among patients with dissociative identity disorder (DID), which raises questions about the validity of their responses. Such patients show elevated scores on atypical answers (F), F-psychopathology (Fp), atypical answers in the second half of the test (FB), schizophrenia (Sc), and depression (D) scales, with Fp showing the greatest utility in distinguishing them from coached and uncoached DID simulators. In the current study, we investigated the items on the MMPI-2 F, Fp, FB, Sc, and D scales that were most and least commonly endorsed by participants with DID in our 2014 study and compared these responses with those of coached and uncoached DID simulators. The comparisons revealed that patients with DID most frequently endorsed items related to dissociation, trauma, depression, fearfulness, conflict within family, and self-destructiveness. The coached group more successfully imitated item endorsements of the DID group than did the uncoached group. However, both simulating groups, especially the uncoached group, frequently endorsed items that were uncommonly endorsed by the DID group. The uncoached group endorsed items consistent with popular media portrayals of people with DID being violent, delusional, and unlawful. These results suggest that item endorsement patterns can provide useful information to clinicians making determinations about whether an individual is presenting with DID or feigning. © 2016 American Academy of Psychiatry and the Law.
The Effects of Goal Relevance and Perceptual Features on Emotional Items and Associative Memory.

Science.gov (United States)

Mao, Wei B; An, Shu; Yang, Xiao F

2017-01-01

Showing an emotional item in a neutral background scene often leads to enhanced memory for the emotional item and impaired associative memory for background details. Meanwhile, both top-down goal relevance and bottom-up perceptual features play important roles in memory binding. We conducted two experiments to further examine the effects of goal relevance and perceptual features on emotional items and associative memory. We manipulated goal relevance (asking participants to categorize either each item image alone as living or non-living, or each whole composite picture, consisting of an item image and a background scene, as a natural or manufactured scene) and perceptual features (controlling visual contrast and visual familiarity). We found that both high goal relevance and salient perceptual features (high salience of items vs. high familiarity of items) could promote emotional item memory, but they had different effects on associative memory for emotional items and neutral backgrounds. Specifically, high goal relevance and high perceptual salience of items could jointly impair the associative memory for emotional items and neutral backgrounds, while the effect of item familiarity on associative memory for emotional items was modulated by goal relevance. High familiarity of items could increase associative memory for negative items and neutral backgrounds only in the low goal relevance condition. These findings suggest that the effect of emotion on associative memory is not only related to attentional capture elicited by emotion, but can also be affected by goal relevance and the perceptual features of the stimulus.
Development of Rasch-based item banks for the assessment of work performance in patients with musculoskeletal diseases.

Science.gov (United States)

Mueller, Evelyn A; Bengel, Juergen; Wirtz, Markus A

2013-12-01

This study aimed to develop a self-description assessment instrument to measure work performance in patients with musculoskeletal diseases. In terms of the International Classification of Functioning, Disability and Health (ICF), work performance is defined as the degree of meeting the work demands (activities) at the actual workplace (environment). To account for the fact that work performance depends on the work demands of the job, we strived to develop item banks that allow a flexible use of item subgroups depending on the specific work demands of the patients' jobs. Item development included the collection of work tasks from literature and content validation through expert surveys and patient interviews. The resulting 122 items were answered by 621 patients with musculoskeletal diseases. Exploratory factor analysis to ascertain dimensionality and Rasch analysis (partial credit model) for each of the resulting dimensions were performed. Exploratory factor analysis resulted in four dimensions, and subsequent Rasch analysis led to the following item banks: 'impaired productivity' (15 items), 'impaired cognitive performance' (18), 'impaired coping with stress' (13) and 'impaired physical performance' (low physical workload 20 items, high physical workload 10 items). The item banks exhibited person separation indices (reliability) between 0.89 and 0.96. The assessment of work performance adds the activities component to the more commonly employed participation component of the ICF-model. The four item banks can be adapted to specific jobs where necessary without losing comparability of person measures, as the item banks are based on Rasch analysis.
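For readers unfamiliar with the partial credit model mentioned in this abstract, the following minimal sketch computes the response-category probabilities for one polytomous item given a person measure and the item's step difficulties. The function name `pcm_probs` and the example values are hypothetical and are not taken from the study.

```python
import math


def pcm_probs(theta, deltas):
    """Category probabilities for one item under the partial credit model.

    theta  = person measure (logits)
    deltas = step difficulties delta_1..delta_m, giving categories 0..m
    Returns a list of m + 1 probabilities that sum to 1.
    """
    # Cumulative sums of (theta - delta_j); the sum for category 0 is 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (theta - d))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cum)
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical item with three steps (four response categories).
probs = pcm_probs(theta=0.5, deltas=[-1.0, 0.0, 1.2])
```

With a single step difficulty the model reduces to the dichotomous Rasch model. Because each item's probabilities depend on the person only through theta, person measures remain comparable across different item subsets, which is consistent with the abstract's claim about adapting the banks to specific jobs.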
