WorldWideScience

Sample records for showed differential item

  1. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.
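
    As a rough illustration of the model class this abstract describes, the Rasch model and one tree split on a single item can be written as below. The notation (the split covariate x_j, the threshold c, and the left/right difficulties) is assumed here for illustration and is not taken from the paper.

```latex
% Rasch model: person p, item i, ability \theta_p, item difficulty \beta_i
P(Y_{pi} = 1) = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}

% One split of item i on covariate x_j at threshold c (assumed notation):
% the item gets a separate difficulty on each side of the split.
\operatorname{logit} P(Y_{pi} = 1 \mid x_p)
  = \theta_p - \bigl[ \beta_i^{(\ell)} \, \mathbb{1}(x_{pj} \le c)
                    + \beta_i^{(r)} \, \mathbb{1}(x_{pj} > c) \bigr]
```

    An item without DIF keeps a single difficulty for all subjects; repeated splits on the same item yield the small per-item tree mentioned in the abstract.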

  2. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  3. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading […] .3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  4. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R² change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test […] German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  5. Verification of Differential Item Functioning (DIF) Status of West ...

    African Journals Online (AJOL)

    This study investigated test item bias and Differential Item Functioning (DIF) of West African ... items in chemistry function differentially with respect to gender and location. In Aba education zone of Abia, 50 secondary schools were purposively ...

  6. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  7. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  8. 17 CFR 260.7a-16 - Inclusion of items, differentiation between items and answers, omission of instructions.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 3 2010-04-01 2010-04-01 false Inclusion of items, differentiation between items and answers, omission of instructions. 260.7a-16 Section 260.7a-16 Commodity and... INDENTURE ACT OF 1939 Formal Requirements § 260.7a-16 Inclusion of items, differentiation between items and...

  9. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect

    DEFF Research Database (Denmark)

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-01-01

    AIMS: To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). METHODS: We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a ...

  10. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  11. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    Science.gov (United States)

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  12. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  13. Effect of Differential Item Functioning on Test Equating

    Science.gov (United States)

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  14. Differential item functioning magnitude and impact measures from item response theory models.

    Science.gov (United States)

    Kleinman, Marjorie; Teresi, Jeanne A

    2016-01-01

    Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages are described that provide magnitude and impact measures, and new software is presented that computes all of the available statistics conveniently in one program with explanations of their relationships to one another.

  15. Does Gender-Specific Differential Item Functioning Affect the Structure in Vocational Interest Inventories?

    Science.gov (United States)

    Beinicke, Andrea; Pässler, Katja; Hell, Benedikt

    2014-01-01

    The study investigates consequences of eliminating items showing gender-specific differential item functioning (DIF) on the psychometric structure of a standard RIASEC interest inventory. Holland's hexagonal model was tested for structural invariance using a confirmatory methodological approach (confirmatory factor analysis and randomization…

  16. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning

    DEFF Research Database (Denmark)

    Watt, Torquil; Grønvold, Mogens; Hegedüs, Laszlo

    2014-01-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis.

  17. Differential item functioning of the UWES-17 in South Africa

    Directory of Open Access Journals (Sweden)

    Leanne Goliath-Yarde

    2011-11-01

    Research purpose: This study assesses the Differential Item Functioning (DIF) of the Utrecht Work Engagement Scale (UWES-17) for different South African cultural groups in a South African company. Motivation for the study: Organisations are using the UWES-17 more and more in South Africa to assess work engagement. Therefore, research evidence from psychologists or assessment practitioners on its DIF across different cultural groups is necessary. Research design, approach and method: The researchers conducted a Secondary Data Analysis (SDA) on the UWES-17 sample (n = 2429) that they obtained from a cross-sectional survey undertaken in a South African Information and Communication Technology (ICT) sector company (n = 24 134). Quantitative item data on the UWES-17 scale enabled the authors to address the research question. Main findings: The researchers found uniform and/or non-uniform DIF on five of the vigour items, four of the dedication items and two of the absorption items. This also showed possible Differential Test Functioning (DTF) on the vigour and dedication dimensions. Practical/managerial implications: Based on the DIF, the researchers suggested that organisations should not use the UWES-17 comparatively for different cultural groups or employment decisions in South Africa. Contribution/value add: The study provides evidence on DIF and possible DTF for the UWES-17. However, it also raises questions about possible interaction effects that need further investigation.

  18. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    Four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implied a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. Graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30 % of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic […] items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p […] Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  19. Differential Weighting of Items to Improve University Admission Test Validity

    Directory of Open Access Journals (Sweden)

    Eduardo Backhoff Escudero

    2001-05-01

    This paper gives an evaluation of different ways to increase university admission test criterion-related validity, by differentially weighting test items. We compared four methods of weighting multiple-choice items of the Basic Skills and Knowledge Examination (EXHCOBA): (1) punishing incorrect responses by a constant factor, (2) weighting incorrect responses, considering the levels of error, (3) weighting correct responses, considering the item’s difficulty, based on the Classic Measurement Theory, and (4) weighting correct responses, considering the item’s difficulty, based on the Item Response Theory. Results show that none of these methods increased the instrument’s predictive validity, although they did improve its concurrent validity. It was concluded that it is appropriate to score the test by simply adding up correct responses.
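
    The first weighting method (a constant penalty for each incorrect response) can be contrasted with plain number-right scoring in a few lines. This is only a sketch: the function names and the penalty value of 0.25 are assumptions chosen for illustration, and the abstract does not state which factor the authors used.

```python
def number_right(responses, key):
    """Count the responses that match the answer key."""
    return sum(1 for r, k in zip(responses, key) if r == k)


def penalized_score(responses, key, penalty=0.25):
    """Number-right score minus a constant penalty per incorrect response.

    Omitted items (None) are neither rewarded nor penalized.
    The default penalty is an illustrative assumption.
    """
    right = number_right(responses, key)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - penalty * wrong


key = ["A", "B", "C", "D"]
responses = ["A", "C", None, "D"]  # two right, one wrong, one omitted
print(number_right(responses, key))
print(penalized_score(responses, key))
```

    The abstract's conclusion corresponds to using number_right alone.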

  20. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  1. Why Consumers Misattribute Sponsorships to Non-Sponsor Brands: Differential Roles of Item and Relational Communications.

    Science.gov (United States)

    Weeks, Clinton S; Humphreys, Michael S; Cornwell, T Bettina

    2018-02-01

    Brands engaged in sponsorship of events commonly have objectives that depend on consumer memory for the sponsor-event relationship (e.g., sponsorship awareness). Consumers, however, often misattribute sponsorships to nonsponsor competitor brands, indicating erroneous memory for these relationships. The current research uses an item and relational memory framework to reveal sponsor brands may inadvertently foster this misattribution when they communicate relational linkages to events. Effects can be explained via differential roles of communicating item information (information that supports processing item distinctiveness) versus relational information (information that supports processing relationships among items) in contributing to memory outcomes. Experiment 1 uses event-cued brand recall to show that correct memory retrieval is best supported by communicating relational information when sponsorship relationships are not obvious (low congruence). In contrast, correct retrieval is best supported by communicating item information when relationships are obvious (high congruence). Experiment 2 uses brand-cued event recall to show that, against conventional marketing recommendations, relational information increases misattribution, whereas item information guards against misattribution. Results suggest sponsor brands must distinguish between item and relational communications to enhance correct retrieval and limit misattribution. Methodologically, the work shows that choice of cueing direction is critical in differentially revealing patterns of correct and incorrect retrieval with pair relationships. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  2. A scale purification procedure for evaluation of differential item functioning

    NARCIS (Netherlands)

    Khalid, Muhammad Naveed; Glas, Cornelis A.W.

    2014-01-01

    Item bias or differential item functioning (DIF) has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test

  3. A more general model for testing measurement invariance and differential item functioning.

    Science.gov (United States)

    Bauer, Daniel J

    2017-09-01

    The evaluation of measurement invariance is an important step in establishing the validity and comparability of measurements across individuals. Most commonly, measurement invariance has been examined using one of two primary latent variable modeling approaches: the multiple groups model or the multiple-indicator multiple-cause (MIMIC) model. Both approaches offer opportunities to detect differential item functioning within multi-item scales, and thereby to test measurement invariance, but both approaches also have significant limitations. The multiple groups model allows one to examine the invariance of all model parameters but only across levels of a single categorical individual difference variable (e.g., ethnicity). In contrast, the MIMIC model permits both categorical and continuous individual difference variables (e.g., sex and age) but permits only a subset of the model parameters to vary as a function of these characteristics. The current article argues that moderated nonlinear factor analysis (MNLFA) constitutes an alternative, more flexible model for evaluating measurement invariance and differential item functioning. We show that the MNLFA subsumes and combines the strengths of the multiple group and MIMIC models, allowing for a full and simultaneous assessment of measurement invariance and differential item functioning across multiple categorical and/or continuous individual difference variables. The relationships between the MNLFA model and the multiple groups and MIMIC models are shown mathematically and via an empirical demonstration. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  4. Detecting Differential Item Functioning and Differential Step Functioning due to Differences that "Should" Matter

    Directory of Open Access Journals (Sweden)

    Tess Miller

    2010-07-01

    This study illustrates the use of differential item functioning (DIF) and differential step functioning (DSF) analyses to detect differences in item difficulty that are related to experiences of examinees, such as their teachers' instructional practices, that are relevant to the knowledge, skill, or ability the test is intended to measure. This analysis is in contrast to the typical use of DIF or DSF to detect differences related to characteristics of examinees, such as gender, language, or cultural knowledge, that should be irrelevant. Using data from two forms of Ontario's Grade 9 Assessment of Mathematics, analyses were performed comparing groups of students defined by their teachers' instructional practices. All constructed-response items were tested for DIF using the Mantel Chi-Square, standardized Liu-Agresti cumulative common log-odds ratio, and standardized Cox's noncentrality parameter. Items exhibiting moderate to large DIF were subsequently tested for DSF. In contrast to typical DIF or DSF analyses, which inform item development, these analyses have the potential to inform instructional practice.

  5. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1996-01-01

    In this paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or C. R. Rao's efficient score test. The test is presented in the framework of a number of item response theory (IRT) models such as the Rasch model, the one-parameter logistic model, the

  6. Detection of Uniform and Nonuniform Differential Item Functioning by Item-Focused Trees

    Science.gov (United States)

    Berger, Moritz; Tutz, Gerhard

    2016-01-01

    Detection of differential item functioning (DIF) by use of the logistic modeling approach has a long tradition. One big advantage of the approach is that it can be used to investigate nonuniform (NUDIF) as well as uniform DIF (UDIF). The classical approach allows one to detect DIF by distinguishing between multiple groups. We propose an…
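
    For orientation, the classical two-group logistic regression approach that this abstract builds on is commonly written as follows. The symbols are assumed here for illustration: S_p is the matching score (for example the total test score), g_p a 0/1 group indicator, and the beta coefficients are item-specific.

```latex
\operatorname{logit} P(Y_{pi} = 1 \mid S_p, g_p)
  = \beta_{0i} + \beta_{1i} S_p + \beta_{2i} g_p + \beta_{3i} S_p g_p

% Uniform DIF (UDIF):     \beta_{2i} \neq 0 and \beta_{3i} = 0
%   (the group effect is constant across the score range)
% Nonuniform DIF (NUDIF): \beta_{3i} \neq 0
%   (the group effect changes with the score)
```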

  7. Use of multilevel logistic regression to identify the causes of differential item functioning.

    Science.gov (United States)

    Balluerka, Nekane; Gorostiaga, Arantxa; Gómez-Benito, Juana; Hidalgo, María Dolores

    2010-11-01

    Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.

  8. Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

    Science.gov (United States)

    LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

    2015-04-01

    Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
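
    The Mantel-Haenszel idea used in this record can be sketched for the simplest case of a dichotomously scored item. The sketch below assumes respondents have already been stratified by a matching score; the function names are hypothetical, and the category labels follow only the magnitude part of the widely used ETS A/B/C convention (the full rule also involves significance tests). It is not the authors' procedure, which classified items on a multi-category questionnaire.

```python
import math


def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio and ETS delta for one item.

    strata: iterable of (a, b, c, d) counts, one tuple per matched score level:
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined for these counts")
    alpha = num / den                # > 1 means the item favors the reference group
    delta = -2.35 * math.log(alpha)  # ETS delta metric
    return alpha, delta


def ets_category(delta):
    """Magnitude-only labels: A (negligible), B (moderate), C (large)."""
    size = abs(delta)
    if size < 1.0:
        return "A"
    if size < 1.5:
        return "B"
    return "C"


alpha, delta = mantel_haenszel_dif([(40, 10, 30, 20), (30, 20, 20, 30)])
print(round(alpha, 3), round(delta, 3), ets_category(delta))
```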

  9. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    Science.gov (United States)

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.

  10. Identifying Country-Specific Cultures of Physics Education: A differential item functioning approach

    Science.gov (United States)

    Mesic, Vanes

    2012-11-01

    In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide specific diagnostic information which is necessary for systematic comparisons and improvements of educational systems. Useful information could be obtained by exploring the differences in national profiles of student achievement between low-achieving and high-achieving countries. In this study, we aimed to identify the relative weaknesses and strengths of eighth graders' physics achievement in Bosnia and Herzegovina in comparison to the achievement of their peers from Slovenia. For this purpose, we ran a secondary analysis of Trends in International Mathematics and Science Study (TIMSS) 2007 data. The student sample consisted of 4,220 students from Bosnia and Herzegovina and 4,043 students from Slovenia. After analysing the cognitive demands of TIMSS 2007 physics items, the corresponding differential item functioning (DIF)/differential group functioning contrasts were estimated. Approximately 40% of items exhibited large DIF contrasts, indicating significant differences between cultures of physics education in Bosnia and Herzegovina and Slovenia. The relative strength of students from Bosnia and Herzegovina was found to be mainly associated with the topic area 'Electricity and magnetism'. Classes of items which required the knowledge of experimental method, counterintuitive thinking, proportional reasoning and/or the use of complex knowledge structures proved to be differentially easier for students from Slovenia. In the light of the presented results, the common practice of ranking countries with respect to universally established cognitive categories seems to be potentially misleading.

  11. Mixture Item Response Theory-MIMIC Model: Simultaneous Estimation of Differential Item Functioning for Manifest Groups and Latent Classes

    Science.gov (United States)

    Bilir, Mustafa Kuzey

    2009-01-01

    This study uses a new psychometric model (mixture item response theory-MIMIC model) that simultaneously estimates differential item functioning (DIF) across manifest groups and latent classes. Current DIF detection methods investigate DIF from only one side, either across manifest groups (e.g., gender, ethnicity, etc.), or across latent classes…

  12. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

    Directory of Open Access Journals (Sweden)

    Knol Dirk L

    2011-09-01

    Background: For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods: Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results: All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions: The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.

  13. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  14. Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

    Science.gov (United States)

    Suh, Youngsuk

    2016-01-01

    This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

  15. Exploring differential item functioning (DIF) with the Rasch model: A comparison of gender differences on eighth-grade science items in the United States and Spain

    Science.gov (United States)

    Calvert, Tasha

    Despite the attention that has been given to gender and science, boys continue to outperform girls in science achievement, particularly by the end of secondary school. Because it is unclear whether gender differences have narrowed over time (Leder, 1992; Willingham & Cole, 1997), it is important to continue a line of inquiry into the nature of gender differences, specifically at the international level. The purpose of this study was to investigate gender differences in science achievement across two countries: United States and Spain. A secondary purpose was to demonstrate an alternative method for exploring gender differences based on the many-faceted Rasch model (1980). A secondary analysis of the data from the Third International Mathematics and Science Study (TIMSS) was used to examine the relationship between gender DIF (differential item functioning) and item characteristics (item type, content, and performance expectation) across both countries. Nationally representative samples of eighth grade students in the United States and Spain who participated in TIMSS were analyzed to answer the research questions in this study. In both countries, girls showed an advantage over boys on life science items and most extended response items, whereas boys, by and large, had an advantage on earth science, physics, and chemistry items. However, even within areas that favored boys, such as physics, there were items that were differentially easier for girls. In general, patterns in gender differences were similar across both countries although there were a few differences between the countries on individual items. It was concluded that simply looking at mean differences does not provide an adequate understanding of the nature of gender differences in science achievement.

  16. Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

    Science.gov (United States)

    Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

    2017-06-01

    About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluate dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluate differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.

  17. Testing for Nonuniform Differential Item Functioning with Multiple Indicator Multiple Cause Models

    Science.gov (United States)

    Woods, Carol M.; Grimm, Kevin J.

    2011-01-01

    In extant literature, multiple indicator multiple cause (MIMIC) models have been presented for identifying items that display uniform differential item functioning (DIF) only, not nonuniform DIF. This article addresses, for apparently the first time, the use of MIMIC models for testing both uniform and nonuniform DIF with categorical indicators. A…

  18. Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

    Science.gov (United States)

    Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

    2015-07-01

    The Geriatric Anxiety Scale (GAS; Segal et al. (Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002)) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.

  19. Statistical and extra-statistical considerations in differential item functioning analyses

    Directory of Open Access Journals (Sweden)

    G. K. Huysamen

    2004-10-01

    This article briefly describes the main procedures for performing differential item functioning (DIF) analyses and points out some of the statistical and extra-statistical implications of these methods. Research findings on the sources of DIF, including those associated with translated tests, are reviewed. As DIF analyses are oblivious of correlations between a test and relevant criteria, the elimination of differentially functioning items does not necessarily improve predictive validity or reduce any predictive bias. The implications of the results of past DIF research for test development in the multilingual and multi-cultural South African society are considered.

  20. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1998-01-01

    Abstract: In the present paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or Rao’s efficient score test. The test is presented in the framework of a number of IRT models such as the Rasch model, the OPLM, the 2-parameter logistic model, the

  1. Differential Item Functioning Analysis of the Mental, Emotional, and Bodily Toughness Inventory

    Science.gov (United States)

    Gao, Yong; Mack, Mick G.; Ragan, Moira A.; Ragan, Brian

    2012-01-01

    In this study the authors used differential item functioning analysis to examine if there were items in the Mental, Emotional, and Bodily Toughness Inventory functioning differently across gender and athletic membership. A total of 444 male (56.3%) and female (43.7%) participants (30.9% athletes and 69.1% non-athletes) responded to the Mental,…

  2. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning.

    Science.gov (United States)

    Watt, Torquil; Groenvold, Mogens; Hegedüs, Laszlo; Bonnema, Steen Joop; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

    2014-02-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis. A total of 838 patients with benign thyroid diseases completed the ThyPRO questionnaire (84 five-point items, 13 scales). Uniform and nonuniform DIF were investigated using ordinal logistic regression, testing for both statistical significance and magnitude (ΔR² > 0.02). Scale level was estimated by the sum score, after purification. Twenty instances of DIF in 17 of the 84 items were found. Eight according to diagnosis, where the goiter scale was the one most affected, possibly due to differing perceptions in patients with auto-immune thyroid diseases compared to patients with simple goiter. Eight DIFs according to age were found, of which 5 were in positively worded items, which younger patients were more likely to endorse; one according to gender: women were more likely to report crying, and three according to educational level. The vast majority of DIF had only minor influence on the scale scores (0.1-2.3 points on the 0-100 scales), but two DIF corresponded to a difference of 4.6 and 9.8, respectively. Ordinal logistic regression identified DIF in 17 of 84 items. The potential impact of this on the present scales was low, but items displaying DIF could be avoided when developing abbreviated scales, where the potential impact of DIF (due to fewer items) will be larger.

  3. Assessing Differential Item Functioning on the Test of Relational Reasoning

    Directory of Open Access Journals (Sweden)

    Denis Dumas

    2018-03-01

    The test of relational reasoning (TORR) is designed to assess the ability to identify complex patterns within visuospatial stimuli. The TORR is designed for use in school and university settings, and therefore, its measurement invariance across diverse groups is critical. In this investigation, a large sample, representative of a major university on key demographic variables, was collected, and the resulting data were analyzed using a multi-group, multidimensional item-response theory model-comparison procedure. No significant differential item functioning was found on any of the TORR items across any of the demographic groups of interest. This finding is interpreted as evidence of the cultural fairness of the TORR, and potential test-development choices that may have contributed to that cultural fairness are discussed.

  4. Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire

    NARCIS (Netherlands)

    Petersen, Morten Aa; Groenvold, Mogens; Bjorner, Jakob B.; Aaronson, Neil; Conroy, Thierry; Cull, Ann; Fayers, Peter; Hjermstad, Marianne; Sprangers, Mirjam; Sullivan, Marianne

    2003-01-01

    In cross-national comparisons based on questionnaires, accurate translations are necessary to obtain valid results. Differential item functioning (DIF) analysis can be used to test whether translations of items in multi-item scales are equivalent to the original. In data from 10,815 respondents

  5. Stepwise Analysis of Differential Item Functioning Based on Multiple-Group Partial Credit Model.

    Science.gov (United States)

    Muraki, Eiji

    1999-01-01

    Extended an Item Response Theory (IRT) method for detection of differential item functioning to the partial credit model and applied the method to simulated data using a stepwise procedure. Then applied the stepwise DIF analysis based on the multiple-group partial credit model to writing trend data from the National Assessment of Educational…

  6. Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

    Science.gov (United States)

    Lee, Yi-Hsuan; Zhang, Jinming

    2017-01-01

    Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

  7. Consolidation differentially modulates schema effects on memory for items and associations.

    Science.gov (United States)

    van Kesteren, Marlieke T R; Rijpkema, Mark; Ruiter, Dirk J; Fernández, Guillén

    2013-01-01

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated with more efficient encoding and consolidation mechanisms. However, this effect is not always consistently supported in the literature, with differential schema effects reported for different types of memory, different retrieval cues, and the possibility of time-dependent effects related to consolidation processes. To examine these effects more directly, we tested participants on two different types of memory (item recognition and associative memory) for newly encoded visuo-tactile associations at different study-test intervals, thus probing memory retrieval accuracy for schema-congruent and schema-incongruent items and associations at different time points (t = 0, t = 20, and t = 48 hours) after encoding. Results show that the schema effect on visual item recognition only arises after consolidation, while the schema effect on associative memory is already apparent immediately after encoding, persisting, but getting smaller over time. These findings give further insight into different factors influencing the schema effect on memory, and can inform future schema experiments by illustrating the value of considering effects of memory type and consolidation on schema-modulated retrieval.

  8. Consolidation differentially modulates schema effects on memory for items and associations.

    Directory of Open Access Journals (Sweden)

    Marlieke T R van Kesteren

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated with more efficient encoding and consolidation mechanisms. However, this effect is not always consistently supported in the literature, with differential schema effects reported for different types of memory, different retrieval cues, and the possibility of time-dependent effects related to consolidation processes. To examine these effects more directly, we tested participants on two different types of memory (item recognition and associative memory) for newly encoded visuo-tactile associations at different study-test intervals, thus probing memory retrieval accuracy for schema-congruent and schema-incongruent items and associations at different time points (t = 0, t = 20, and t = 48 hours) after encoding. Results show that the schema effect on visual item recognition only arises after consolidation, while the schema effect on associative memory is already apparent immediately after encoding, persisting, but getting smaller over time. These findings give further insight into different factors influencing the schema effect on memory, and can inform future schema experiments by illustrating the value of considering effects of memory type and consolidation on schema-modulated retrieval.

  9. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    Science.gov (United States)

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  10. Gender-based Differential Item Functioning in the Application of the Theory of Planned Behavior for the Study of Entrepreneurial Intentions.

    Science.gov (United States)

    Zampetakis, Leonidas A; Bakatsaki, Maria; Litos, Charalambos; Kafetsios, Konstantinos G; Moustakis, Vassilis

    2017-01-01

    Over the past years the percentage of female entrepreneurs has increased, yet it is still far below that for males. Although various attempts have been made to explain differences in men's and women's entrepreneurial attitudes and intentions, the extent to which those differences are due to self-report biases has not yet been considered. The present study utilized Differential Item Functioning (DIF) to compare men and women's reporting on entrepreneurial intentions. DIF occurs in situations where members of different groups show differing probabilities of endorsing an item despite possessing the same level of the ability that the item is intended to measure. Drawing on the theory of planned behavior (TPB), the present study investigated whether constructs such as entrepreneurial attitudes, perceived behavioral control, subjective norms and intention would show gender differences and whether these gender differences could be explained by DIF. Using DIF methods on a dataset of 1800 Greek participants (50.4% female) indicated that differences at the item-level are almost non-existent. Moreover, the differential test functioning (DTF) analysis, which allows assessing the overall impact of DIF effects with all items being taken into account simultaneously, suggested that the effect of DIF across all the items for each scale was negligible. Future research should consider that measurement invariance can be assumed when using TPB constructs for the study of entrepreneurial motivation independent of gender.
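The definition of DIF given in this abstract (different endorsement probabilities at the same trait level) can be made concrete with a small sketch. It assumes a Rasch-type response function, which the abstract itself does not specify; the group labels and difficulty values are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under a Rasch-type model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Two respondents with the SAME trait level, one from each group.
theta = 0.5

# Hypothetical group-specific difficulties for one item. Unequal values
# are what uniform DIF looks like in this parameterization.
difficulty_group_a = 0.0
difficulty_group_b = 0.4

p_a = rasch_probability(theta, difficulty_group_a)
p_b = rasch_probability(theta, difficulty_group_b)
print(f"group A: {p_a:.3f}, group B: {p_b:.3f}, gap: {p_a - p_b:.3f}")
```

If the two difficulties were equal, the gap would vanish at every trait level, which is the item-level invariance the study reports for almost all TPB items.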

  11. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    Science.gov (United States)

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  12. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Science.gov (United States)

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…

  13. Gender-based Differential Item Functioning in the Application of the Theory of Planned Behavior for the Study of Entrepreneurial Intentions

    Science.gov (United States)

    Zampetakis, Leonidas A.; Bakatsaki, Maria; Litos, Charalambos; Kafetsios, Konstantinos G.; Moustakis, Vassilis

    2017-01-01

    Over the past years the percentage of female entrepreneurs has increased, yet it is still far below that for males. Although various attempts have been made to explain differences in men’s and women’s entrepreneurial attitudes and intentions, the extent to which those differences are due to self-report biases has not yet been considered. The present study utilized Differential Item Functioning (DIF) to compare men and women’s reporting on entrepreneurial intentions. DIF occurs in situations where members of different groups show differing probabilities of endorsing an item despite possessing the same level of the ability that the item is intended to measure. Drawing on the theory of planned behavior (TPB), the present study investigated whether constructs such as entrepreneurial attitudes, perceived behavioral control, subjective norms and intention would show gender differences and whether these gender differences could be explained by DIF. Using DIF methods on a dataset of 1800 Greek participants (50.4% female) indicated that differences at the item-level are almost non-existent. Moreover, the differential test functioning (DTF) analysis, which allows assessing the overall impact of DIF effects with all items being taken into account simultaneously, suggested that the effect of DIF across all the items for each scale was negligible. Future research should consider that measurement invariance can be assumed when using TPB constructs for the study of entrepreneurial motivation independent of gender. PMID:28386244

  14. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...

  15. Differential item functional analysis on pedagogic and content knowledge (PCK) questionnaire for Indonesian teachers using RASCH model

    Science.gov (United States)

    Rahmani, B. D.

    2018-01-01

    The purpose of this paper is to evaluate Indonesian senior high school teachers’ pedagogical content knowledge and their perception of curriculum change in West Java, Indonesia. The data used in this study were derived from a questionnaire survey conducted among teachers in Bandung, West Java. A total of 61 usable responses were collected. Differential item functioning (DIF) analysis was used to examine whether item responses differed by gender, educational background, or school location. The results showed no significant difference in item response by gender or school location, but a difference by educational background. In conclusion, teachers’ educational background influences their responses to the questionnaire. It is therefore suggested that future questionnaire items be constructed to accommodate differences among participants, particularly in educational background.

  16. DRD4 long allele carriers show heightened attention to high-priority items relative to low-priority items.

    Science.gov (United States)

    Gorlick, Marissa A; Worthy, Darrell A; Knopik, Valerie S; McGeary, John E; Beevers, Christopher G; Maddox, W Todd

    2015-03-01

    Humans with seven or more repeats in exon III of the DRD4 gene (long DRD4 carriers) sometimes demonstrate impaired attention, as seen in attention-deficit hyperactivity disorder, and at other times demonstrate heightened attention, as seen in addictive behavior. Although the clinical effects of DRD4 are the focus of much work, this gene may not necessarily serve as a "risk" gene for attentional deficits, but as a plasticity gene where attention is heightened for priority items in the environment and impaired for minor items. Here we examine the role of DRD4 in two tasks that benefit from selective attention to high-priority information. We examine a category learning task where performance is supported by focusing on features and updating verbal rules. Here, selective attention to the most salient features is associated with good performance. In addition, we examine the Operation Span (OSPAN) task, a working memory capacity task that relies on selective attention to update and maintain items in memory while also performing a secondary task. Long DRD4 carriers show superior performance relative to short DRD4 homozygotes (six or fewer tandem repeats) in both the category learning and OSPAN tasks. These results suggest that DRD4 may serve as a "plasticity" gene where individuals with the long allele show heightened selective attention to high-priority items in the environment, which can be beneficial in the appropriate context.

  17. The MIMIC Method with Scale Purification for Detecting Differential Item Functioning

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin; Yang, Chih-Chien

    2009-01-01

    This study implements a scale purification procedure onto the standard MIMIC method for differential item functioning (DIF) detection and assesses its performance through a series of simulations. It is found that the MIMIC method with scale purification (denoted as M-SP) outperforms the standard MIMIC method (denoted as M-ST) in controlling…

  18. The practical impact of differential item functioning analyses in a health-related quality of life instrument

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2009-01-01

    Differential item functioning (DIF) analyses are commonly used to evaluate health-related quality of life (HRQoL) instruments. There is, however, a lack of consensus as to how to assess the practical impact of statistically significant DIF results.

  19. Differential Item Functioning of the Psychological Domain of the Menopause Rating Scale

    Science.gov (United States)

    Portela-Buelvas, Katherin; Oviedo, Heidi C.; Herazo, Edwin; Campo-Arias, Adalberto

    2016-01-01

    Introduction. Quality of life could be quantified with the Menopause Rating Scale (MRS), which evaluates the severity of somatic, psychological, and urogenital symptoms in menopause. However, differential item functioning (DIF) analysis has not been applied previously. Objective. To establish the DIF of the psychological domain of the MRS in Colombian women. Methods. 4,009 women aged between 40 and 59 years, who participated in the CAVIMEC (Calidad de Vida en la Menopausia y Etnias Colombianas) project, were included. Average age was 49.0 ± 5.9 years. Women were classified as mestizo, Afro-Colombian, or indigenous. The results were presented as averages and standard deviation (X ± SD). A p value <0.001 was considered statistically significant. Results. In mestizo women, the highest X ± SD were obtained in physical and mental exhaustion (PME) (0.86 ± 0.93) and the lowest ones in anxiety (0.44 ± 0.79). In Afro-Colombian women, an average score of 0.99 ± 1.07 for PME and 0.63 ± 0.88 for anxiety was obtained. Indigenous women obtained an increased average score for PME (1.33 ± 0.93). The lowest score was evidenced in depressive mood (0.50 ± 0.81), which is different from other Colombian women (p < 0.001). Conclusions. The psychological items of the MRS show differential functioning according to the ethnic group, which may induce systematic error in the measurement of the construct. PMID:27847825

  20. Differential Item Functioning of the Psychological Domain of the Menopause Rating Scale.

    Science.gov (United States)

    Monterrosa-Castro, Alvaro; Portela-Buelvas, Katherin; Oviedo, Heidi C; Herazo, Edwin; Campo-Arias, Adalberto

    2016-01-01

    Introduction. Quality of life could be quantified with the Menopause Rating Scale (MRS), which evaluates the severity of somatic, psychological, and urogenital symptoms in menopause. However, differential item functioning (DIF) analysis has not been applied previously. Objective. To establish the DIF of the psychological domain of the MRS in Colombian women. Methods. 4,009 women aged between 40 and 59 years, who participated in the CAVIMEC (Calidad de Vida en la Menopausia y Etnias Colombianas) project, were included. Average age was 49.0 ± 5.9 years. Women were classified as mestizo, Afro-Colombian, or indigenous. The results were presented as averages and standard deviation (X ± SD). A p value <0.001 was considered statistically significant. Results. In mestizo women, the highest X ± SD were obtained in physical and mental exhaustion (PME) (0.86 ± 0.93) and the lowest ones in anxiety (0.44 ± 0.79). In Afro-Colombian women, an average score of 0.99 ± 1.07 for PME and 0.63 ± 0.88 for anxiety was obtained. Indigenous women obtained an increased average score for PME (1.33 ± 0.93). The lowest score was evidenced in depressive mood (0.50 ± 0.81), which is different from other Colombian women (p < 0.001). Conclusions. The psychological items of the MRS show differential functioning according to the ethnic group, which may induce systematic error in the measurement of the construct.

  1. Overcoming the effects of differential skewness of test items in scale construction

    Directory of Open Access Journals (Sweden)

    Johann M. Schepers

    2004-10-01

    The principal objective of the study was to develop a procedure for overcoming the effects of differential skewness of test items in scale construction. It was shown that the degree of skewness of test items places an upper limit on the correlations between the items, regardless of the contents of the items. If the items are ordered in terms of skewness, the resulting intercorrelation matrix forms a simplex or a pseudo-simplex. Factoring such a matrix results in a multiplicity of factors, most of which are artifacts. A procedure for overcoming this problem was demonstrated with items from the Locus of Control Inventory (Schepers, 1995). The analysis was based on a sample of 1662 first-year university students.
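    The ceiling that skewness places on inter-item correlations can be illustrated for the simplest case of two dichotomous items. The sketch below is an illustration only, not the procedure developed in the study: the function name `phi_max` is hypothetical, and the restriction to dichotomous items is an assumption made here for simplicity. It uses the standard bound on the phi coefficient given the two endorsement proportions.

```python
import math


def phi_max(p1: float, p2: float) -> float:
    """Largest attainable phi correlation between two dichotomous items.

    p1 and p2 are the proportions of respondents endorsing each item.
    The bound is reached when everyone who endorses the rarer item also
    endorses the more common one. (Hypothetical helper for illustration.)
    """
    for p in (p1, p2):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    lo, hi = sorted((p1, p2))
    # phi_max = sqrt( lo * (1 - hi) / ( hi * (1 - lo) ) )
    return math.sqrt((lo * (1.0 - hi)) / (hi * (1.0 - lo)))


# Equally skewed items can correlate perfectly ...
same_skew = phi_max(0.5, 0.5)
# ... but items skewed in opposite directions cannot, whatever their content.
opposite_skew = phi_max(0.1, 0.9)
```

    With equal endorsement proportions the bound is 1, while proportions of 0.1 and 0.9 cap the correlation at about 0.11, which is the kind of artifact the abstract attributes to differential skewness.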

  2. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    JOSEPH P. EIMICKE

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  3. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Czech Academy of Sciences Publication Activity Database

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    Vol. 54, No. 4 (2017), pp. 498-517. ISSN 0022-0655. R&D Projects: GA ČR GJ15-15856Y. Institutional support: RVO:67985807. Keywords: differential item functioning; non-linear regression; logistic regression; item response theory. Subject RIV: AM - Education. OECD field: Statistics and probability. Impact factor: 0.979 (2016).

  4. Measurement equivalence and differential item functioning in family psychology.

    Science.gov (United States)

    Bingenheimer, Jeffrey B; Raudenbush, Stephen W; Leventhal, Tama; Brooks-Gunn, Jeanne

    2005-09-01

    Several hypotheses in family psychology involve comparisons of sociocultural groups. Yet the potential for cross-cultural inequivalence in widely used psychological measurement instruments threatens the validity of inferences about group differences. Methods for dealing with these issues have been developed via the framework of item response theory. These methods deal with an important type of measurement inequivalence, called differential item functioning (DIF). The authors introduce DIF analytic methods, linking them to a well-established framework for conceptualizing cross-cultural measurement equivalence in psychology (C.H. Hui and H.C. Triandis, 1985). They illustrate the use of DIF methods using data from the Project on Human Development in Chicago Neighborhoods (PHDCN). Focusing on the Caregiver Warmth and Environmental Organization scales from the PHDCN's adaptation of the Home Observation for Measurement of the Environment Inventory, the authors obtain results that exemplify the range of outcomes that may result when these methods are applied to psychological measurement instruments. (c) 2005 APA, all rights reserved

  5. Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

    Science.gov (United States)

    Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

    2015-12-01

    The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity. © The Author(s) 2014.
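    For readers unfamiliar with the Mantel-Haenszel approach mentioned above, a minimal sketch follows. It is not the authors' analysis: it assumes a dichotomously scored item, strata formed by matching respondents on total score, and 2x2 counts per stratum in the order (reference correct, reference incorrect, focal correct, focal incorrect). The function name `mantel_haenszel` and the example counts are hypothetical.

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel common odds ratio and continuity-corrected chi-square.

    `tables` is an iterable of (a, b, c, d) counts, one tuple per score
    stratum: a = reference group correct, b = reference group incorrect,
    c = focal group correct, d = focal group incorrect.
    """
    num = den = 0.0                 # pieces of the common odds ratio
    sum_a = sum_exp = sum_var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue                # stratum too small to contribute
        num += a * d / n
        den += b * c / n
        sum_a += a
        # Expectation and variance of `a` under the no-DIF hypothesis
        sum_exp += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = num / den
    corrected = max(0.0, abs(sum_a - sum_exp) - 0.5)
    chi_square = corrected ** 2 / sum_var
    return odds_ratio, chi_square


# Hypothetical counts: two strata with identical group performance (no DIF)
no_dif = mantel_haenszel([(10, 10, 10, 10), (10, 10, 10, 10)])
# Hypothetical counts: one stratum where the reference group does better
with_dif = mantel_haenszel([(30, 10, 20, 20)])
```

    An odds ratio near 1 with a small chi-square (compared against a chi-square distribution with one degree of freedom) is consistent with no uniform DIF for that item.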

  6. Gender Invariance of the Gambling Behavior Scale for Adolescents (GBS-A): An Analysis of Differential Item Functioning Using Item Response Theory.

    Science.gov (United States)

    Donati, Maria Anna; Chiesi, Francesca; Izzo, Viola A; Primi, Caterina

    2017-01-01

    As there is a lack of evidence attesting to equivalent item functioning across genders for the most widely used instruments measuring pathological gambling in adolescence, the present study aimed to test the gender invariance of the Gambling Behavior Scale for Adolescents (GBS-A), a new measurement tool to assess the severity of Gambling Disorder (GD) in adolescents. The equivalence of the items across genders was assessed by analyzing Differential Item Functioning within an Item Response Theory framework. The GBS-A was administered to 1,723 adolescents, and the graded response model was employed. The results attested to the measurement equivalence of the GBS-A when administered to male and female adolescent gamblers. Overall, findings provided evidence that the GBS-A is an effective measurement tool of the severity of GD in male and female adolescents and that the scale was unbiased and able to reveal true gender differences. As such, the GBS-A can be profitably used in educational interventions and clinical treatments with young people.

  7. Item bias detection in the Hospital Anxiety and Depression Scale using structural equation modeling: comparison with other item bias detection methods

    NARCIS (Netherlands)

    Verdam, M.G.E.; Oort, F.J.; Sprangers, M.A.G.

    Purpose Comparison of patient-reported outcomes may be invalidated by the occurrence of item bias, also known as differential item functioning. We show two ways of using structural equation modeling (SEM) to detect item bias: (1) multigroup SEM, which enables the detection of both uniform and

  8. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    Science.gov (United States)

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
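    The two-step decision described in this abstract (first test the ability-by-group interaction for nonuniform DIF, then compare the ability coefficient with and without the group term for uniform DIF) can be sketched as a small rule. This is an illustration under stated assumptions, not the DIFdetect or difwithpar code: the function name `classify_dif`, the 0.05 significance level, and the 10% relative-change cutoff are all assumptions introduced here, and the inputs are taken to come from ordinal logistic regression models fitted elsewhere.

```python
def classify_dif(p_interaction: float,
                 beta_ability_without_group: float,
                 beta_ability_with_group: float,
                 alpha: float = 0.05,
                 rel_change: float = 0.10) -> str:
    """Classify one item as 'nonuniform', 'uniform', or 'none'.

    p_interaction: p value of the ability-by-group interaction term.
    beta_ability_without_group / beta_ability_with_group: coefficient of
    the ability term in the models excluding / including the group term.
    alpha and rel_change are illustrative thresholds (assumptions).
    """
    # Step 1: a significant interaction is consistent with nonuniform DIF.
    if p_interaction < alpha:
        return "nonuniform"
    # Step 2: a marked shift in the ability coefficient once the group
    # term is added is taken as evidence of uniform DIF.
    shift = abs(beta_ability_with_group - beta_ability_without_group)
    if shift / abs(beta_ability_without_group) > rel_change:
        return "uniform"
    return "none"


# Hypothetical fitted values for three items
labels = [
    classify_dif(0.01, 1.00, 1.00),   # significant interaction
    classify_dif(0.50, 1.00, 1.20),   # 20% change in ability coefficient
    classify_dif(0.50, 1.00, 1.02),   # 2% change
]
```

    As the abstract itself notes, the criteria chosen at each step matter as much as the technique, so the thresholds above should be treated as tunable settings rather than fixed values.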

  9. Parent Ratings of ADHD Symptoms: Generalized Partial Credit Model Analysis of Differential Item Functioning across Gender

    Science.gov (United States)

    Gomez, Rapson

    2012-01-01

    Objective: Generalized partial credit model, which is based on item response theory (IRT), was used to test differential item functioning (DIF) for the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.), inattention (IA), and hyperactivity/impulsivity (HI) symptoms across boys and girls. Method: To accomplish this, parents completed…

  10. Determination of a Differential Item Functioning Procedure Using the Hierarchical Generalized Linear Model

    Directory of Open Access Journals (Sweden)

    Tülin Acar

    2012-01-01

    The aim of this research is to compare the results of differential item functioning (DIF) detection with the hierarchical generalized linear model (HGLM) technique against the results of DIF detection with the logistic regression (LR) and item response theory–likelihood ratio (IRT-LR) techniques on the test items. To this end, the research first determines, with the HGLM, LR, and IRT-LR techniques, whether items show DIF according to students' socioeconomic status (SES) in the Turkish, Social Sciences, and Science subtest items of the Secondary School Institutions Examination. When the correlations among the techniques in terms of identifying DIF items were inspected, a significant correlation was found between the results of the IRT-LR and LR techniques in all subtests; only in the Science subtest was the correlation between the HGLM and IRT-LR results significant. DIF analyses can also be carried out on test items with other DIF techniques that were outside the scope of this research. The results obtained with DIF techniques in different sample sizes can be compared.

  11. Racial differences in hypertension knowledge: effects of differential item functioning.

    Science.gov (United States)

    Ayotte, Brian J; Trivedi, Ranak; Bosworth, Hayden B

    2009-01-01

    Health-related knowledge is an important component in the self-management of chronic illnesses. The objective of this study was to more accurately assess racial differences in hypertension knowledge by using a latent variable modeling approach that controlled for sociodemographic factors and accounted for measurement issues in the assessment of hypertension knowledge. Cross-sectional data from 1,177 participants (45% African American; 35% female) were analyzed using a multiple indicator multiple causes (MIMIC) modeling approach. Available sociodemographic data included race, education, sex, financial status, and age. All participants completed six items on a hypertension knowledge questionnaire. Overall, the final model suggested that females, Whites, and patients with at least a high school diploma had higher latent knowledge scores than males, African Americans, and patients with less than a high school diploma, respectively. The model also detected differential item functioning (DIF) based on race for two of the items. Specifically, the error rate for African Americans was lower than would be expected given their lower level of latent knowledge on the questions related to: (a) the association between high blood pressure and kidney disease, and (b) the increased risk African Americans have for developing hypertension. Not accounting for DIF resulted in the difference between Whites and African Americans being underestimated. These results are discussed in the context of the need for careful measurement of health-related constructs, and how measurement-related issues can result in an inaccurate estimation of racial differences in hypertension knowledge.

  12. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....

  13. Differential items functioning to assess aggressiveness in college students / Funcionamento diferencial de itens para avaliar a agressividade de universitários

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    In this research, evidence of construct validity was sought by analyzing the differential functioning of items related to aggressiveness. The participants were 445 college students of both genders, attending courses in Engineering, Computing and Psychology. The aggressiveness scale, composed of 81 items, was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were studied by means of the Rasch model. Twenty-eight items presented differential item functioning: 15 were characterized as typical for females and 13 for males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of gender.

  14. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  15. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention, but very few studies have evaluated the effect of the hierarchical structure of data on DIF detection for polytomously scored items. Methods. Ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  16. Cross-cultural and sex differences in the Emotional Skills and Competence Questionnaire scales: Challenges of differential item functioning analyses

    Directory of Open Access Journals (Sweden)

    Bo Molander

    2009-11-01

    University students in Croatia, Slovenia, and Sweden (N = 1129) were examined by means of the Emotional Skills and Competence Questionnaire (ESCQ; Takšić, 1998). Results showed a significant effect for the sex factor only on the total-score scale, women scoring higher than men, but significant effects were obtained for country, as well as for sex, on the Express and Label (EL) and Perceive and Understand (PU) subscales. Sweden showed higher scores than Croatia and Slovenia on the EL scale, and Slovenia showed higher scores than Croatia and Sweden on the PU scale. In subsequent analyses of differential item functioning (DIF), comparisons were carried out for pairs of countries. The analyses revealed that a large proportion of the items in the total-score scale were potentially biased, most so for the Croatian-Swedish comparison, less for the Slovenian-Swedish comparison, and least for the Croatian-Slovenian comparison. These findings cast doubt on the validity of mean score differences in comparisons of countries. However, DIF analyses of sex differences within each country show very few DIF items, indicating that the ESCQ instrument works well within each cultural/linguistic setting. Possible explanations of the findings are discussed, and improvements for future studies are suggested.

  17. Assessing the Straightforwardly-Worded Brief Fear of Negative Evaluation Scale for Differential Item Functioning Across Gender and Ethnicity.

    Science.gov (United States)

    Harpole, Jared K; Levinson, Cheri A; Woods, Carol M; Rodebaugh, Thomas L; Weeks, Justin W; Brown, Patrick J; Heimberg, Richard G; Menatti, Andrew R; Blanco, Carlos; Schneier, Franklin; Liebowitz, Michael

    2015-06-01

    The Brief Fear of Negative Evaluation Scale (BFNE; Leary, Personality and Social Psychology Bulletin, 9, 371-375, 1983) assesses fear and worry about receiving negative evaluation from others. Rodebaugh et al. (Psychological Assessment, 16, 169-181, 2004) found that the BFNE is composed of a reverse-worded factor (BFNE-R) and straightforwardly-worded factor (BFNE-S). Further, they found the BFNE-S to have better psychometric properties and provide more information than the BFNE-R. Currently there is a lack of research regarding the measurement invariance of the BFNE-S across gender and ethnicity with respect to item thresholds. The present study uses item response theory (IRT) to test the BFNE-S for differential item functioning (DIF) related to gender and ethnicity (White, Asian, and Black). Six data sets consisting of clinical, community, and undergraduate participants were utilized (N = 2,109). The factor structure of the BFNE-S was confirmed using categorical confirmatory factor analysis, IRT model assumptions were tested, and the BFNE-S was evaluated for DIF. Item nine demonstrated significant non-uniform DIF between White and Black participants. No other items showed significant uniform or non-uniform DIF across gender or ethnicity. Results suggest the BFNE-S can be used reliably with men and women and Asian and White participants. More research is needed to understand the implications of using the BFNE-S with Black participants.

  18. Detection of Differential Item Functioning on the Kirton Adaption-Innovation Inventory Using Multiple-Group Mean and Covariance Structure Analyses.

    Science.gov (United States)

    Chan, David

    2000-01-01

    Demonstrates how the mean and covariance structure analysis model of D. Sorbom (1974) can be used to detect uniform and nonuniform differential item functioning (DIF) on polytomous ordered response items assumed to approximate a continuous scale. Uses results from 773 civil service employees administered the Kirton Adaption-Innovation Inventory…

  19. Standard Errors for National Trends in International Large-Scale Assessments in the Case of Cross-National Differential Item Functioning

    Science.gov (United States)

    Sachse, Karoline A.; Haag, Nicole

    2017-01-01

    Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment's (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…

  20. An Examination of Differential Item Functioning on the Vanderbilt Assessment of Leadership in Education

    Science.gov (United States)

    Polikoff, Morgan S.; May, Henry; Porter, Andrew C.; Elliott, Stephen N.; Goldring, Ellen; Murphy, Joseph

    2009-01-01

    The Vanderbilt Assessment of Leadership in Education is a 360-degree assessment of the effectiveness of principals' learning-centered leadership behaviors. In this report, we present results from a differential item functioning (DIF) study of the assessment. Using data from a national field trial, we searched for evidence of DIF on school level,…

  1. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...... that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  2. Examining Multiple Sources of Differential Item Functioning on the Clinician & Group CAHPS® Survey

    Science.gov (United States)

    Rodriguez, Hector P; Crane, Paul K

    2011-01-01

    Objective To evaluate psychometric properties of a widely used patient experience survey. Data Sources English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. Methods We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician–patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. Principal Findings The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. Conclusions The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified. PMID:22092021

  3. Psychometric evaluation of Persian Nomophobia Questionnaire: Differential item functioning and measurement invariance across gender.

    Science.gov (United States)

    Lin, Chung-Ying; Griffiths, Mark D; Pakpour, Amir H

    2018-03-01

    Background and aims Research examining problematic mobile phone use has increased markedly over the past 5 years and has been related to "no mobile phone phobia" (so-called nomophobia). The 20-item Nomophobia Questionnaire (NMP-Q) is the only instrument that assesses nomophobia with an underlying theoretical structure and robust psychometric testing. This study aimed to confirm the construct validity of the Persian NMP-Q using Rasch and confirmatory factor analysis (CFA) models. Methods After ensuring the linguistic validity, Rasch models were used to examine the unidimensionality of each Persian NMP-Q factor among 3,216 Iranian adolescents and CFAs were used to confirm its four-factor structure. Differential item functioning (DIF) and multigroup CFA were used to examine whether males and females interpreted the NMP-Q similarly, including item content and NMP-Q structure. Results Each factor was unidimensional according to the Rasch findings, and the four-factor structure was supported by CFA. Two items did not quite fit the Rasch models (Item 14: "I would be nervous because I could not know if someone had tried to get a hold of me;" Item 9: "If I could not check my smartphone for a while, I would feel a desire to check it"). No DIF items were found across gender and measurement invariance was supported in multigroup CFA across gender. Conclusions Due to the satisfactory psychometric properties, it is concluded that the Persian NMP-Q can be used to assess nomophobia among adolescents. Moreover, NMP-Q users may compare its scores between genders in the knowledge that there are no score differences contributed by different understandings of NMP-Q items.

  4. Exploring differential item functioning in the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)

    Directory of Open Access Journals (Sweden)

    Pollard Beth

    2012-12-01

    Background. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods. The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results. After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions. Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or especially age effects, are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the

  5. A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Bottomley, Andrew

    2009-01-01

    Differential item functioning (DIF) analyses are increasingly used to evaluate health-related quality of life (HRQoL) instruments, which often include relatively short subscales. Computer simulations were used to explore how various factors including scale length affect analysis of DIF by ordinal...... logistic regression....

  6. Differential Item Functioning (DIF) Etnis pada Big Five Inventory (BFI) versi Adaptasi Fakultas Psikologi Universitas Sumatera Utara

    OpenAIRE

    Manik, Hitler

    2014-01-01

    The Big Five Inventory (BFI) is a personality test that has been adapted into Indonesian. Further research has been carried out to adapt the Indonesian Big Five Inventory. The purpose of this research is to check whether the BFI personality test is fair when applied to the Batak Toba and Javanese ethnic groups. Therefore, an examination of the BFI's items is needed. In psychology, especially in psychometrics, this is called differential item functioning (DIF). The subjects in this research were 327 people around 18 to 40 ...

  7. Differential Item Functioning (DIF) among Spanish-Speaking English Language Learners (ELLs) in State Science Tests

    Science.gov (United States)

    Ilich, Maria O.

    Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets, that is, groups of items referring to a scientific scenario to measure knowledge of different science content or skills. Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the balance comprised native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.

  8. A Differential Item Functional Analysis by Age of Perceived Interpersonal Discrimination in a Multi-racial/ethnic Sample of Adults.

    Science.gov (United States)

    Owens, Sherry; Kristjansson, Alfgeir L; Hunte, Haslyn E R

    2015-11-05

    We investigated whether individual items on the nine-item William's Perceived Everyday Discrimination Scale (EDS) functioned differently by age (ethnic group. Overall, Asian and Hispanic respondents reported less discrimination than Whites; on the other hand, African Americans and Black Caribbeans reported more discrimination than Whites. Regardless of race/ethnicity, the younger respondents (aged ethnicity, the results were mixed for 19 out of 45 tests of DIF (40%). No differences in item function were observed among Black Caribbeans. "Being called names or insulted" and others acting as "if they are afraid" of the respondents were the only two items that did not exhibit differential item functioning by age across all racial/ethnic groups. Overall, our findings suggest that the EDS scale should be used with caution in multi-age multi-racial/ethnic samples.

  9. Identifying group-sensitive physical activities: a differential item functioning analysis of NHANES data.

    Science.gov (United States)

    Gao, Yong; Zhu, Weimo

    2011-05-01

    The purpose of this study was to identify subgroup-sensitive physical activities (PA) using differential item functioning (DIF) analysis. A sub-unweighted sample of 1857 (men=923 and women=934) from the 2003-2004 National Health and Nutrition Examination Survey PA questionnaire data was used for the analyses. Using the Mantel-Haenszel, the simultaneous item bias test, and the ANOVA DIF methods, 33 specific leisure-time moderate and/or vigorous PA (MVPA) items were analyzed for DIF across race/ethnicity, gender, education, income, and age groups. Many leisure-time MVPA items were identified as large DIF items. When participating in the same amount of leisure-time MVPA, non-Hispanic blacks were more likely to participate in basketball and dance activities than non-Hispanic whites (NHW); NHW were more likely to participate in golf and hiking than non-Hispanic blacks; Hispanics were more likely to participate in dancing, hiking, and soccer than NHW, whereas NHW were more likely to engage in bicycling, golf, swimming, and walking than Hispanics; women were more likely to participate in aerobics, dancing, stretching, and walking than men, whereas men were more likely to engage in basketball, fishing, golf, running, soccer, weightlifting, and hunting than women; educated persons were more likely to participate in jogging and treadmill exercise than less educated persons; persons with higher incomes were more likely to engage in golf than those with lower incomes; and adults (20-59 yr) were more likely to participate in basketball, dancing, jogging, running, and weightlifting than older adults (60+ yr), whereas older adults were more likely to participate in walking and golf than younger adults. DIF methods are able to identify subgroup-sensitive PA and thus provide useful information to help design group-sensitive, targeted interventions for disadvantaged PA subgroups. © 2011 by the American College of Sports Medicine

  10. Why Japanese workers show low work engagement: An item response theory analysis of the Utrecht Work Engagement scale.

    Science.gov (United States)

    Shimazu, Akihito; Schaufeli, Wilmar B; Miyanaka, Daisuke; Iwata, Noboru

    2010-11-05

    With the globalization of occupational health psychology, more and more researchers are interested in applying employee well-being like work engagement (i.e., a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption) to diverse populations. Accurate measurement contributes to our further understanding and to the generalizability of the concept of work engagement across different cultures. The present study investigated the measurement accuracy of the Japanese and the original Dutch versions of the Utrecht Work Engagement Scale (9-item version, UWES-9) and the comparability of this scale between both countries. Item Response Theory (IRT) was applied to the data from Japan (N = 2,339) and the Netherlands (N = 13,406). Reliability of the scale was evaluated at various levels of the latent trait (i.e., work engagement) based on the test information function (TIF) and the standard error of measurement (SEM). The Japanese version had difficulty in differentiating respondents with extremely low work engagement, whereas the original Dutch version had difficulty in differentiating respondents with high work engagement. The measurement accuracy of both versions was not similar. Suppression of positive affect among Japanese people and self-enhancement (the general sensitivity to positive self-relevant information) among Dutch people may have caused decreased measurement accuracy. Hence, we should be cautious when interpreting low engagement scores among Japanese as well as high engagement scores among western employees.

  11. Assessment of Preference for Edible and Leisure Items in Individuals with Dementia

    Science.gov (United States)

    Ortega, Javier Virues; Iwata, Brian A.; Nogales-Gonzalez, Celia; Frades, Belen

    2012-01-01

    We conducted 2 studies on reinforcer preference in patients with dementia. Results of preference assessments yielded differential selections by 14 participants. Unlike prior studies with individuals with intellectual disabilities, all participants showed a noticeable preference for leisure items over edible items. Results of a subsequent analysis…

  12. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Full Text Available Abstract Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to determine whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.

  13. Dissociating the neural correlates of intra-item and inter-item working-memory binding.

    Directory of Open Access Journals (Sweden)

    Carinne Piekema

    Full Text Available BACKGROUND: Integration of information streams into a unitary representation is an important task of our cognitive system. Within working memory, the medial temporal lobe (MTL) has been conceptually linked to the maintenance of bound representations. In a previous fMRI study, we have shown that the MTL is indeed more active during working-memory maintenance of spatial associations as compared to non-spatial associations or single items. There are two possible explanations for this result: either the mere presence of the spatial component activates the MTL, or the MTL is recruited to bind associations between neurally non-overlapping representations. METHODOLOGY/PRINCIPAL FINDINGS: The current fMRI study investigates this issue further by directly comparing intrinsic intra-item binding (object/colour), extrinsic intra-item binding (object/location), and inter-item binding (object/object). The three binding conditions resulted in differential activation of brain regions. Specifically, we show that the MTL is important for establishing extrinsic intra-item associations and inter-item associations, in line with the notion that binding of information processed in different brain regions depends on the MTL. CONCLUSIONS/SIGNIFICANCE: Our findings indicate that different forms of working-memory binding rely on specific neural structures. In addition, these results extend previous reports indicating that the MTL is implicated in working-memory maintenance, challenging the classic distinction between short-term and long-term memory systems.

  14. Differential Item Functioning of Pathological Gambling Criteria: An Examination of Gender, Race/Ethnicity, and Age

    OpenAIRE

    Sacco, Paul; Torres, Luis R.; Cunningham-Williams, Renee M.; Woods, Carol; Unick, G. Jay

    2011-01-01

    This study tested for the presence of differential item functioning (DIF) in DSM-IV Pathological Gambling Disorder (PGD) criteria based on gender, race/ethnicity and age. Using a nationally representative sample of adults from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), indicating current gambling (n = 10,899), Multiple Indicator-Multiple Cause (MIMIC) models tested for DIF, controlling for income, education, and marital status. Compared to the reference grou...

  15. Why Japanese workers show low work engagement: An item response theory analysis of the Utrecht Work Engagement scale

    Directory of Open Access Journals (Sweden)

    Iwata Noboru

    2010-11-01

    Full Text Available Abstract With the globalization of occupational health psychology, more and more researchers are interested in applying employee well-being like work engagement (i.e., a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption) to diverse populations. Accurate measurement contributes to our further understanding and to the generalizability of the concept of work engagement across different cultures. The present study investigated the measurement accuracy of the Japanese and the original Dutch versions of the Utrecht Work Engagement Scale (9-item version, UWES-9) and the comparability of this scale between both countries. Item Response Theory (IRT) was applied to the data from Japan (N = 2,339) and the Netherlands (N = 13,406). Reliability of the scale was evaluated at various levels of the latent trait (i.e., work engagement) based on the test information function (TIF) and the standard error of measurement (SEM). The Japanese version had difficulty in differentiating respondents with extremely low work engagement, whereas the original Dutch version had difficulty in differentiating respondents with high work engagement. The measurement accuracy of both versions was not similar. Suppression of positive affect among Japanese people and self-enhancement (the general sensitivity to positive self-relevant information) among Dutch people may have caused decreased measurement accuracy. Hence, we should be cautious when interpreting low engagement scores among Japanese as well as high engagement scores among western employees.

  16. Funcionamento diferencial de itens para avaliar a agressividade de universitários [Differential item functioning to assess aggressiveness in college students]

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    Full Text Available This research sought evidence of construct validity by analyzing differential item functioning by sex in an aggressiveness instrument. The participants were 445 college students of both sexes, enrolled in Engineering, Computing and Psychology courses. The 81-item aggressiveness scale was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were analyzed by means of the Rasch model. Twenty-eight items showed differential functioning: 15 described behaviors more characteristic of females and 13 behaviors more characteristic of males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately by sex.

  17. Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

    Science.gov (United States)

    Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

    2018-02-02

    In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.

  18. Comparing Two Versions of the MEOCS Using Differential Item Functioning

    National Research Council Canada - National Science Library

    Truhon, Stephen

    2003-01-01

    ...) from item response theory (IRT). DIF was found for the majority of the 40 items examined, although in many cases the DIF indicated improvements in the revised items. Implications for these scales and for the use of IRT with the MEOCS are discussed.

  19. Checking Equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments

    Czech Academy of Sciences Publication Activity Database

    Martinková, Patrícia; Drabinová, Adéla; Liaw, Y.L.; Sanders, E.A.; McFarland, J.L.; Price, R.M.

    2017-01-01

    Vol. 16, No. 2 (2017), article no. rm2. ISSN 1931-7913 R&D Projects: GA ČR GJ15-15856Y Grant - others: NSF(US) DUE-1043443 Institutional support: RVO:67985807 Keywords: differential item functioning * fairness * conceptual assessments * concept inventory * undergraduate education * bias Subject RIV: AM - Education OECD field: Education, special (to gifted persons, those with learning disabilities) Impact factor: 3.930, year: 2016

  20. Improving measurement of injection drug risk behavior using item response theory.

    Science.gov (United States)

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. To explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine for potential gender-based differential item functioning. Data is used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.

  1. A comparison of discriminant logistic regression and Item Response Theory Likelihood-Ratio Tests for Differential Item Functioning (IRTLRDIF) in polytomous short tests.

    Science.gov (United States)

    Hidalgo, María D; López-Martínez, María D; Gómez-Benito, Juana; Guilera, Georgina

    2016-01-01

    Short scales are typically used in the social, behavioural and health sciences. This is relevant since test length can influence whether items showing DIF are correctly flagged. This paper compares the relative effectiveness of discriminant logistic regression (DLR) and IRTLRDIF for detecting DIF in polytomous short tests. A simulation study was designed. Test length, sample size, DIF amount and number of item response categories were manipulated. Type I error and power were evaluated. IRTLRDIF and DLR yielded Type I error rates close to nominal level in no-DIF conditions. Under DIF conditions, Type I error rates were affected by test length, DIF amount, degree of test contamination, sample size and number of item response categories. DLR showed a higher Type I error rate than did IRTLRDIF. Power rates were affected by DIF amount and sample size, but not by test length. DLR achieved higher power rates than did IRTLRDIF in very short tests, although the high Type I error rate involved means that this result cannot be taken into account. Test length had an important impact on the Type I error rate. IRTLRDIF and DLR showed a low power rate in short tests and with small sample sizes.
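The two outcome measures evaluated in this simulation study, Type I error and power, can be illustrated with a short sketch. It assumes a single replication in which each item is known to have been simulated with or without DIF; the function name and the example data are hypothetical and are not taken from the paper.

```python
def flag_rates(has_dif, flagged):
    """Return (type_i_error_rate, power) for one set of DIF decisions.

    has_dif[i] is True if item i was simulated with DIF;
    flagged[i] is True if the detection method flagged item i.
    """
    if len(has_dif) != len(flagged):
        raise ValueError("inputs must have the same length")
    false_pos = sum(1 for d, f in zip(has_dif, flagged) if f and not d)
    true_pos = sum(1 for d, f in zip(has_dif, flagged) if f and d)
    n_dif = sum(1 for d in has_dif if d)
    n_clean = len(has_dif) - n_dif
    # Type I error: share of DIF-free items that were wrongly flagged.
    type_i = false_pos / n_clean if n_clean else float("nan")
    # Power: share of true DIF items that were correctly flagged.
    power = true_pos / n_dif if n_dif else float("nan")
    return type_i, power


# Hypothetical outcome: 10 items, the last 2 simulated with DIF.
truth = [False] * 8 + [True] * 2
decisions = [False, True, False, False, False,
             False, False, False, True, False]
type_i, power = flag_rates(truth, decisions)
print(type_i, power)  # 0.125 0.5
```

In a full simulation these rates would be averaged over many replications per condition. This sketch covers only the bookkeeping for a single replication.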

  2. Differential item functioning (DIF) in the EORTC QLQ-C30: a comparison of baseline, on-treatment and off-treatment data

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

    2009-01-01

    Differential item functioning (DIF) analyses can be used to explore translation, cultural, gender or other differences in the performance of quality of life (QoL) instruments. These analyses are commonly performed using "baseline" or pretreatment data. We previously reported DIF analyses to examine...

  3. Funcionamiento diferencial del item en la evaluación internacional PISA. Detección y comprensión. [Differential Item Functioning in the PISA Project: Detection and Understanding]

    Directory of Open Access Journals (Sweden)

    Paula Elosua

    2006-08-01

    Full Text Available This report analyses differential item functioning (DIF) in the Programme for International Student Assessment (PISA 2000). The items studied come from the Reading Comprehension Test. We analyzed the released items from that year in order to combine the detection of DIF with an understanding of its causes. The reference group is the United Kingdom sample and the focal group is the Spanish sample. The detection procedures are Mantel-Haenszel, logistic regression and the standardized mean difference, together with their extensions for polytomous items. Two items were flagged, and the post-hoc analysis of their content did not fully explain the causes of DIF.
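Of the detection procedures named in this abstract, the Mantel-Haenszel statistic for a dichotomous item is the simplest to sketch. The example below assumes examinees have already been matched on total score and tabulated per score level; the counts are invented for illustration and do not come from the PISA data. The conversion to the ETS delta scale uses the standard factor of -2.35.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    Each stratum is a tuple
    (ref_correct, ref_wrong, focal_correct, focal_wrong).
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    return num / den


def mh_delta(alpha):
    """ETS delta scale; negative values mean the item favors the reference group."""
    return -2.35 * math.log(alpha)


# Hypothetical counts for two score levels (not taken from any study).
example_strata = [(30, 20, 20, 30), (40, 10, 30, 20)]
alpha = mantel_haenszel_odds_ratio(example_strata)
print(round(alpha, 4), round(mh_delta(alpha), 3))
```

An odds ratio of 1 (delta of 0) indicates no DIF on this item. The sketch omits the accompanying chi-square significance test and the polytomous extensions that the abstract mentions.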

  4. Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

    Science.gov (United States)

    Shou, Yiyun; Sellbom, Martin; Xu, Jing

    2018-05-01

    There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and the U.S. samples. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  5. Differential Item Functioning in While-Listening Performance Tests: The Case of the International English Language Testing System (IELTS) Listening Module

    Science.gov (United States)

    Aryadoust, Vahid

    2012-01-01

    This article investigates a version of the International English Language Testing System (IELTS) listening test for evidence of differential item functioning (DIF) based on gender, nationality, age, and degree of previous exposure to the test. Overall, the listening construct was found to be underrepresented, which is probably an important cause…

  6. Local context effects during emotional item directed forgetting in younger and older adults.

    Science.gov (United States)

    Gallant, Sara N; Dyson, Benjamin J; Yang, Lixia

    2017-09-01

    This paper explored the differential sensitivity young and older adults exhibit to the local context of items entering memory. We examined trial-to-trial performance during an item directed forgetting task for positive, negative, and neutral (or baseline) words each cued as either to-be-remembered (TBR) or to-be-forgotten (TBF). This allowed us to focus on how variations in emotional valence (independent of arousal) and instruction (TBR vs. TBF) of the previous item (trial n-1) impacted memory for the current item (trial n) during encoding. Different from research showing impairing effects of emotional arousal, both age groups showed a memorial boost for stimuli when preceded by items high in positive or negative valence relative to those preceded by neutral items. This advantage was particularly prominent for neutral trial n items that followed emotional items suggesting that, regardless of age, neutral memories may be strengthened by a local context that is high in valence. A trending age difference also emerged with older adults showing greater sensitivity when encoding instructions changed between trial n-1 and n. Results are discussed in light of age-related theories of cognitive and emotional processing, highlighting the need to consider the dynamic, moment-to-moment fluctuations of these systems.

  7. Compreensão da leitura: análise do funcionamento diferencial dos itens de um Teste de Cloze [Reading comprehension: differential item functioning analysis of a Cloze Test]

    Directory of Open Access Journals (Sweden)

    Katya Luciane Oliveira

    2012-01-01

    Full Text Available The objectives of the present study were to investigate the fit of a Cloze test to the Rasch model and to evaluate differential item functioning (DIF) in relation to gender. The sample was composed of 573 students from the 5th to 8th grades of state public elementary schools in the states of São Paulo and Minas Gerais. The Cloze test was administered collectively. The analysis of the instrument revealed a good fit to the Rasch model, and the items were answered according to the expected pattern, also showing good fit. Regarding DIF, only three items differentiated by gender. Based on the data, the answers given by boys and girls were balanced.

  8. Item Response Theory Analyses of the Cambridge Face Memory Test (CFMT)

    Science.gov (United States)

    Cho, Sun-Joo; Wilmer, Jeremy; Herzmann, Grit; McGugin, Rankin; Fiset, Daniel; Van Gulick, Ana E.; Ryan, Katie; Gauthier, Isabel

    2014-01-01

    We evaluated the psychometric properties of the Cambridge face memory test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bi-factor exploratory factor analysis (EFA). This EFA analysis revealed a general factor and three specific factors clustered by targets of CFMT. However, the three specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and two age groups (Age ≤ 20 versus Age > 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT. PMID:25642930

  9. A symptom profile of depression among Asian Americans: is there evidence for differential item functioning of depressive symptoms?

    Science.gov (United States)

    Kalibatseva, Z; Leong, F T L; Ham, E H

    2014-09-01

    Theoretical and clinical publications suggest the existence of cultural differences in the expression and experience of depression. Measurement non-equivalence remains a potential methodological explanation for the lower prevalence of depression among Asian Americans compared to European Americans. This study compared DSM-IV depressive symptoms among Asian Americans and European Americans using secondary data analysis of the Collaborative Psychiatric Epidemiology Surveys (CPES). The Composite International Diagnostic Interview (CIDI) was used for the assessment of depressive symptoms. Of the entire sample, 310 Asian Americans and 1974 European Americans reported depressive symptoms and were included in the analyses. Measurement variance was examined with an item response theory differential item functioning (IRT DIF) analysis. χ2 analyses indicated that, compared to Asian Americans, European American participants more frequently endorsed affective symptoms such as 'feeling depressed', 'feeling discouraged' and 'cried more often'. The IRT analysis detected DIF for four out of the 15 depression symptom items. At equal levels of depression, Asian Americans endorsed feeling worthless and appetite changes more easily than European Americans, and European Americans endorsed feeling nervous and crying more often than Asian Americans. Asian Americans did not seem to over-report somatic symptoms; however, European Americans seemed to report more affective symptoms than Asian Americans. The results suggest that there was measurement variance in a few of the depression items.

  10. A review of the effects on IRT item parameter estimates with a focus on misbehaving common items in test equating

    Directory of Open Access Journals (Sweden)

    Michalis P Michaelides

    2010-10-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of Item Response Theory. Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  11. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

    Science.gov (United States)

    Michaelides, Michalis P

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  12. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    Science.gov (United States)

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  13. The Dif Identification in Constructed Response Items Using Partial Credit Model

    Directory of Open Access Journals (Sweden)

    Heri Retnawati

    2017-10-01

    The study aimed to identify the presence, the type, and the significance of differential item functioning (DIF) in constructed-response items using the partial credit model (PCM). The data were the students' instruments and the students' responses to PISA-like test items completed by 386 ninth-grade students and 460 tenth-grade students, about 15 years old, in the Province of Yogyakarta Special Region in Indonesia. Item characteristics were analyzed under the PCM, with students categorized by class, using CONQUEST software. Using these item characteristics, the researcher drew category response function (CRF) graphs to identify whether the DIF was uniform or non-uniform. The significance of DIF was identified by comparing the discrepancy between the difficulty level parameters with the error in the CONQUEST output. Of the 18 items analyzed, 4 items showed no DIF, 5 items showed DIF that was not statistically significant, and 9 items showed significant DIF. The causes of DIF in these items are discussed.

  14. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    Science.gov (United States)

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  15. A Psychometric Evaluation of the DSM-IV Criteria for Antisocial Personality Disorder: Dimensionality, Local Reliability, and Differential Item Functioning Across Gender.

    Science.gov (United States)

    Paap, Muirne C S; Braeken, Johan; Pedersen, Geir; Urnes, Øyvind; Karterud, Sigmund; Wilberg, Theresa; Hummelen, Benjamin

    2017-12-01

    This study aims at evaluating the psychometric properties of the antisocial personality disorder (ASPD) criteria in a large sample of patients, most of whom had one or more personality disorders (PD). PD diagnoses were assessed by experienced clinicians using the Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, 4th edition, Axis II PDs. Analyses were performed within an item response theory framework. Results of the analyses indicated that ASPD is a unidimensional construct that can be measured reliably at the upper range of the latent trait scale. Differential item functioning across gender was restricted to two criteria and had little impact on the latent ASPD trait level. Patients fulfilling both the adult ASPD criteria and the conduct disorder criteria had similar latent trait distributions as patients fulfilling only the adult ASPD criteria. Overall, the ASPD items fit the purpose of a diagnostic instrument well, that is, distinguishing patients with moderate from those with high antisocial personality scores.

  16. Item analysis of ADAS-Cog: effect of baseline cognitive impairment in a clinical AD trial.

    Science.gov (United States)

    Sevigny, Jeffrey J; Peng, Yahong; Liu, Lian; Lines, Christopher R

    2010-03-01

    We explored the association of Alzheimer's disease (AD) Assessment Scale (ADAS-Cog) item scores with AD severity using cross-sectional and longitudinal data from the same study. Post hoc analyses were performed using placebo data from a 12-month trial of patients with mild-to-moderate AD (N = 281 randomized, N = 209 completed). Baseline distributions of ADAS-Cog item scores by Mini-Mental State Examination (MMSE) score and Clinical Dementia Rating (CDR) sum of boxes score (measures of dementia severity) were estimated using local and nonparametric regressions. Mixed-effect models were used to characterize ADAS-Cog item score changes over time by dementia severity (MMSE: mild = 21-26, moderate = 14-20; global CDR: mild = 0.5-1, moderate = 2). In the cross-sectional analysis of baseline ADAS-Cog item scores, orientation was the most sensitive item to differentiate patients across levels of cognitive impairment. Several items showed a ceiling effect, particularly in milder AD. In the longitudinal analysis of change scores over 12 months, orientation was the only item with noticeable decline (8%-10%) in mild AD. Most items showed modest declines (5%-20%) in moderate AD.

  17. Item Response Theory analysis of Fagerström Test for Cigarette Dependence.

    Science.gov (United States)

    Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl

    2018-02-01

    The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI via Item Response Theory. The study was a secondary analysis of data collected in 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Converging evidence for control of color-word Stroop interference at the item level.

    Science.gov (United States)

    Bugg, Julie M; Hutchison, Keith A

    2013-04-01

    Prior studies have shown that cognitive control is implemented at the list and context levels in the color-word Stroop task. At first blush, the finding that Stroop interference is reduced for mostly incongruent items as compared with mostly congruent items (i.e., the item-specific proportion congruence [ISPC] effect) appears to provide evidence for yet a third level of control, which modulates word reading at the item level. However, evidence to date favors the view that ISPC effects reflect the rapid prediction of high-contingency responses and not item-specific control. In Experiment 1, we first show that an ISPC effect is obtained when the relevant dimension (i.e., color) signals proportion congruency, a problematic pattern for theories based on differential response contingencies. In Experiment 2, we replicate and extend this pattern by showing that item-specific control settings transfer to new stimuli, ruling out alternative frequency-based accounts. In Experiment 3, we revert to the traditional design in which the irrelevant dimension (i.e., word) signals proportion congruency. Evidence for item-specific control, including transfer of the ISPC effect to new stimuli, is apparent when 4-item sets are employed but not when 2-item sets are employed. We attribute this pattern to the absence of high-contingency responses on incongruent trials in the 4-item set. These novel findings provide converging evidence for reactive control of color-word Stroop interference at the item level, reveal theoretically important factors that modulate reliance on item-specific control versus contingency learning, and suggest an update to the item-specific control account (Bugg, Jacoby, & Chanani, 2011).

  19. Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot

    Science.gov (United States)

    Magis, David; Facon, Bruno

    2013-01-01

    Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…
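
    The purification loop described above can be sketched for a delta-plot style analysis as follows. This is a minimal illustration, not the authors' procedure: it assumes the conventional delta transformation (4 times the normal deviate of the proportion incorrect, plus 13), a major-axis fit, and a fixed flagging threshold of 1.5 on perpendicular distances. The function names and the example data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def delta_scores(prop_correct):
    """Convert proportions correct to Angoff-style delta scores."""
    return [4.0 * NormalDist().inv_cdf(1.0 - p) + 13.0 for p in prop_correct]


def major_axis(x, y):
    """Slope and intercept of the major (principal) axis through the points."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("major axis is undefined when the covariance is zero")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def delta_plot_dif(delta_ref, delta_foc, threshold=1.5, purify=True, max_iter=20):
    """Return sorted indices of items flagged as DIF.

    With purify=True, the axis is refitted using only the items that are
    currently unflagged, and all items are re-tested until the flagged
    set stops changing (or max_iter is reached).
    """
    n = len(delta_ref)
    flagged = set()
    for _ in range(max_iter):
        keep = [i for i in range(n) if i not in flagged]
        if len(keep) < 2:
            break  # too few items left to fit an axis
        slope, intercept = major_axis(
            [delta_ref[i] for i in keep], [delta_foc[i] for i in keep]
        )
        scale = math.sqrt(slope ** 2 + 1)
        dist = [
            (slope * delta_ref[i] + intercept - delta_foc[i]) / scale
            for i in range(n)
        ]
        new_flagged = {i for i, d in enumerate(dist) if abs(d) > threshold}
        if not purify or new_flagged == flagged:
            flagged = new_flagged
            break
        flagged = new_flagged
    return sorted(flagged)


# Hypothetical delta scores: item 2 is much harder for the focal group.
ref = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
foc = [10.0, 11.0, 16.0, 13.0, 14.0, 15.0]
print(delta_plot_dif(ref, foc))
```

    In this toy data set the flagged set is the same with and without purification; the record's point is that purification does not always help, and whether it does depends on the data and the detection threshold.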

  20. Bayes Factor Covariance Testing in Item Response Models.

    Science.gov (United States)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-12-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted-inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Based on that, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.

  1. Analysis of Nonequivalent Assessments across Different Linguistic Groups Using a Mixed Methods Approach: Understanding the Causes of Differential Item Functioning by Cognitive Interviewing

    Science.gov (United States)

    Benítez, Isabel; Padilla, José-Luis

    2014-01-01

    Differential item functioning (DIF) can undermine the validity of cross-lingual comparisons. While a lot of efficient statistics for detecting DIF are available, few general findings have been found to explain DIF results. The objective of the article was to study DIF sources by using a mixed method design. The design involves a quantitative phase…

  2. Examination of the PROMIS upper extremity item bank.

    Science.gov (United States)

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  3. The emotional memory effect: differential processing or item distinctiveness?

    Science.gov (United States)

    Schmidt, Stephen R; Saari, Bonnie

    2007-12-01

    A color-naming task was followed by incidental free recall to investigate how emotional words affect attention and memory. We compared taboo, nonthreatening negative-affect, and neutral words across three experiments. As compared with neutral words, taboo words led to longer color-naming times and better memory in both within- and between-subjects designs. Color naming of negative-emotion nontaboo words was slower than color naming of neutral words only during block presentation and at relatively short interstimulus intervals (ISIs). The nontaboo emotion words were remembered better than neutral words following blocked and random presentation and at both long and short ISIs, but only in mixed-list designs. Our results support multifactor theories of the effects of emotion on attention and memory. As compared with neutral words, threatening stimuli received increased attention, poststimulus elaboration, and benefit from item distinctiveness, whereas nonthreatening emotional stimuli benefited only from increased item distinctiveness.

  4. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Martijn A H Oude Voshaar

    OBJECTIVE: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. METHODS: Data were collected from RA patients enrolled in the Dutch DREAM registry. An incomplete longitudinal anchored design was used where patients completed all 121 items of the item bank over the course of three waves of data collection. Item responses were fit to a generalized partial credit model adapted for longitudinal data and the item parameters were examined for differential item functioning (DIF) across country, age, and sex. RESULTS: In total, 690 patients participated in the study at time point 1 (T2, N = 489; T3, N = 311). The item bank could be successfully fitted to a generalized partial credit model, with the number of misfitting items falling within acceptable limits. Seven items demonstrated DIF for sex, while 5 items showed DIF for age in the Dutch RA sample. Twenty-five (20%) items were flagged for cross-cultural DIF compared to the US general population. However, the impact of observed DIF on total physical function estimates was negligible. DISCUSSION: The results of this study showed that the PROMIS PF item bank adequately fit a unidimensional IRT model which provides support for applications that require invariant estimates of physical function, such as computer adaptive testing and targeted short forms. More studies are needed to further investigate the cross-cultural applicability of the US-based PROMIS calibration and standardized metric.

  5. The differential item functioning and structural equivalence of a nonverbal cognitive ability test for five language groups

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2011-10-01

    Research purpose: The aim of the study was to determine the differential item functioning (DIF) and structural equivalence of a nonverbal cognitive ability test (the PiB/SpEEx Observance test [401]) for five South African language groups. Motivation for study: Cultural and language group sensitive tests can lead to unfair discrimination and are a contentious workplace issue in South Africa today. Misconceptions about psychometric testing in industry can cause tests to lose credibility if industries do not use a scientifically sound test-by-test evaluation approach. Research design, approach and method: The researcher used a quasi-experimental design and factor analytic and logistic regression techniques to meet the research aims. The study used a convenience sample drawn from industry and an educational institution. Main findings: The main findings of the study show structural equivalence of the test at a holistic level and nonsignificant DIF effect sizes for most of the comparisons that the researcher made. Practical/managerial implications: This research shows that the PIB/SpEEx Observance Test (401) is not completely language insensitive. One should see it rather as a language-reduced test when people from different language groups need testing. Contribution/value-add: The findings provide supporting evidence that nonverbal cognitive tests are plausible alternatives to verbal tests when one compares people from different language groups.

  6. Quality of life in infants and children with atopic dermatitis: Addressing issues of differential item functioning across countries in multinational clinical trials

    Directory of Open Access Journals (Sweden)

    Tennant Alan

    2007-07-01

    Background: A previous study had identified 45 items assessing the impact of atopic dermatitis (AD) on the whole family. From these it was intended to develop two separate scales, one assessing impact on carers and the other determining the effect on the child. Methods: The 45 items were included in three clinical trials designed to test the efficacy of a new topical treatment (pimecrolimus, Elidel cream 1%) in the treatment of AD in infants and children and in validation studies in the UK, US, Germany, France and the Netherlands. Rasch analyses were undertaken to determine whether an internationally valid, unidimensional scale could be developed that would inform on the direct impact of AD on the child. Results: Rasch analyses applied to the data from the trials indicated that the draft measure consisted of two scales, one assessing the QoL of the carer and the other (consisting of 12 items) measuring the impact of AD on the child. Three of the 12 potential items failed to fit the measurement model in Europe and five in the US. In addition, four items exhibiting differential item functioning (DIF) by country were identified. After removing the misfitting items and controlling for DIF it was possible to derive a scale, the Childhood Impact of Atopic Dermatitis (CIAD), with good item fit for each trial analysis. Analysis of the validation data from each of the different countries confirmed that the CIAD had adequate internal consistency, reproducibility and construct validity. The CIAD demonstrated the benefits of treatment with Elidel over placebo in the European trial. A similar (non-significant) trend was found for the US trials. Conclusion: The study represents a novel method of dealing with the problem of DIF associated with different cultures. Such problems are likely to arise in any multinational study involving patient-reported outcome measures, as items in the scales are likely to be valued differently in different cultures. However, where…

  7. The Effects of Item Format and Cognitive Domain on Students' Science Performance in TIMSS 2011

    Science.gov (United States)

    Liou, Pey-Yan; Bulut, Okan

    2017-12-01

    The purpose of this study was to examine eighth-grade students' science performance in terms of two test design components, item format, and cognitive domain. The portion of Taiwanese data came from the 2011 administration of the Trends in International Mathematics and Science Study (TIMSS), one of the major international large-scale assessments in science. The item difficulty analysis was initially applied to show the proportion of correct items. A regression-based cumulative link mixed modeling (CLMM) approach was further utilized to estimate the impact of item format, cognitive domain, and their interaction on the students' science scores. The results of the proportion-correct statistics showed that constructed-response items were more difficult than multiple-choice items, and that the reasoning cognitive domain items were more difficult compared to the items in the applying and knowing domains. In terms of the CLMM results, students tended to obtain higher scores when answering constructed-response items as well as items in the applying cognitive domain. When the two predictors and the interaction term were included together, the directions and magnitudes of the predictors on student science performance changed substantially. Plausible explanations for the complex nature of the effects of the two test-design predictors on student science performance are discussed. The results provide practical, empirical-based evidence for test developers, teachers, and stakeholders to be aware of the differential function of item format, cognitive domain, and their interaction in students' science performance.

  8. Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. Common person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.

  9. Concreteness effects in short-term memory: a test of the item-order hypothesis.

    Science.gov (United States)

    Roche, Jaclynn; Tolan, G Anne; Tehan, Gerald

    2011-12-01

    The following experiments explore word length and concreteness effects in short-term memory within an item-order processing framework. This framework asserts order memory is better for those items that are relatively easy to process at the item level. However, words that are difficult to process benefit at the item level for increased attention/resources being applied. The prediction of the model is that differential item and order processing can be detected in episodic tasks that differ in the degree to which item or order memory are required by the task. The item-order account has been applied to the word length effect such that there is a short word advantage in serial recall but a long word advantage in item recognition. The current experiment considered the possibility that concreteness effects might be explained within the same framework. In two experiments, word length (Experiment 1) and concreteness (Experiment 2) are examined using forward serial recall, backward serial recall, and item recognition. These results for word length replicate previous studies showing the dissociation in item and order tasks. The same was not true for the concreteness effect. In all three tasks concrete words were better remembered than abstract words. The concreteness effect cannot be explained in terms of an item-order trade off. PsycINFO Database Record (c) 2011 APA, all rights reserved.

  10. Cross-cultural differences in knee functional status outcomes in a polyglot society represented true disparities not biased by differential item functioning.

    Science.gov (United States)

    Deutscher, Daniel; Hart, Dennis L; Crane, Paul K; Dickstein, Ruth

    2010-12-01

    Comparative effectiveness research across cultures requires unbiased measures that accurately detect clinical differences between patient groups. The purpose of this study was to assess the presence and impact of differential item functioning (DIF) in knee functional status (FS) items administered using computerized adaptive testing (CAT) as a possible cause for observed differences in outcomes between 2 cultural patient groups in a polyglot society. This study was a secondary analysis of prospectively collected data. We evaluated data from 9,134 patients with knee impairments from outpatient physical therapy clinics in Israel. Items were analyzed for DIF related to sex, age, symptom acuity, surgical history, exercise history, and language used to complete the functional survey (Hebrew versus Russian). Several items exhibited DIF, but unadjusted FS estimates and FS estimates that accounted for DIF were essentially equal (intraclass correlation coefficient [2,1]>.999). No individual patient had a difference between unadjusted and adjusted FS estimates as large as the median standard error of the unadjusted estimates. Differences between groups defined by any of the covariates considered were essentially unchanged when using adjusted instead of unadjusted FS estimates. The greatest group-level impact was <0.3% of 1 standard deviation of the unadjusted FS estimates. Complete data where patients answered all items in the scale would have been preferred for DIF analysis, but only CAT data were available. Differences in FS outcomes between groups of patients with knee impairments who answered the knee CAT in Hebrew or Russian in Israel most likely reflected true differences that may reflect societal disparities in this health outcome.

  11. Using Explanatory Item Response Models to Evaluate Complex Scientific Tasks Designed for the Next Generation Science Standards

    Science.gov (United States)

    Chiu, Tina

    This dissertation includes three studies that analyze a new set of assessment tasks developed by the Learning Progressions in Middle School Science (LPS) Project. These assessment tasks were designed to measure science content knowledge on the structure of matter domain and scientific argumentation, while following the goals from the Next Generation Science Standards (NGSS). The three studies focus on the evidence available for the success of this design and its implementation, generally labelled as "validity" evidence. I use explanatory item response models (EIRMs) as the overarching framework to investigate these assessment tasks. These models can be useful when gathering validity evidence for assessments as they can help explain student learning and group differences. In the first study, I explore the dimensionality of the LPS assessment by comparing the fit of unidimensional, between-item multidimensional, and Rasch testlet models to see which is most appropriate for this data. By applying multidimensional item response models, multiple relationships can be investigated, and in turn, allow for a more substantive look into the assessment tasks. The second study focuses on person predictors through latent regression and differential item functioning (DIF) models. Latent regression models show the influence of certain person characteristics on item responses, while DIF models test whether one group is differentially affected by specific assessment items, after conditioning on latent ability. Finally, the last study applies the linear logistic test model (LLTM) to investigate whether item features can help explain differences in item difficulties.
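
    The linear logistic test model mentioned at the end of this abstract can be sketched as follows, assuming its usual textbook form: each item difficulty is a weighted sum of item-feature indicators, and that difficulty enters a Rasch response function. The feature matrix row, the weights, and the function names are hypothetical illustrations, not values from the dissertation.

```python
import math


def lltm_difficulty(q_row, eta):
    """Item difficulty as a weighted sum of item features (LLTM).

    q_row: feature indicators for one item (hypothetical Q-matrix row).
    eta:   one difficulty contribution per feature (hypothetical weights).
    """
    return sum(q * e for q, e in zip(q_row, eta))


def p_correct(theta, difficulty):
    """Rasch probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Made-up example: an item that carries features 1 and 3 out of three.
eta = [0.5, 1.0, -0.2]
b = lltm_difficulty([1, 0, 1], eta)
print(b, p_correct(0.0, b))
```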

  12. Examining Differential Math Performance by Gender and Opportunity to Learn

    Science.gov (United States)

    Albano, Anthony D.; Rodriguez, Michael C.

    2013-01-01

    Although a substantial amount of research has been conducted on differential item functioning in testing, studies have focused on detecting differential item functioning rather than on explaining how or why it may occur. Some recent work has explored sources of differential functioning using explanatory and multilevel item response models. This…

  13. Use of differential item functioning (DIF) analysis for bias analysis in test construction

    Directory of Open Access Journals (Sweden)

    Marié De Beer

    2004-10-01

    Abstract: Where differential item functioning (DIF) procedures for item analysis based on item response theory (IRT) are used during test construction, it is possible to plot item characteristic curves for the same item for different subgroups. These curves indicate how each item functions for the different subgroups at different ability levels. DIF is indicated by the area between the curves. DIF was used in the construction of the 'Learning Potential Computerised Adaptive Test (LPCAT)' to identify items that showed bias with respect to gender, culture, language or level of education. Items that exceeded a predetermined level of DIF were omitted from the final item bank, regardless of which subgroup was advantaged or disadvantaged. The process and results of the DIF analysis are discussed.
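
    The area-between-curves idea in this abstract can be illustrated with a small sketch. It assumes logistic item characteristic curves with a shared discrimination parameter and integrates the absolute gap numerically with the trapezoid rule over a finite ability range. The function names, the parameter values, and the integration limits are hypothetical; the abstract does not state which IRT model or area measure was used for the LPCAT.

```python
import math


def item_curve(theta, b, a=1.0):
    """Logistic item characteristic curve with difficulty b, discrimination a."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def area_between(b_ref, b_focal, a=1.0, lo=-10.0, hi=10.0, steps=4000):
    """Unsigned area between two groups' curves for one item (trapezoid rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        gap = abs(item_curve(theta, b_ref, a) - item_curve(theta, b_focal, a))
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * gap
    return total * h


# Made-up difficulties for a reference group and a focal group.
print(area_between(0.0, 0.5))
```

    When both groups share the same discrimination, the area over the whole ability axis equals the absolute difference in difficulties, so the value printed here should be close to 0.5. An item would then be flagged if its area exceeded a threshold chosen in advance.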

  14. Beneficial effects of semantic memory support on older adults' episodic memory: Differential patterns of support of item and associative information.

    Science.gov (United States)

    Mohanty, Praggyan Pam; Naveh-Benjamin, Moshe; Ratneshwar, Srinivasan

    2016-02-01

    The effects of two types of semantic memory support (meaningfulness of an item and relatedness between items) in mitigating age-related deficits in item and associative memory are examined in a marketing context. In Experiment 1, participants studied less (vs. more) meaningful brand logo graphics (pictures) paired with meaningful brand names (words) and later were assessed by item (old/new) and associative (intact/recombined) memory recognition tests. Results showed that meaningfulness of items eliminated age deficits in item memory, while equivalently boosting associative memory for older and younger adults. Experiment 2, in which related and unrelated brand logo graphics and brand name pairs served as stimuli, revealed that relatedness between items eliminated age deficits in associative memory, while improving item memory to the same degree in older and younger adults. Experiment 2 also provided evidence for a probable boundary condition that could reconcile seemingly contradictory extant results. Overall, these experiments provided evidence that although the two types of semantic memory support can improve both item and associative memory in older and younger adults, older adults' memory deficits can be eliminated when the type of support provided is compatible with the type of information required to perform well on the test. (c) 2016 APA, all rights reserved.

  15. Validation of a mobility item bank for older patients in primary care.

    Science.gov (United States)

    Cabrero-García, Julio; Ramos-Pichardo, Juan Diego; Muñoz-Mendoza, Carmen Luz; Cabañero-Martínez, María José; González-Llopis, Lorena; Reig-Ferrer, Abilio

    2012-12-05

    To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position; carrying, lifting and pushing; walking; and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

  16. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  17. Carcinoma Showing Thymus-Like Differentiation (CASTLE) of Thyroid: A Case Report and Literature Review

    Directory of Open Access Journals (Sweden)

    Leong-Perng Chan

    2008-11-01

    Carcinoma showing thymus-like differentiation (CASTLE) is a rare malignant neoplasm that occurs in the thyroid gland, or head and neck. This tumor arises from either ectopic thymus tissue or remnants of branchial pouches, which retain the potential to differentiate along the thymus line. Clinical presentation and imaging can be consistent with a malignant lesion such as thyroid cancer or thymic carcinoma. Immunohistochemical staining with CD5 can differentiate CASTLE from other malignant thyroid neoplasms. A 54-year-old male had initially presented with a painless, left neck mass for 3 months. He underwent left thyroid lobectomy via a median sternotomy approach. Carcinoma showing thymus-like differentiation was the final histopathologic diagnosis. After 36 months of follow-up, no evidence of recurrence was observed. A median sternotomy is an excellent approach for CASTLE with anterior mediastinum involvement. Complete resection is important to improve the long-term survival rate and the locoregional recurrence rate.

  18. Differential item functioning of pathological gambling criteria: an examination of gender, race/ethnicity, and age.

    Science.gov (United States)

    Sacco, Paul; Torres, Luis R; Cunningham-Williams, Renee M; Woods, Carol; Unick, G Jay

    2011-06-01

    This study tested for the presence of differential item functioning (DIF) in DSM-IV Pathological Gambling Disorder (PGD) criteria based on gender, race/ethnicity and age. Using a nationally representative sample of adults from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), indicating current gambling (n = 10,899), Multiple Indicator-Multiple Cause (MIMIC) models tested for DIF, controlling for income, education, and marital status. Compared to the reference groups (i.e., Male, Caucasian, and ages 25-59 years), women (OR = 0.62; P gambling to escape (Criterion 5) (OR = 2.22; P < .001) but young adults (OR = 0.62; P < .05) were less likely to endorse it. African Americans (OR = 2.50; P < .001) and Hispanics were more likely to endorse trying to cut back (Criterion 3) (OR = 2.01; P < .01). African Americans were more likely to endorse the suffering losses (OR = 2.27; P < .01) criterion. Young adults were more likely to endorse chasing losses (Criterion 9) (OR = 1.81; P < .01) while older adults were less likely to endorse this criterion (OR = 0.76; P < .05). Further research is needed to identify factors contributing to DIF, address criteria level bias, and examine differential test functioning.

  19. The DIF Identification in Constructed Response Items Using Partial Credit Model

    OpenAIRE

    Heri Retnawati

    2017-01-01

    The study aimed to identify the load, the type and the significance of differential item functioning (DIF) in constructed response items using the partial credit model (PCM). The data in the study were the students' instruments and the students' responses to PISA-like test items completed by 386 ninth grade students and 460 tenth grade students, all about 15 years old, in the Province of Yogyakarta Special Region in Indonesia. The analysis toward the item characteris...

  20. The comparability of English, French and Dutch scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): an assessment of differential item functioning in patients with systemic sclerosis.

    Directory of Open Access Journals (Sweden)

    Linda Kwakkenbos

    The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics.

  1. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  2. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  3. Fitting a Mixture Rasch Model to English as a Foreign Language Listening Tests: The Role of Cognitive and Background Variables in Explaining Latent Differential Item Functioning

    Science.gov (United States)

    Aryadoust, Vahid

    2015-01-01

    The present study uses a mixture Rasch model to examine latent differential item functioning in English as a foreign language listening tests. Participants (n = 250) took a listening and lexico-grammatical test and completed the metacognitive awareness listening questionnaire comprising problem solving (PS), planning and evaluation (PE), mental…

  4. A multi-level differential item functioning analysis of trends in international mathematics and science study: Potential sources of gender and minority difference among U.S. eighth graders' science achievement

    Science.gov (United States)

    Qian, Xiaoyu

    Science is an area where a large achievement gap has been observed between White and minority, and between male and female students. The science minority gap has continued as indicated by the National Assessment of Educational Progress and the Trends in International Mathematics and Science Studies (TIMSS). TIMSS also shows a gender gap favoring males emerging at the eighth grade. Both gaps continue to be wider in the number of doctoral degrees and full professorships awarded (NSF, 2008). The current study investigated both minority and gender achievement gaps in science utilizing a multi-level differential item functioning (DIF) methodology (Kamata, 2001) within a fully Bayesian framework. All dichotomously coded items from the TIMSS 2007 science assessment at eighth grade were analyzed. Both gender DIF and minority DIF were studied. Multi-level models were employed to identify DIF items and sources of DIF at both student and teacher levels. The study found that several student variables were potential sources of achievement gaps. It was also found that gender DIF favoring male students was more noticeable in the content areas of physics and earth science than biology and chemistry. In terms of item type, more of these gender DIF items were multiple choice than constructed response. Female students also performed less well on items requiring visual-spatial ability. Minority students performed significantly worse on physics and earth science items as well. A higher percentage of minority DIF items in earth science and biology were constructed response than multiple choice items, indicating that literacy may be the cause of minority DIF. Three-level model results suggested that some teacher variables may be the cause of DIF variations from teacher to teacher. It is essential for both middle school science teachers and science educators to find instructional methods that work more effectively to improve science achievement of both female and minority students.

  5. 5 CFR 591.212 - How does OPM select survey items?

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 1 2010-01-01 2010-01-01 false How does OPM select survey items? 591.212 Section 591.212 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS ALLOWANCES AND DIFFERENTIALS Cost-of-Living Allowance and Post Differential-Nonforeign Areas Cost-Of-Living...

  6. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Science.gov (United States)

    Lix, Lisa M; Wu, Xiuyun; Hopman, Wilma; Mayo, Nancy; Sajobi, Tolulope T; Liu, Juxin; Prior, Jerilynn C; Papaioannou, Alexandra; Josse, Robert G; Towheed, Tanveer E; Davison, K Shawn; Sawatzky, Richard

    2016-01-01

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  7. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Directory of Open Access Journals (Sweden)

    Lisa M Lix

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  8. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  9. Gender differences in National Assessment of Educational Progress science items: What does "I don't know" really mean?

    Science.gov (United States)

    Linn, Marcia C.; de Benedictis, Tina; Delucchi, Kevin; Harris, Abigail; Stage, Elizabeth

    The National Assessment of Educational Progress Science Assessment has consistently revealed small gender differences on science content items but not on science inquiry items. This assessment differs from others in that respondents can choose "I don't know" rather than guessing. This paper examines explanations for the gender differences including (a) differential prior instruction, (b) differential response to uncertainty and use of the "I don't know" response, (c) differential response to figurally presented items, and (d) different attitudes towards science. Of these possible explanations, the first two received support. Females are more likely to use the "I don't know" response, especially for items with physical science content or masculine themes such as football. To ameliorate this situation we need more effective science instruction and more gender-neutral assessment items.

  10. Evaluation of the Multiple Sclerosis Walking Scale-12 (MSWS-12) in a Dutch sample: Application of item response theory.

    Science.gov (United States)

    Mokkink, Lidwine Brigitta; Galindo-Garre, Francisca; Uitdehaag, Bernard Mj

    2016-12-01

    The Multiple Sclerosis Walking Scale-12 (MSWS-12) measures walking ability from the patients' perspective. We examined the quality of the MSWS-12 using an item response theory model, the graded response model (GRM). A total of 625 unique Dutch multiple sclerosis (MS) patients were included. After testing for unidimensionality, monotonicity, and absence of local dependence, a GRM was fit and item characteristics were assessed. Differential item functioning (DIF) for the variables gender, age, duration of MS, type of MS and severity of MS, reliability, total test information, and standard error of the trait level (θ) were investigated. Confirmatory factor analysis showed a unidimensional structure of the 12 items of the scale, explaining 88% of the variance. Item 2 did not fit the GRM. Reliability was 0.93. Items 8 and 9 (of the 11- and 12-item versions, respectively) showed DIF on the variable severity, based on the Expanded Disability Status Scale (EDSS). However, the EDSS is strongly related to the content of both items. Our results confirm the good quality of the MSWS-12. The trait level (θ) scores and item parameters of both the 12- and 11-item versions were highly comparable, although we do not suggest changing the content of the MSWS-12. © The Author(s), 2016.
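
    The graded response model named in this abstract can be sketched as follows, assuming its usual formulation: the probability of responding in category k or higher is a logistic function of the trait level, and category probabilities are differences of adjacent cumulative probabilities. The function name and all parameter values are hypothetical and are not estimates from the MSWS-12 data.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta:      trait level.
    a:          item discrimination (hypothetical value in the example).
    thresholds: increasing category thresholds b_1 < b_2 < ... (hypothetical).
    """
    # Cumulative probabilities P(X >= k), padded with 1 at the bottom, 0 at the top.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the drop between adjacent cumulative values.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Made-up item with four response categories (three thresholds).
probs = grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])
print(probs, sum(probs))
```

    The thresholds must be in increasing order for every category probability to be non-negative.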

  11. Brief Report: Checklist for Autism Spectrum Disorder--Most Discriminating Items for Diagnosing Autism

    Science.gov (United States)

    Mayes, Susan D.

    2018-01-01

    The smallest subset of items from the 30-item Checklist for Autism Spectrum Disorder (CASD) that differentiated 607 referred children (3-17 years) with and without autism with 100% accuracy was identified. This 6-item subset (CASD-Short Form) was cross-validated on an independent sample of 397 referred children (1-18 years) with and without autism…

  12. The Comparability of English, French and Dutch Scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): An Assessment of Differential Item Functioning in Patients with Systemic Sclerosis

    Science.gov (United States)

    Kwakkenbos, Linda; Willems, Linda M.; Baron, Murray; Hudson, Marie; Cella, David; van den Ende, Cornelia H. M.; Thombs, Brett D.

    2014-01-01

    Objective: The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. Methods: The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. Results: A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. Conclusions: There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics. PMID:24638101

  13. Factor Structure and Reliability of Test Items for Saudi Teacher Licence Assessment

    Science.gov (United States)

    Alsadaawi, Abdullah Saleh

    2017-01-01

    The Saudi National Assessment Centre administers the Computer Science Teacher Test for teacher certification. The aim of this study is to explore gender differences in candidates' scores, and investigate dimensionality, reliability, and differential item functioning using confirmatory factor analysis and item response theory. The confirmatory…

  14. The effects of value on context-item associative memory in younger and older adults.

    Science.gov (United States)

    Hennessee, Joseph P; Knowlton, Barbara J; Castel, Alan D

    2018-02-01

    Valuable items are often remembered better than items that are less valuable by both older and younger adults, but older adults typically show deficits in binding. Here, we examine whether value affects the quality of recognition memory and the binding of incidental details to valuable items. In Experiment 1, participants learned English words each associated with a point-value they earned for correct recognition with the goal of maximizing their score. In Experiment 2, value was manipulated by presenting items that were either congruent or incongruent with an imagined state of physiological need (e.g., hunger). In Experiment 1, point-value was associated with enhanced recollection in both age groups. Memory for the color associated with the word was in fact reduced for high-value recollected items compared with low-value recollected items, suggesting value selectively enhances binding of task-relevant details. In Experiment 2, memory for learned images was enhanced by value in both age groups. However, value differentially enhanced binding of an imagined context to the item in younger and older adults, with a strong trend for increased binding in younger adults only. These findings suggest that value enhances episodic encoding in both older and younger adults but that binding of associated details may be reduced for valuable items compared to less valuable items, particularly in older adults. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  15. Thyroid-specific questions on work ability showed known-groups validity among Danes with thyroid diseases.

    Science.gov (United States)

    Nexo, Mette Andersen; Watt, Torquil; Bonnema, Steen Joop; Hegedüs, Laszlo; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

    2015-07-01

    We aimed to identify the best approach to work ability assessment in patients with thyroid disease by evaluating the factor structure, measurement equivalence, known-groups validity, and predictive validity of a broad set of work ability items. Based on the literature and interviews with thyroid patients, 24 work ability items were selected from previous questionnaires, revised, or developed anew. Items were tested among 632 patients with thyroid disease (non-toxic goiter, toxic nodular goiter, Graves' disease (with or without orbitopathy), autoimmune hypothyroidism, and other thyroid diseases), 391 of which had participated in a study 5 years previously. Responses to select items were compared to general population data. We used confirmatory factor analyses for categorical data, logistic regression analyses and tests of differential item function, and head-to-head comparisons of relative validity in distinguishing known groups. Although all work ability items loaded on a common factor, the optimal factor solution included five factors: role physical, role emotional, thyroid-specific limitations, work limitations (without disease attribution), and work performance. The scale on thyroid-specific limitations showed the most power in distinguishing clinical groups and time since diagnosis. A global single item proved useful for comparisons with the general population, and a thyroid-specific item predicted labor market exclusion within the next 5 years (OR 5.0, 95 % CI 2.7-9.1). Items on work limitations with attribution to thyroid disease were most effective in detecting impact on work ability and showed good predictive validity. Generic work ability items remain useful for general population comparisons.

  16. The processing of inter-item relations as a moderating factor of retrieval-induced forgetting

    OpenAIRE

    Tempel, Tobias; Wippich, Werner

    2012-01-01

    We investigated influences of item generation and emotional valence on retrieval-induced forgetting. Drawing on postulates of the three-factor theory of generation effects, generation tasks differentially affecting the processing of inter-item relations were applied. Whereas retrieval-induced forgetting of freely generated items was moderated by the emotional valence as well as retrieval-induced forgetting of read items, even though in the reverse direction (Experiment 1), fragment completion...

  17. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Science.gov (United States)

    Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.
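    As an illustration of the graded response model (GRM) named in this abstract, the sketch below computes category probabilities for a single polytomous item. It is a minimal sketch of the standard GRM formulation, not the authors' calibration code; the function name `grm_category_probs` and the parameter values (discrimination `a`, thresholds, `theta`) are hypothetical and are not taken from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- latent trait level of the respondent
    a          -- item discrimination (hypothetical value in the example)
    thresholds -- increasing threshold parameters b_1 < ... < b_{K-1}
    """
    # Cumulative probabilities of responding in category k or higher;
    # by convention P(X >= 0) = 1 and P(X >= K) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-category item (four thresholds), respondent at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
```

    The thresholds must be in increasing order for every category probability to be non-negative; the probabilities always sum to one because the differences telescope.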

  18. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.

  19. Exploring differential item functioning (DIF) with the Rasch model: a comparison of gender differences on eighth grade science items in the United States and Spain.

    Science.gov (United States)

    Babiar, Tasha Calvert

    2011-01-01

    Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth item-level analysis across two countries: Spain and the United States. This study investigated eighth-grade gender differences on science items across the two countries. A secondary purpose of the study was to explore the nature of gender differences using the many-faceted Rasch Model as a way to estimate gender DIF. A secondary analysis of data from the Third International Mathematics and Science Study (TIMSS) was used to address three questions: 1) Does gender DIF in science achievement exist? 2) Is there a relationship between gender DIF and characteristics of the science items? 3) Do the relationships between item characteristics and gender DIF in science items replicate across countries? Participants included 7,087 eighth-grade students from the United States and 3,855 students from Spain who participated in TIMSS. The Facets program (Linacre and Wright, 1992) was used to estimate gender DIF. The results of the analysis indicate that the content of the item seemed to be related to gender DIF. The analysis also suggests that there is a relationship between gender DIF and item format. No pattern of gender DIF related to cognitive demand was found. The general pattern of gender DIF was similar across the two countries used in the analysis. The strength of item-level analysis as opposed to group mean difference analysis is that gender differences can be detected at the item level, even when no mean differences can be detected at the group level.

  20. The MIMIC Model as a Tool for Differential Bundle Functioning Detection

    Science.gov (United States)

    Finch, W. Holmes

    2012-01-01

    Increasingly, researchers interested in identifying potentially biased test items are encouraged to use a confirmatory, rather than exploratory, approach. One such method for confirmatory testing is rooted in differential bundle functioning (DBF), where hypotheses regarding potential differential item functioning (DIF) for sets of items (bundles)…

  1. Differential gene expression of two extreme honey bee (Apis mellifera) colonies showing varroa tolerance and susceptibility.

    Science.gov (United States)

    Jiang, S; Robertson, T; Mostajeran, M; Robertson, A J; Qiu, X

    2016-06-01

    Varroa destructor, an ectoparasitic mite of honey bees (Apis mellifera), is the most serious pest threatening the apiculture industry. In our honey bee breeding programme, two honey bee colonies showing extreme phenotypes for varroa tolerance/resistance (S88) and susceptibility (G4) were identified by natural selection from a large gene pool over a 6-year period. To investigate potential defence mechanisms for honey bee tolerance to varroa infestation, we employed DNA microarray and real time quantitative (PCR) analyses to identify differentially expressed genes in the tolerant and susceptible colonies at pupa and adult stages. Our results showed that more differentially expressed genes were identified in the tolerant bees than in bees from the susceptible colony, indicating that the tolerant colony showed an increased genetic capacity to respond to varroa mite infestation. In both colonies, there were more differentially expressed genes identified at the pupa stage than at the adult stage, indicating that pupa bees are more responsive to varroa infestation than adult bees. Genes showing differential expression in the colony phenotypes were categorized into several groups based on their molecular functions, such as olfactory signalling, detoxification processes, exoskeleton formation, protein degradation and long-chain fatty acid metabolism, suggesting that these biological processes play roles in conferring varroa tolerance to naturally selected colonies. Identification of differentially expressed genes between the two colony phenotypes provides potential molecular markers for selecting and breeding varroa-tolerant honey bees. © 2016 The Royal Entomological Society.

  2. Item Response Theory Applied to Factors Affecting the Patient Journey Towards Hearing Rehabilitation

    Science.gov (United States)

    Chenault, Michelene; Berger, Martijn; Kremer, Bernd; Anteunis, Lucien

    2016-01-01

    To develop a tool for use in hearing screening and to evaluate the patient journey towards hearing rehabilitation, responses to the hearing aid rehabilitation questionnaire scales aid stigma, pressure, and aid unwanted addressing respectively hearing aid stigma, experienced pressure from others; perceived hearing aid benefit were evaluated with item response theory. The sample was comprised of 212 persons aged 55 years or more; 63 were hearing aid users, 64 with and 85 persons without hearing impairment according to guidelines for hearing aid reimbursement in the Netherlands. Bias was investigated relative to hearing aid use and hearing impairment within the differential test functioning framework. Items compromising model fit or demonstrating differential item functioning were dropped. The aid stigma scale was reduced from 6 to 4, the pressure scale from 7 to 4, and the aid unwanted scale from 5 to 4 items. This procedure resulted in bias-free scales ready for screening purposes and application to further understand the help-seeking process of the hearing impaired. PMID:28028428

  3. Calibration of the Dutch-Flemish PROMIS Pain Behavior item bank in patients with chronic pain.

    Science.gov (United States)

    Crins, M H P; Roorda, L D; Smits, N; de Vet, H C W; Westhovens, R; Cella, D; Cook, K F; Revicki, D; van Leeuwen, J; Boers, M; Dekker, J; Terwee, C B

    2016-02-01

    The aims of the current study were to calibrate the item parameters of the Dutch-Flemish PROMIS Pain Behavior item bank using a sample of Dutch patients with chronic pain and to evaluate cross-cultural validity between the Dutch-Flemish and the US PROMIS Pain Behavior item banks. Furthermore, reliability and construct validity of the Dutch-Flemish PROMIS Pain Behavior item bank were evaluated. The 39 items in the bank were completed by 1042 Dutch patients with chronic pain. To evaluate unidimensionality, a one-factor confirmatory factor analysis (CFA) was performed. A graded response model (GRM) was used to calibrate the items. To evaluate cross-cultural validity, Differential item functioning (DIF) for language (Dutch vs. English) was evaluated. Reliability of the item bank was also examined and construct validity was studied using several legacy instruments, e.g. the Roland Morris Disability Questionnaire. CFA supported the unidimensionality of the Dutch-Flemish PROMIS Pain Behavior item bank (CFI = 0.960, TLI = 0.958), the data also fit the GRM, and demonstrated good coverage across the pain behavior construct (threshold parameters range: -3.42 to 3.54). Analysis showed good cross-cultural validity (only six DIF items), reliability (Cronbach's α = 0.95) and construct validity (all correlations ≥0.53). The Dutch-Flemish PROMIS Pain Behavior item bank was found to have good cross-cultural validity, reliability and construct validity. The development of the Dutch-Flemish PROMIS Pain Behavior item bank will serve as the basis for Dutch-Flemish PROMIS short forms and computer adaptive testing (CAT). © 2015 European Pain Federation - EFIC®

  4. Gender Differences in Figural Matrices: The Moderating Role of Item Design Features

    Science.gov (United States)

    Arendasy, Martin E.; Sommer, Markus

    2012-01-01

    There is a heated debate on whether observed gender differences in some figural matrices in adults can be attributed to gender differences in inductive reasoning/G[subscript f] or differential item functioning and/or test bias. Based on previous studies we hypothesized that three specific item design features moderate the effect size of the gender…

  5. A signal detection-item response theory model for evaluating neuropsychological measures.

    Science.gov (United States)

    Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

    2018-02-05

    Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory-which permits the modeling of item difficulty and examinee ability-and from signal detection theory-which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the

  6. Dementias show differential physiological responses to salient sounds.

    Science.gov (United States)

    Fletcher, Phillip D; Nicholas, Jennifer M; Shakespeare, Timothy J; Downey, Laura E; Golden, Hannah L; Agustus, Jennifer L; Clark, Camilla N; Mummery, Catherine J; Schott, Jonathan M; Crutch, Sebastian J; Warren, Jason D

    2015-01-01

    Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching ("looming") or less salient withdrawing sounds. Pupil dilatation responses and behavioral rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n = 10; behavioral variant frontotemporal dementia, n = 16, progressive nonfluent aphasia, n = 12; amnestic Alzheimer's disease, n = 10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioral response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer's disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases.

  7. The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.

    Science.gov (United States)

    Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D

    2016-12-01

    The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87 % of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5, p …). Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12 .

  8. Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.

    Science.gov (United States)

    Eichenbaum, Alexander E; Marcus, David K; French, Brian F

    2017-06-01

    This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.

  9. Dementias show differential physiological responses to salient sounds

    Directory of Open Access Journals (Sweden)

    Phillip David Fletcher

    2015-03-01

    Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching (‘looming’) or less salient withdrawing sounds. Pupil dilatation responses and behavioural rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n=10; behavioural variant frontotemporal dementia, n=16, progressive non-fluent aphasia, n=12; amnestic Alzheimer’s disease, n=10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioural response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer’s disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases.

  10. Dementias show differential physiological responses to salient sounds

    Science.gov (United States)

    Fletcher, Phillip D.; Nicholas, Jennifer M.; Shakespeare, Timothy J.; Downey, Laura E.; Golden, Hannah L.; Agustus, Jennifer L.; Clark, Camilla N.; Mummery, Catherine J.; Schott, Jonathan M.; Crutch, Sebastian J.; Warren, Jason D.

    2015-01-01

    Abnormal responsiveness to salient sensory signals is often a prominent feature of dementia diseases, particularly the frontotemporal lobar degenerations, but has been little studied. Here we assessed processing of one important class of salient signals, looming sounds, in canonical dementia syndromes. We manipulated tones using intensity cues to create percepts of salient approaching (“looming”) or less salient withdrawing sounds. Pupil dilatation responses and behavioral rating responses to these stimuli were compared in patients fulfilling consensus criteria for dementia syndromes (semantic dementia, n = 10; behavioral variant frontotemporal dementia, n = 16, progressive nonfluent aphasia, n = 12; amnestic Alzheimer's disease, n = 10) and a cohort of 26 healthy age-matched individuals. Approaching sounds were rated as more salient than withdrawing sounds by healthy older individuals but this behavioral response to salience did not differentiate healthy individuals from patients with dementia syndromes. Pupil responses to approaching sounds were greater than responses to withdrawing sounds in healthy older individuals and in patients with semantic dementia: this differential pupil response was reduced in patients with progressive nonfluent aphasia and Alzheimer's disease relative both to the healthy control and semantic dementia groups, and did not correlate with nonverbal auditory semantic function. Autonomic responses to auditory salience are differentially affected by dementias and may constitute a novel biomarker of these diseases. PMID:25859194

  11. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find item validity and the item discrimination index (D) calculated separately, each with its own formula. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between the testee’s score on a particular item and the testee’s score on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts appear to overlap, since both reflect how well the test measures the examinees’ ability. In this paper, example results of data processing for item validity and the item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses that include test analyses.
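    The two quantities contrasted in this abstract can be computed side by side. The sketch below is a minimal illustration, assuming the common upper-group/lower-group definition of D (top and bottom 27% of examinees by total score) and the item-total Pearson correlation as the "item validity" statistic; the abstract does not give the formulas the authors used, and the function names and data are hypothetical.

```python
import math


def discrimination_index(item, total, frac=0.27):
    """D = proportion correct in the upper group minus the lower group."""
    n = len(total)
    k = max(1, round(n * frac))           # size of each extreme group
    order = sorted(range(n), key=lambda i: total[i])
    lower, upper = order[:k], order[-k:]  # lowest and highest total scores
    p_upper = sum(item[i] for i in upper) / k
    p_lower = sum(item[i] for i in lower) / k
    return p_upper - p_lower


def item_total_correlation(item, total):
    """Pearson correlation between a 0/1 item score and the total score."""
    n = len(item)
    mx, my = sum(item) / n, sum(total) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(item, total))
    sxx = sum((x - mx) ** 2 for x in item)
    syy = sum((y - my) ** 2 for y in total)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: ten examinees, one dichotomous item, total test scores.
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
total = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

d_index = discrimination_index(item, total)
r_item_total = item_total_correlation(item, total)
```

    On this toy data both statistics are positive but not equal, which is consistent with the abstract's point that the two indices are related without being numerically identical.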

  12. Using response-time constraints in item selection to control for differential speededness in computerized adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.; Scrams, David J.; Schnipke, Deborah L.

    2003-01-01

    This paper proposes an item selection algorithm that can be used to neutralize the effect of time limits in computer adaptive testing. The method is based on a statistical model for the response-time distributions of the test takers on the items in the pool that is updated each time a new item has

  13. Análise do funcionamento diferencial dos itens do Exame Nacional do Estudante (ENADE) de psicologia de 2006 / Differential item functioning of the national student exam for psychology (ENADE) 2006

    Directory of Open Access Journals (Sweden)

    Ricardo Primi

    2010-12-01

    Part of the National Assessment of Institutions of Higher Education considers student performance through ENADE. In this article we performed a differential item functioning analysis of the psychology ENADE administered in 2006, trying to detect items with differential functioning (DIF), that is, items with problems in measurement equivalence in the assessment of freshman and senior students and of students from public and private institutions. We analyzed a sample of 26,613 freshmen and seniors representative of all the courses in the country. We used Rasch analysis and logistic regression to detect DIF. Eleven of the 30 items composing the test showed DIF. Two types of DIF were observed, one occurring in less discriminating items and the other in more discriminating items. The most relevant subgroup of items tends to favor students from public institutions. We also discuss the issue of a high discrimination parameter being an indicator of DIF.

  14. Psychometric Consequences of Subpopulation Item Parameter Drift

    Science.gov (United States)

    Huggins-Manley, Anne Corinne

    2017-01-01

    This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

  15. Item response theory applied to factors affecting the patient journey towards hearing rehabilitation

    Directory of Open Access Journals (Sweden)

    Michelene Chenault

    2016-11-01

    To develop a tool for use in hearing screening and to evaluate the patient journey towards hearing rehabilitation, responses to the hearing aid rehabilitation questionnaire scales aid stigma, pressure, and aid unwanted addressing respectively hearing aid stigma, experienced pressure from others; perceived hearing aid benefit were evaluated with item response theory. The sample was comprised of 212 persons aged 55 years or more; 63 were hearing aid users, 64 with and 85 persons without hearing impairment according to guidelines for hearing aid reimbursement in the Netherlands. Bias was investigated relative to hearing aid use and hearing impairment within the differential test functioning framework. Items compromising model fit or demonstrating differential item functioning were dropped. The aid stigma scale was reduced from 6 to 4, the pressure scale from 7 to 4, and the aid unwanted scale from 5 to 4 items. This procedure resulted in bias-free scales ready for screening purposes and application to further understand the help-seeking process of the hearing impaired.

  16. Psychometric validation of the Persian nine-item Internet Gaming Disorder Scale - Short Form: Does gender and hours spent online gaming affect the interpretations of item descriptions?

    Science.gov (United States)

    Wu, Tzu-Yi; Lin, Chung-Ying; Årestedt, Kristofer; Griffiths, Mark D; Broström, Anders; Pakpour, Amir H

    2017-06-01

    Background and aims The nine-item Internet Gaming Disorder Scale - Short Form (IGDS-SF9) is brief and effective to evaluate Internet Gaming Disorder (IGD) severity. Although its scores show promising psychometric properties, less is known about whether different groups of gamers interpret the items similarly. This study aimed to verify the construct validity of the Persian IGDS-SF9 and examine the scores in relation to gender and hours spent online gaming among 2,363 Iranian adolescents. Methods Confirmatory factor analysis (CFA) and Rasch analysis were used to examine the construct validity of the IGDS-SF9. The effects of gender and time spent online gaming per week were investigated by multigroup CFA and Rasch differential item functioning (DIF). Results The unidimensionality of the IGDS-SF9 was supported in both CFA and Rasch. However, Item 4 (fail to control or cease gaming activities) displayed DIF (DIF contrast = 0.55) slightly over the recommended cutoff in Rasch but was invariant in multigroup CFA across gender. Items 4 (DIF contrast = -0.67) and 9 (jeopardize or lose an important thing because of gaming activity; DIF contrast = 0.61) displayed DIF in Rasch and were non-invariant in multigroup CFA across time spent online gaming. Conclusions Given the Persian IGDS-SF9 was unidimensional, it is concluded that the instrument can be used to assess IGD severity. However, users of the instrument are cautioned concerning the comparisons of the sum scores of the IGDS-SF9 across gender and across adolescents spending different amounts of time online gaming.

  17. Approximation Preserving Reductions among Item Pricing Problems

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

    When a store sells items to customers, the store wishes to determine the prices of the items to maximize its profit. Intuitively, if the store sells the items with low (resp. high) prices, the customers buy more (resp. less) items, which provides less profit to the store. So it would be hard for the store to decide the prices of items. Assume that the store has a set V of n items and there is a set E of m customers who wish to buy those items, and also assume that each item i ∈ V has the production cost d_i and each customer e_j ∈ E has the valuation v_j on the bundle e_j ⊆ V of items. When the store sells an item i ∈ V at the price r_i, the profit for the item i is p_i = r_i - d_i. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. In most of the previous works, the item pricing problem was considered under the assumption that p_i ≥ 0 for each i ∈ V; however, Balcan, et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of “loss-leader,” and showed that the seller can get more total profit in the case that p_i < 0 is allowed than in the case that p_i < 0 is not allowed. In this paper, we derive approximation preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratio.
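    The profit definition above can be illustrated with a minimal sketch. The abstract does not state when a customer buys, so the code assumes the common single-minded rule (a customer buys the whole bundle when its total price does not exceed the valuation). The function name, the two-item instance, and all numbers are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Customer = Tuple[FrozenSet[str], float]  # (bundle e_j, valuation v_j)


def total_profit(prices: Dict[str, float],
                 costs: Dict[str, float],
                 customers: List[Customer]) -> float:
    """Sum p_i = r_i - d_i over every item sold.

    Assumption (not from the abstract): a customer buys the whole
    bundle if and only if its total price is at most the valuation.
    """
    profit = 0.0
    for bundle, valuation in customers:
        bundle_price = sum(prices[i] for i in bundle)
        if bundle_price <= valuation:
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit


# Hypothetical toy instance: item "a" costs 1 to produce, item "b" costs 0.
costs = {"a": 1.0, "b": 0.0}
customers = [(frozenset({"a", "b"}), 10.0), (frozenset({"b"}), 10.0)]

# Pricing with p_i >= 0 for every item (r_a = d_a = 1).
no_loss = total_profit({"a": 1.0, "b": 9.0}, costs, customers)

# Pricing "a" below cost (p_a = -1) lets "b" be priced higher.
loss_leader = total_profit({"a": 0.0, "b": 10.0}, costs, customers)

print(no_loss, loss_leader)
```

    In this toy instance the below-cost price on item a yields a higher total profit than the non-negative pricing shown, which is the loss-leader effect the abstract describes.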

  18. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

    Science.gov (United States)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

    2017-11-01

    The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim to enhance measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

  19. Application of Item Response Theory to Tests of Substance-related Associative Memory

    Science.gov (United States)

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
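    For readers unfamiliar with the 1PL and 2PL models named above, the following minimal sketch shows the standard logistic item response function. It is a textbook form, not code from the study; the function and parameter names are illustrative.

```python
import math


def p_endorse(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Probability of a positive response under the 2PL model.

    theta          -- latent trait level of the respondent
    difficulty     -- item location b
    discrimination -- item slope a; fixing a to one common value for
                      all items gives the 1PL model
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# A respondent located exactly at the item difficulty has a 50% chance.
print(p_endorse(0.0, 0.0))

# A steeper item separates respondents above and below b more sharply.
print(p_endorse(1.0, 0.0, discrimination=1.0))
print(p_endorse(1.0, 0.0, discrimination=2.0))
```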

  20. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    Science.gov (United States)

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
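    The third stage described above (software combining content from an item model) can be sketched as template filling. This is only an illustration of the general idea, not the authors' software; the template text, slot names, and slot values are hypothetical placeholders with no clinical meaning.

```python
from itertools import product
from typing import Dict, List


def generate_items(item_model: str, content: Dict[str, List[str]]) -> List[str]:
    """Fill every slot of an item model with every combination of content."""
    slot_names = list(content)
    combos = product(*(content[name] for name in slot_names))
    return [item_model.format(**dict(zip(slot_names, combo))) for combo in combos]


# Hypothetical item model with three slots.
item_model = ("A {age}-year-old patient presents with {finding} "
              "{timing} after surgery. What is the best next step?")

# Hypothetical content that a content specialist would supply.
content = {
    "age": ["25", "45", "70"],
    "finding": ["finding A", "finding B"],
    "timing": ["one day", "one week"],
}

items = generate_items(item_model, content)
print(len(items))   # 3 * 2 * 2 combinations
print(items[0])
```

    A real system would also need constraints that rule out implausible combinations and a way to generate the answer options; those parts are omitted here.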

  2. A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

    Science.gov (United States)

    Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

    2018-02-23

    The validity of the CAGE using item response theory (IRT) has not yet been examined in older adult populations. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA), dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed the overall scalability H index was 0.459, indicating a medium performing instrument. All items were found to be homogenous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and the ones with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for two items across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and on the precision of the cut-off scores in older adult populations.

  3. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

    Science.gov (United States)

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

    2017-11-01

    Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.

  4. The Piper Fatigue Scale-12 (PFS-12): psychometric findings and item reduction in a cohort of breast cancer survivors.

    Science.gov (United States)

    Reeve, Bryce B; Stover, Angela M; Alfano, Catherine M; Smith, Ashley Wilder; Ballard-Barbash, Rachel; Bernstein, Leslie; McTiernan, Anne; Baumgartner, Kathy B; Piper, Barbara F

    2012-11-01

    Brief, valid measures of fatigue, a prevalent and distressing cancer symptom, are needed for use in research. This study's primary aim was to create a shortened version of the revised Piper Fatigue Scale (PFS-R) based on data from a diverse cohort of breast cancer survivors. A secondary aim was to determine whether the PFS captured multiple distinct aspects of fatigue (a multidimensional model) or a single overall fatigue factor (a unidimensional model). Breast cancer survivors (n = 799; stages in situ through IIIa; ages 29-86 years) were recruited through three SEER registries (New Mexico, Western Washington, and Los Angeles, CA) as part of the Health, Eating, Activity, and Lifestyle (HEAL) study. Fatigue was measured approximately 3 years post-diagnosis using the 22-item PFS-R that has four subscales (Behavior, Affect, Sensory, and Cognition). Confirmatory factor analysis was used to compare unidimensional and multidimensional models. Six criteria were used to make item selections to shorten the PFS-R: scale's content validity, items' relationship with fatigue, content redundancy, differential item functioning by race and/or education, scale reliability, and literacy demand. Factor analyses supported the original 4-factor structure. There was also evidence from the bi-factor model for a dominant underlying fatigue factor. Six items tested positive for differential item functioning between African-American and Caucasian survivors. Four additional items either showed poor association, local dependence, or content validity concerns. After removing these 10 items, the reliability of the PFS-12 subscales ranged from 0.87 to 0.89, compared to 0.90-0.94 prior to item removal. The newly developed PFS-12 can be used to assess fatigue in African-American and Caucasian breast cancer survivors and reduces response burden without compromising reliability or validity. This is the first study to determine PFS literacy demand and to compare PFS-R responses in African

  5. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    Science.gov (United States)

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.

  6. P2-19: The Effect of item Repetition on Item-Context Association Depends on the Prior Exposure of Items

    Directory of Open Access Journals (Sweden)

    Hongmi Lee

    2012-10-01

    Previous studies have reported conflicting findings on whether item repetition has beneficial or detrimental effects on source memory. To reconcile such contradictions, we investigated whether the degree of pre-exposure of items can be a potential modulating factor. The experimental procedures spanned two consecutive days. On Day 1, participants were exposed to a set of unfamiliar faces. On Day 2, the same faces presented on the previous day were used again for half of the participants, whereas novel faces were used for the other half. Day 2 procedures consisted of three successive phases: item repetition, source association, and source memory test. In the item repetition phase, half of the face stimuli were repeatedly presented while participants were making male/female judgments. During the source association phase, both the repeated and the unrepeated faces appeared in one of the four locations on the screen. Finally, participants were tested on the location in which a given face was presented during the previous phase and reported the confidence of their memory. Source memory accuracy was measured as the percentage of correct non-guess trials. As a result, we found a significant interaction between prior exposure and repetition. Repetition impaired source memory when the items had been pre-exposed on Day 1, while it led to greater accuracy for novel ones. These results show that pre-experimental exposure can modulate the effects of repetition on associative binding between an item and its contextual information, suggesting that pre-existing representation and novelty signal interact to form new episodic memory.

  7. Item response analysis on an examination in anesthesiology for medical students in Taiwan: A comparison of one- and two-parameter logistic models

    Directory of Open Access Journals (Sweden)

    Yu-Feng Huang

    2013-06-01

    Conclusion: Item response models are useful for medical test analyses and provide valuable information about model comparisons and identification of differential items other than test reliability, item difficulty, and examinee's ability.

  8. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    Science.gov (United States)

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  9. Symptom endorsement in men versus women with a diagnosis of depression: A differential item functioning approach.

    Science.gov (United States)

    Cavanagh, Anna; Wilson, Coralie J; Caputi, Peter; Kavanagh, David J

    2016-09-01

    There is some evidence that, in contrast to depressed women, depressed men tend to report alternative symptoms that are not listed as standard diagnostic criteria. This may possibly lead to an under- or misdiagnosis of depression in men. This study aims to clarify whether depressed men and women report different symptoms. This study used data from the 2007 Australian National Survey of Mental Health and Wellbeing that was collected using the World Health Organization's Composite International Diagnostic Interview. Participants with a diagnosis of a depressive disorder with 12-month symptoms (n = 663) were identified and included in this study. Differential item functioning (DIF) was used to test whether depressed men and women endorse different features associated with their condition. Gender-related DIF was present for three symptoms associated with depression. Depressed women were more likely to report 'appetite/weight disturbance', whereas depressed men were more likely to report 'alcohol misuse' and 'substance misuse'. While the results may reflect a greater risk of co-occurring alcohol and substance misuse in men, inclusion of these features in assessments may improve the detection of depression in men, especially if standard depressive symptoms are under-reported. © The Author(s) 2016.

  10. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    Science.gov (United States)

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ(2) procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
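    The idea of comparing groups "after being matched by the intended scale attribute" can be illustrated with a small contingency-table sketch. Note the swap: the study applied the Mantel χ(2) procedure to ordinal HADS items, whereas the code below computes the simpler Mantel-Haenszel common odds ratio for a dichotomous item, with respondents stratified by total score. The function name and the counts are hypothetical.

```python
from typing import Iterable, Tuple

# One stratum = respondents with the same matching score:
# (reference endorse, reference not, focal endorse, focal not)
Stratum = Tuple[int, int, int, int]


def mh_common_odds_ratio(strata: Iterable[Stratum]) -> float:
    """Mantel-Haenszel common odds ratio across matched strata.

    A value near 1 suggests no uniform DIF; values far from 1 suggest
    that one group endorses the item more often at equal trait levels.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_yes, ref_no, foc_yes, foc_no in strata:
        n = ref_yes + ref_no + foc_yes + foc_no
        if n == 0:
            continue  # skip empty strata
        numerator += ref_yes * foc_no / n
        denominator += ref_no * foc_yes / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no discordant counts")
    return numerator / denominator


# Hypothetical counts: same odds in both groups within each stratum.
no_dif = [(10, 10, 5, 5), (30, 10, 15, 5)]
# Hypothetical counts: reference group endorses more at equal scores.
with_dif = [(20, 10, 10, 20), (20, 10, 10, 20)]

print(mh_common_odds_ratio(no_dif))
print(mh_common_odds_ratio(with_dif))
```

    As the abstract stresses, a statistic like this would be read together with its significance, the magnitude of the effect, and investigator judgement.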

  11. DISC Predictive Scales (DPS): Factor Structure and Uniform Differential Item Functioning Across Gender and Three Racial/Ethnic Groups for ADHD, Conduct Disorder, and Oppositional Defiant Disorder Symptoms

    OpenAIRE

    Wiesner, Margit; Kanouse, David E.; Elliott, Marc N.; Windle, Michael; Schuster, Mark A.

    2015-01-01

    The factor structure and potential uniform differential item functioning (DIF) among gender and three racial/ethnic groups of adolescents (African American, Latino, White) were evaluated for attention deficit/hyperactivity disorder (ADHD), conduct disorder (CD), and oppositional defiant disorder (ODD) symptom scores of the DISC Predictive Scales (DPS; Leung et al., 2005; Lucas et al., 2001). Primary caregivers reported on DSM–IV ADHD, CD, and ODD symptoms for a probability sample of 4,491 chi...

  12. Differential Performance by English Language Learners on an Inquiry-Based Science Assessment

    Science.gov (United States)

    Turkan, Sultan; Liu, Ou Lydia

    2012-10-01

    The performance of English language learners (ELLs) has been a concern given the rapidly changing demographics in US K-12 education. This study aimed to examine whether students' English language status has an impact on their inquiry science performance. Differential item functioning (DIF) analysis was conducted with regard to ELL status on an inquiry-based science assessment, using a multifaceted Rasch DIF model. A total of 1,396 seventh- and eighth-grade students took the science test, including 313 ELL students. The results showed that, overall, non-ELLs significantly outperformed ELLs. Of the four items that showed DIF, three favored non-ELLs while one favored ELLs. The item that favored ELLs provided a graphic representation of a science concept within a family context. There is some evidence that constructed-response items may help ELLs articulate scientific reasoning using their own words. Assessment developers and teachers should pay attention to the possible interaction between linguistic challenges and science content when designing assessment for and providing instruction to ELLs.

  13. A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents.

    Science.gov (United States)

    Erhart, M; Hagquist, C; Auquier, P; Rajmil, L; Power, M; Ravens-Sieberer, U

    2010-07-01

    This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.
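    Approach (A) rests on Cronbach's alpha, which can be computed from item variances and the variance of the summed score. The sketch below uses the standard formula with sample variances; it is not the KIDSCREEN analysis code, and the small data set is invented for illustration.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars: List[float] = [
        variance([row[i] for row in responses]) for i in range(k)
    ]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented data: 5 respondents, 3 items scored 1-5.
scores = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(scores), 3))
```

    Item reduction under approach (A) would then drop the item whose removal raises alpha the most and repeat, which is one reason the abstract recommends accompanying it with additional analyses.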

  14. Psychometric validation of the Persian nine-item Internet Gaming Disorder Scale – Short Form: Does gender and hours spent online gaming affect the interpretations of item descriptions?

    Science.gov (United States)

    Wu, Tzu-Yi; Lin, Chung-Ying; Årestedt, Kristofer; Griffiths, Mark D.; Broström, Anders; Pakpour, Amir H.

    2017-01-01

    Background and aims The nine-item Internet Gaming Disorder Scale – Short Form (IGDS-SF9) is brief and effective to evaluate Internet Gaming Disorder (IGD) severity. Although its scores show promising psychometric properties, less is known about whether different groups of gamers interpret the items similarly. This study aimed to verify the construct validity of the Persian IGDS-SF9 and examine the scores in relation to gender and hours spent online gaming among 2,363 Iranian adolescents. Methods Confirmatory factor analysis (CFA) and Rasch analysis were used to examine the construct validity of the IGDS-SF9. The effects of gender and time spent online gaming per week were investigated by multigroup CFA and Rasch differential item functioning (DIF). Results The unidimensionality of the IGDS-SF9 was supported in both CFA and Rasch. However, Item 4 (fail to control or cease gaming activities) displayed DIF (DIF contrast = 0.55) slightly over the recommended cutoff in Rasch but was invariant in multigroup CFA across gender. Items 4 (DIF contrast = −0.67) and 9 (jeopardize or lose an important thing because of gaming activity; DIF contrast = 0.61) displayed DIF in Rasch and were non-invariant in multigroup CFA across time spent online gaming. Conclusions Given the Persian IGDS-SF9 was unidimensional, it is concluded that the instrument can be used to assess IGD severity. However, users of the instrument are cautioned concerning the comparisons of the sum scores of the IGDS-SF9 across gender and across adolescents spending different amounts of time online gaming. PMID:28571474

  15. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  16. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  17. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made.

  18. Hidradenocarcinoma showing prominent mucinous and squamous differentiation and associated pagetoid cells.

    Science.gov (United States)

    Honda, Yumi; Tanigawa, Hiroki; Harada, Miho; Fukushima, Satoshi; Masuguchi, Shinichi; Ishihara, Tsuyoshi; Ihn, Hironobu; Iyama, Ken-ichi

    2013-05-01

    Herein, we report a 63-year-old man presenting with hidradenocarcinoma showing prominent mucinous and squamous differentiation on his back. The tumor was dermal-based, solid and cystic. Tumor cells with squamous differentiation and with keratin pearl formation were identified predominantly in the superficial dermis, and mucinous cells were identified principally in the cystic lesion in the deep dermis. Interestingly, the additional feature of pagetoid cells was identified in the overlying epidermis. Both the mucinous cells in hidradenocarcinoma and pagetoid cells had intracytoplasmic mucin; however, they had different histopathologic findings and immunophenotypes. Mucinous cells in hidradenocarcinoma had small nuclei and abundant intracytoplasmic mucin presenting goblet cells with low rate of positive immunostaining for p53 and Ki67. In contrast, pagetoid cells had larger nuclei with less intracytoplasmic mucin. Both p53- and Ki67-positive cells were increased in pagetoid cells. Additionally, mucinous cells in hidradenocarcinoma were MUC1(+)/MUC2(-)/MUC5AC(+)/MUC6(+), but pagetoid cells were MUC1(+; focal)/MUC2(-)/MUC5AC(-)/MUC6(+; focal). The derivation of pagetoid cells is unclear; however, the localized small region of pagetoid cells over the hidradenocarcinoma in the present case may suggest a common histogenesis of these two malignant neoplasms. Copyright © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.

  19. DISC Predictive Scales (DPS): Factor structure and uniform differential item functioning across gender and three racial/ethnic groups for ADHD, conduct disorder, and oppositional defiant disorder symptoms.

    Science.gov (United States)

    Wiesner, Margit; Windle, Michael; Kanouse, David E; Elliott, Marc N; Schuster, Mark A

    2015-12-01

    The factor structure and potential uniform differential item functioning (DIF) across gender and three racial/ethnic groups of adolescents (African American, Latino, White) were evaluated for attention deficit/hyperactivity disorder (ADHD), conduct disorder (CD), and oppositional defiant disorder (ODD) symptom scores of the DISC Predictive Scales (DPS; Leung et al., 2005; Lucas et al., 2001). Primary caregivers reported on DSM-IV ADHD, CD, and ODD symptoms for a probability sample of 4,491 children from three geographical regions who took part in the Healthy Passages study (mean age = 12.60 years, SD = 0.66). Confirmatory factor analysis indicated that the expected 3-factor structure was tenable for the data. Multiple indicators multiple causes (MIMIC) modeling revealed uniform DIF for 3 ADHD and 9 ODD item scores, but not for any of the CD item scores. Uniform DIF was observed predominantly as a function of child race/ethnicity, but minimally as a function of child gender. On the positive side, uniform DIF had little impact on latent mean differences of ADHD, CD, and ODD symptomatology across gender and racial/ethnic groups. Implications of the findings for researchers and practitioners are discussed. (c) 2015 APA, all rights reserved.

  20. Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

    Science.gov (United States)

    Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

    2013-01-01

    In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…

  1. Differential geometry and mathematical physics

    CERN Document Server

    Rudolph, Gerd

    Starting from an undergraduate level, this book systematically develops the basics of • Calculus on manifolds, vector bundles, vector fields and differential forms, • Lie groups and Lie group actions, • Linear symplectic algebra and symplectic geometry, • Hamiltonian systems, symmetries and reduction, integrable systems and Hamilton-Jacobi theory. The topics listed under the first item are relevant for virtually all areas of mathematical physics. The second and third items constitute the link between abstract calculus and the theory of Hamiltonian systems. The last item provides an introduction to various aspects of this theory, including Morse families, the Maslov class and caustics. The book guides the reader from elementary differential geometry to advanced topics in the theory of Hamiltonian systems with the aim of making current research literature accessible. The style is that of a mathematical textbook, with full proofs given in the text or as exercises. The material is illustrated by numerous d...

  2. Disparities in Sense of Community: True Race Differences or Differential Item Functioning?

    Science.gov (United States)

    Coffman, Donna L.; BeLue, Rhonda

    2009-01-01

    The sense of community index (SCI) has been widely used to measure psychological sense of community (SOC). Furthermore, SOC has been found to differ among racial groups. Because different ethnic groups have different cultural and historical experiences that may lead to different interpretations of measurement items, it is important to know whether…

  3. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
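    The unimodality condition quoted above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard partial credit model with scores 0, 1, 2 and step parameters δ1, δ2, uses the fact that the item information equals the conditional variance of the item score, and counts local maxima on a grid. The function names and grid settings are illustrative assumptions.

```python
import math

def pcm_information(theta, delta1, delta2):
    """Item information of a three-category PCM item, i.e. Var(X | theta)."""
    # Unnormalised log-weights for scores 0, 1, 2 (shifted for numerical stability).
    z = [0.0, theta - delta1, 2.0 * theta - delta1 - delta2]
    m = max(z)
    w = [math.exp(v - m) for v in z]
    total = sum(w)
    p = [v / total for v in w]
    mean = p[1] + 2.0 * p[2]
    return p[1] + 4.0 * p[2] - mean ** 2

def count_modes(delta1, delta2, lo=-8.0, hi=8.0, steps=1600):
    """Count local maxima of the information function on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    info = [pcm_information(t, delta1, delta2) for t in grid]
    return sum(
        1
        for i in range(1, steps)
        if info[i] > info[i - 1] and info[i] >= info[i + 1]
    )

threshold = 4 * math.log(2)          # about 2.77
print(count_modes(-1.0, 1.0))        # gap 2.0, below the threshold
print(count_modes(-2.0, 2.0))        # gap 4.0, above the threshold
```

    With a gap of 2.0 between the step parameters the grid search finds a single peak, and with a gap of 4.0 it finds two, which agrees with the stated threshold of 4 ln 2.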

  4. Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

    Science.gov (United States)

    Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

    2018-04-01

    The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory factor analysis was completed to examine dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items on the ESS and provide differential item functioning (DIF) analyses to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that, overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and to a lesser extent Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses between German and English were very similar, indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.

  5. Development of a Short Version of MSQOL-54 Using Factor Analysis and Item Response Theory.

    Directory of Open Access Journals (Sweden)

    Rosalba Rosato

    The Multiple Sclerosis Quality of Life-54 (MSQOL-54, 52 items grouped in 12 subscales plus two single items) is the most used MS specific health related quality of life inventory. The objective was to develop a shortened version of the MSQOL-54. MSQOL-54 dimensionality and metric properties were investigated by confirmatory factor analysis (CFA) and Rasch modelling (Partial Credit Model, PCM) on MSQOL-54s completed by 473 MS patients. Their mean age was 41 years, 65% were women, and median Expanded Disability Status Scale (EDSS) score was 2.0 (range 0-9.5). Differential item functioning (DIF) was evaluated for gender, age and EDSS. Dimensionality of the resulting short version was assessed by exploratory factor analysis (EFA) and CFA. Cognitive debriefing of the short instrument (vs. the original) was then performed on 12 MS patients. CFA of MSQOL-54 subscales showed that the data fitted the overall model well. Two subscales (Role Limitations--Physical, Role Limitations--Emotional) did not fit the PCM, and were removed; two other subscales (Health Perceptions, Social Function) did not fit the model, but were retained as single items. Sexual Satisfaction (single-item subscale) was also removed. The resulting MSQOL-29 consisted of 25 items grouped in 7 subscales, plus 4 single items. PCM fit statistics were within the acceptability range for all MSQOL-29 items except one which had significant DIF by age. EFA and CFA indicated adequate fit to the original two-factor (Physical and Mental Health Composites) hypothesis. Cognitive debriefing confirmed that MSQOL-29 was acceptable and had lost no key items. The proposed MSQOL-29 is 50% shorter than MSQOL-54, yet preserves key quality of life dimensions. Prospective validation on a large, independent MS patient sample is ongoing.

  6. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  7. Evaluating HIV Knowledge Questionnaires Among Men Who Have Sex with Men: A Multi-Study Item Response Theory Analysis.

    Science.gov (United States)

    Janulis, Patrick; Newcomb, Michael E; Sullivan, Patrick; Mustanski, Brian

    2018-01-01

    Knowledge about the transmission, prevention, and treatment of HIV remains a critical element in psychosocial models of HIV risk behavior and is commonly used as an outcome in HIV prevention interventions. However, most HIV knowledge questions have not undergone rigorous psychometric testing such as using item response theory. The current study used data from six studies of men who have sex with men (MSM; n = 3565) to (1) examine the item properties of HIV knowledge questions, (2) test for differential item functioning on commonly studied characteristics (i.e., age, race/ethnicity, and HIV risk behavior), (3) select items with the optimal item characteristics, and (4) leverage this combined dataset to examine the potential moderating effect of age on the relationship between condomless anal sex (CAS) and HIV knowledge. Findings indicated that existing questions tend to poorly differentiate those with higher levels of HIV knowledge, but items were relatively robust across diverse individuals. Furthermore, age moderated the relationship between CAS and HIV knowledge with older MSM having the strongest association. These findings suggest that additional items are required in order to capture a more nuanced understanding of HIV knowledge and that the association between CAS and HIV knowledge may vary by age.

  8. Reading ability and print exposure: item response theory analysis of the author recognition test.

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.
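    The two scoring rules mentioned above (the standard "names selected minus foils selected" and a variant with a heavier foil penalty) can be sketched as follows. The abstract does not report the penalty weight, so the `foil_penalty` parameter and the value 2.0 used below are purely hypothetical, as are the placeholder item labels.

```python
def art_score(selected, authors, foil_penalty=1.0):
    """Score an ART response sheet.

    selected      -- names the participant ticked
    authors       -- the set of real author names on the test
    foil_penalty  -- points deducted per foil ticked (1.0 = standard scoring;
                     larger values are a hypothetical heavier penalty)
    """
    selected = set(selected)
    hits = len(selected & authors)          # real authors recognised
    false_alarms = len(selected - authors)  # foils wrongly ticked
    return hits - foil_penalty * false_alarms

# Placeholder labels, not items from the actual test.
authors = {"author_1", "author_2", "author_3"}
ticked = {"author_1", "author_2", "foil_1"}

print(art_score(ticked, authors))                    # standard rule
print(art_score(ticked, authors, foil_penalty=2.0))  # heavier foil penalty
```

    Under the standard rule this response sheet scores 1.0; doubling the penalty lowers it to 0.0, which illustrates how a heavier penalty separates careful responders from those who guess.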

  9. Model EPQ Multi Item yang Dimodifikasi untuk Dua Permintaan secara Simultan [A Modified Multi-Item EPQ Model for Two Simultaneous Demands]

    Directory of Open Access Journals (Sweden)

    Taufiq Rahman

    2017-05-01

    Inventory is one of many factors of business operation that industries need to control in order to improve efficiency, enhance productivity, and decrease holding cost. The holding cost of inventories in the supply chain contributes 20% - 40% of the product value. It can be controlled by applying an appropriate inventory model, such as EPQ (Economic Production Quantity) or EOQ (Economic Order Quantity). EPQ is an inventory model used to determine the optimum production lot size by balancing the production setup cost and the holding cost. Although the classic EPQ has been applied widely in industry, the demand assumption used with this model differs between researchers, who treat demand as either continuous or discrete, because multi-delivery or discrete demand is the case most commonly found in industry. Even so, there are industries that face both continuous and discrete demand simultaneously. Previous research produced an advanced EPQ model that handles both assumptions simultaneously, but it addressed only the single-item problem. Since most industries produce multiple items, that model has limited applicability. Therefore, this research proposes a multi-item EPQ model that handles continuous and discrete demand simultaneously. The solution procedure used in the proposed model combines the classical (differential) calculus method and a simultaneous approach. A numerical example based on data from the literature is given to show the effectiveness of the proposed approach.
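    For orientation, the classic single-item EPQ lot size that the abstract takes as its starting point can be sketched as below. This is the textbook formula Q* = sqrt(2DS / (H(1 - D/P))), not the multi-item model proposed in the paper; the function name and the numbers in the example are illustrative assumptions.

```python
import math

def epq_lot_size(demand_rate, production_rate, setup_cost, holding_cost):
    """Classic single-item Economic Production Quantity.

    demand_rate      -- units demanded per period (D)
    production_rate  -- units produced per period while running (P), must exceed D
    setup_cost       -- cost per production setup (S)
    holding_cost     -- cost of holding one unit for one period (H)
    """
    if production_rate <= demand_rate:
        raise ValueError("production rate must exceed demand rate")
    utilisation = demand_rate / production_rate
    return math.sqrt(
        2.0 * demand_rate * setup_cost / (holding_cost * (1.0 - utilisation))
    )

# Illustrative numbers only: D = 1000, P = 4000, S = 50, H = 2.
lot = epq_lot_size(1000, 4000, 50, 2)
print(round(lot, 2))
```

    The lot size grows as the demand rate approaches the production rate, because inventory builds up more slowly during a run and holding cost per lot falls.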

  10. Evaluation of the Fecal Incontinence Quality of Life Scale (FIQL) using item response theory reveals limitations and suggests revisions.

    Science.gov (United States)

    Peterson, Alexander C; Sutherland, Jason M; Liu, Guiping; Crump, R Trafford; Karimuddin, Ahmer A

    2018-06-01

    The Fecal Incontinence Quality of Life Scale (FIQL) is a commonly used patient-reported outcome measure for fecal incontinence, often used in clinical trials, yet has not been validated in English since its initial development. This study uses modern methods to thoroughly evaluate the psychometric characteristics of the FIQL and its potential for differential functioning by gender. This study analyzed prospectively collected patient-reported outcome data from a sample of patients prior to colorectal surgery. Patients were recruited from 14 general and colorectal surgeons in Vancouver Coastal Health hospitals in Vancouver, Canada. Confirmatory factor analysis was used to assess construct validity. Item response theory was used to evaluate test reliability, describe item-level characteristics, identify local item dependence, and test for differential functioning by gender. 236 patients were included for analysis, with mean age 58 and approximately half female. Factor analysis failed to identify the lifestyle, coping, depression, and embarrassment domains, suggesting lack of construct validity. Items demonstrated low difficulty, indicating that the test has the highest reliability among individuals who have low quality of life. Five items are suggested for removal or replacement. Differential test functioning was minimal. This study has identified specific improvements that can be made to each domain of the Fecal Incontinence Quality of Life Scale and to the instrument overall. Formatting, scoring, and instructions may be simplified, and items with higher difficulty developed. The lifestyle domain can be used as is. The embarrassment domain should be significantly revised before use.

  11. Evaluating the quality of medical multiple-choice items created with automated processes.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis

    2013-07-01

    Computerised assessment raises formidable challenges because it requires large numbers of test items. Automatic item generation (AIG) can help address this test development problem because it yields large numbers of new items both quickly and efficiently. To date, however, the quality of the items produced using a generative approach has not been evaluated. The purpose of this study was to determine whether automatic processes yield items that meet standards of quality that are appropriate for medical testing. Quality was evaluated firstly by subjecting items created using both AIG and traditional processes to rating by a four-member expert medical panel using indicators of multiple-choice item quality, and secondly by asking the panellists to identify which items were developed using AIG in a blind review. Fifteen items from the domain of therapeutics were created in three different experimental test development conditions. The first 15 items were created by content specialists using traditional test development methods (Group 1 Traditional). The second 15 items were created by the same content specialists using AIG methods (Group 1 AIG). The third 15 items were created by a new group of content specialists using traditional methods (Group 2 Traditional). These 45 items were then evaluated for quality by a four-member panel of medical experts and were subsequently categorised as either Traditional or AIG items. Three outcomes were reported: (i) the items produced using traditional and AIG processes were comparable on seven of eight indicators of multiple-choice item quality; (ii) AIG items can be differentiated from Traditional items by the quality of their distractors, and (iii) the overall predictive accuracy of the four expert medical panellists was 42%. Items generated by AIG methods are, for the most part, equivalent to traditionally developed items from the perspective of expert medical reviewers. While the AIG method produced comparatively fewer plausible

  12. deltaPlotR: An R Package for Differential Item Functioning Analysis with Angoff's Delta Plot

    OpenAIRE

    David Magis; Bruno Facon

    2014-01-01

    Angoff's delta plot is a straightforward and not computationally intensive method to identify differential item functioning (DIF) among dichotomously scored items. This approach was recently improved by proposing an optimal threshold selection and by considering several item purification processes. Moreover, to support practical DIF analyses with the delta plot and these improvements, the R package deltaPlotR was also developed. The purpose of this paper is twofold: to outline the delta plot ...

  13. Using Cochran's Z Statistic to Test the Kernel-Smoothed Item Response Function Differences between Focal and Reference Groups

    Science.gov (United States)

    Zheng, Yinggan; Gierl, Mark J.; Cui, Ying

    2010-01-01

    This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…

  14. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  15. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    Science.gov (United States)

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N = 1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, fewer than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Analysis of Item-Level Bias in the Bayley-III Language Subscales: The Validity and Utility of Standardized Language Assessment in a Multilingual Setting.

    Science.gov (United States)

    Goh, Shaun K Y; Tham, Elaine K H; Magiati, Iliana; Sim, Litwee; Sanmugam, Shamini; Qiu, Anqi; Daniel, Mary L; Broekman, Birit F P; Rifkin-Graboi, Anne

    2017-09-18

    The purpose of this study was to improve standardized language assessments among bilingual toddlers by investigating and removing the effects of bias due to unfamiliarity with cultural norms or a distributed language system. The Expressive and Receptive Bayley-III language scales were adapted for use in a multilingual country (Singapore). Differential item functioning (DIF) was applied to data from 459 two-year-olds without atypical language development. This involved investigating if the probability of success on each item varied according to language exposure while holding latent language ability, gender, and socioeconomic status constant. Associations with language, behavioral, and emotional problems were also examined. Five of 16 items showed DIF, 1 of which may be attributed to cultural bias and another to a distributed language system. The remaining 3 items favored toddlers with higher bilingual exposure. Removal of DIF items reduced associations between language scales and emotional and language problems, but improved the validity of the expressive scale from poor to good. Our findings indicate the importance of considering cultural and distributed language bias in standardized language assessments. We discuss possible mechanisms influencing performance on items favoring bilingual exposure, including the potential role of inhibitory processing.

  17. Item response theory analysis of the mechanics baseline test

    Science.gov (United States)

    Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

    2012-02-01

    Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
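    The point that an IRT skill estimate uses the properties of the questions answered, and not only their count, can be illustrated with a small sketch. It assumes a two-parameter logistic model and a simple grid-search maximum-likelihood estimate; the item parameters below are invented for illustration and are not the Mechanics Baseline Test values.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct answer (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at skill level theta."""
    ll = 0.0
    for (a, b), x in zip(items, responses):
        p = p_correct(theta, a, b)
        ll += math.log(p) if x else math.log(1.0 - p)
    return ll

def estimate_skill(items, responses, lo=-4.0, hi=4.0, steps=800):
    """Grid-search maximum-likelihood estimate of theta."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (discrimination, difficulty) pairs for three questions.
items = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]
student_a = [1, 1, 0]   # raw score 2: missed the hard, discriminating item
student_b = [0, 1, 1]   # raw score 2: missed the easy, weakly discriminating item

theta_a = estimate_skill(items, student_a)
theta_b = estimate_skill(items, student_b)
print(theta_a, theta_b)
```

    Both students have the same raw score, yet the second receives the higher skill estimate because the items answered correctly carry more weight under the model.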

  18. The cultural fairness of the 12-item General Health Questionnaire among diverse adolescents.

    Science.gov (United States)

    Bowe, Anica

    2017-01-01

    The 12-item general health questionnaire (GHQ-12) was used in the Longitudinal Study of Young People in England (LSYPE; N = 15,770) to collect measures on adolescent mental health. Given the debate in current literature regarding the dimensionality of the GHQ-12, this study examined the cultural sensitivity of the instrument at the item level for each of the 7 major ethnic groups within the database. This study used a hybrid approach of ordinal logistic regression and item response theory (IRT) to examine the presence of differential item functioning (DIF) on the questionnaire. Results demonstrated that uniform, nonuniform, and overall DIF were present on items between White and Asian adolescents (7 items), White and Black Caribbean adolescents (1 item), and White and Black African adolescents (7 items); however, all McFadden's pseudo R² effect size estimates indicated that the DIF was negligible. Overall, there were cumulative small scale level effects for the Mixed/Biracial, Asian, and Black African groups, but in each case the bias was only marginal. Findings demonstrate that the GHQ-12 can be considered culturally sensitive for adolescents from diverse ethnic groups in England, but follow-up studies are necessary. Implications for future education and health policies as well as the use of IRT-based approaches for psychological instruments are discussed. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  19. Evaluation of the Psychometric Properties of the Asian Adolescent Depression Scale and Construction of a Short Form: An Item Response Theory Analysis.

    Science.gov (United States)

    Lo, Barbara Chuen Yee; Zhao, Yue; Kwok, Alice Wai Yee; Chan, Wai; Chan, Calais Kin Yuen

    2017-07-01

    The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in the Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.

  20. A simple and fast item selection procedure for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn; Berger, Martijn P.F.

    1994-01-01

    Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item

  1. Promoting cold-start items in recommender systems.

    Science.gov (United States)

    Liu, Jin-Hu; Zhou, Tao; Zhang, Zi-Ke; Yang, Zimo; Liu, Chuang; Li, Wei-Min

    2014-01-01

    The cold-start problem is one of the major challenges facing nearly all recommender systems. In particular, new items tend to be overlooked, impeding the development of new products online. Given limited resources, it is extremely important to use knowledge of recommender systems to design an efficient marketing strategy for new items. In this paper, we convert this tricky issue into a clear mathematical problem based on a bipartite network representation. Under the most widely used algorithm in real e-commerce recommender systems, so-called item-based collaborative filtering, we show that simply pushing new items to active users is not a good strategy. Interestingly, experiments on real recommender systems indicate that connecting new items with some less active users statistically yields better performance; namely, these new items have a greater chance of appearing in other users' recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to this observation. In short, an in-depth understanding of recommender systems could pave the way for owners to popularize their cold-start products at low cost.
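
    Illustrative note: item-based collaborative filtering on a user-item bipartite network can be sketched as follows. The cosine similarity, the scoring rule, and the toy users and items are assumptions for illustration; the abstract does not specify the exact similarity measure or data used.

```python
import math
from collections import defaultdict

def item_similarity(users_of):
    """Cosine similarity between items, computed from the sets of users
    linked to each item in the bipartite network."""
    sim = defaultdict(dict)
    items = list(users_of)
    for i in items:
        for j in items:
            if i != j:
                common = len(users_of[i] & users_of[j])
                if common:
                    sim[i][j] = common / math.sqrt(len(users_of[i]) * len(users_of[j]))
    return sim

def recommend(user, items_of, users_of, top_n=2):
    """Rank items the user has not collected by summed similarity
    to the items the user already has."""
    sim = item_similarity(users_of)
    scores = defaultdict(float)
    for owned in items_of[user]:
        for other, s in sim[owned].items():
            if other not in items_of[user]:
                scores[other] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

# Hypothetical toy network: the item "new" is linked to a single user, u2.
items_of = {"u1": {"A", "B"}, "u2": {"A", "new"}, "u3": {"B"}}
users_of = defaultdict(set)
for u, collected in items_of.items():
    for it in collected:
        users_of[it].add(u)

rec_u1 = recommend("u1", items_of, users_of)
rec_u3 = recommend("u3", items_of, users_of)
```

    In this toy network the new item reaches u1 only through its one link to u2, which shares item A with u1; this is the kind of link-placement question the paper formalizes.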

  2. Promoting Cold-Start Items in Recommender Systems

    Science.gov (United States)

    Liu, Jin-Hu; Zhou, Tao; Zhang, Zi-Ke; Yang, Zimo; Liu, Chuang; Li, Wei-Min

    2014-01-01

    The cold-start problem is one of the major challenges facing nearly all recommender systems. In particular, new items tend to be overlooked, impeding the development of new products online. Given limited resources, it is extremely important to use knowledge of recommender systems to design an efficient marketing strategy for new items. In this paper, we convert this tricky issue into a clear mathematical problem based on a bipartite network representation. Under the most widely used algorithm in real e-commerce recommender systems, so-called item-based collaborative filtering, we show that simply pushing new items to active users is not a good strategy. Interestingly, experiments on real recommender systems indicate that connecting new items with some less active users statistically yields better performance; namely, these new items have a greater chance of appearing in other users' recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to this observation. In short, an in-depth understanding of recommender systems could pave the way for owners to popularize their cold-start products at low cost. PMID:25479013

  3. Development and psychometric characteristics of the SCI-QOL Ability to Participate and Satisfaction with Social Roles and Activities item banks and short forms.

    Science.gov (United States)

    Heinemann, Allen W; Kisala, Pamela A; Hahn, Elizabeth A; Tulsky, David S

    2015-05-01

    To develop a spinal cord injury (SCI)-focused version of PROMIS and Neuro-QOL social domain item banks; evaluate the psychometric properties of items developed for adults with SCI; and report information to facilitate clinical and research use. We used a mixed-methods design to develop and evaluate Ability to Participate in Social Roles and Activities and Satisfaction with Social Roles and Activities items. Focus groups helped define the constructs; cognitive interviews helped revise items; and confirmatory factor analysis and item response theory methods helped calibrate item banks and evaluate differential item functioning related to demographic and injury characteristics. Five SCI Model System sites and one Veterans Administration medical center. The calibration sample consisted of 641 individuals; a reliability sample consisted of 245 individuals residing in the community. A subset of 27 Ability to Participate and 35 Satisfaction items demonstrated good measurement properties and negligible differential item functioning related to demographic and injury characteristics. The SCI-specific measures correlate strongly with the PROMIS and Neuro-QOL versions. Ten item short forms correlate >0.96 with the full banks. Variable-length CATs with a minimum of 4 items, variable-length CATs with a minimum of 8 items, fixed-length CATs of 10 items, and the 10-item short forms demonstrate construct coverage and measurement error that is comparable to the full item bank. The Ability to Participate and Satisfaction with Social Roles and Activities CATs and short forms demonstrate excellent psychometric properties and are suitable for clinical and research applications.

  4. Development of a subjective cognitive decline questionnaire using item response theory: a pilot study.

    Science.gov (United States)

    Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L

    2015-12-01

    Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item-reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items were removed with extreme response distribution or trait-fit. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of a SCD screener.

  5. Fibroblasts maintained in 3 dimensions show a better differentiation state and higher sensitivity to estrogens

    Energy Technology Data Exchange (ETDEWEB)

    Montani, Claudia [Laboratory of Biotechnology, Department of Laboratory Medicine, Civic Hospital of Brescia (Italy); Steimberg, Nathalie; Boniotti, Jennifer [Laboratory of Tissue Engineering, Anatomy and Physiopathology Unit, Department of Clinical and Experimental Sciences, School of Medicine, University of Brescia (Italy); Biasiotto, Giorgio; Zanella, Isabella [Laboratory of Biotechnology, Department of Laboratory Medicine, Civic Hospital of Brescia (Italy); Department of Molecular and Translational Medicine, University of Brescia, Brescia (Italy); Diafera, Giuseppe [Integrated Systems Engineering (ISE), Milan (Italy); Biunno, Ida [IRGB-CNR, Milan (Italy); IRCCS-Multimedica, Milan (Italy); Caimi, Luigi [Laboratory of Biotechnology, Department of Laboratory Medicine, Civic Hospital of Brescia (Italy); Department of Molecular and Translational Medicine, University of Brescia, Brescia (Italy); Mazzoleni, Giovanna [Laboratory of Tissue Engineering, Anatomy and Physiopathology Unit, Department of Clinical and Experimental Sciences, School of Medicine, University of Brescia (Italy); Di Lorenzo, Diego, E-mail: diego.dilorenzo@yahoo.it [Laboratory of Biotechnology, Department of Laboratory Medicine, Civic Hospital of Brescia (Italy)

    2014-11-01

    Cell differentiation and response to hormonal signals were studied in a 3D environment on an in-house generated mouse fibroblast cell line expressing a reporter gene under the control of estrogen responsive sequences (EREs). 3D cell culture conditions were obtained in a Rotary Cell Culture System (RCCS™), a microgravity-based bioreactor that promotes the aggregation of cells into multicellular spheroids (MCS). In this bioreactor the cells maintained a better differentiated phenotype and more closely resembled in vivo tissue. The RCCS™ cultured fibroblasts showed higher expression of genes regulating cell assembly, differentiation and hormonal functions. Microarray analysis showed that genes related to cell cycle, proliferation, cytoskeleton, migration, adhesion and motility were all down-regulated in 3D as compared to 2D conditions, as well as oncogene expression and inflammatory cytokines. Controlled remodeling of ECM, which is an essential aspect of cell organization, homeostasis and tissue was affected by the culture method as assessed by immunolocalization of β-tubulin. Markers of cell organization, homeostasis and tissue repair, metalloproteinase 2 (MMP2) and its physiological inhibitor (TIMP4) changed expression in association with the relative formation of cell aggregates. The fibroblasts cultured in the RCCS™ maintain a better responsiveness to estrogens, measured as expression of ERα and regulation of an ERE-dependent reporter and of the endogenous target genes CBP, Rarb, MMP1 and Dbp. Our data highlight the interest of this 3D culture model for its potential application in the field of cell response to hormonal signals and the pharmaco-toxicological analyses of chemicals and natural molecules endowed with estrogenic potential. - Highlights: • We here characterized the first cell line derived from an estrogen reporter mouse. • In the RCCS cells express an immortalized behavior but not a transformed phenotype. • The RCCS provides a system for

  6. Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

    Science.gov (United States)

    Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

    2016-09-01

    The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analysis, CFA, and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
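
    Illustrative note: the internal-consistency figure reported above is Cronbach's alpha, which for k items is k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The sketch below is a minimal implementation with a hypothetical three-item response matrix; it does not use the study's data.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses in which the three items agree perfectly.
perfect = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(perfect)

# The reported first eigenvalue and ratio imply the second eigenvalue.
second_eigenvalue = 24.34 / 12.4  # roughly 1.96
```

    Perfectly consistent items give an alpha of 1, so the reported 0.97 sits near the top of the scale; the implied second eigenvalue of about 1.96 is small next to 24.34, consistent with a dominant first factor.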

  7. Effects of Learning Experience on Forgetting Rates of Item and Associative Memories

    Science.gov (United States)

    Yang, Jiongjiong; Zhan, Lexia; Wang, Yingying; Du, Xiaoya; Zhou, Wenxi; Ning, Xueling; Sun, Qing; Moscovitch, Morris

    2016-01-01

    Are associative memories forgotten more quickly than item memories, and does the level of original learning differentially influence forgetting rates? In this study, we addressed these questions by having participants learn single words and word pairs once (Experiment 1), three times (Experiment 2), and six times (Experiment 3) in a massed…

  8. Detecting Differential Person Functioning in Emotional Intelligence

    Science.gov (United States)

    Alsmadi, Yahia M.; Alsmadi, Abdalla A.

    2009-01-01

    Differential Item Functioning (DIF) is a widely used term in test development literature. It is very important to analyze test data for DIF because it is a serious threat to validity. If the same data matrix is transposed, a similar analysis can be carried out for Differential Person Functioning (DPF). The purpose of this paper is to introduce and…

  9. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test

    Science.gov (United States)

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi

    2018-01-01

    Objective: The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods: The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results: The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion: The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with

  10. Solving Differential Equations Analytically. Elementary Differential Equations. Modules and Monographs in Undergraduate Mathematics and Its Applications Project. UMAP Unit 335.

    Science.gov (United States)

    Goldston, J. W.

    This unit introduces analytic solutions of ordinary differential equations. The objective is to enable the student to decide whether a given function solves a given differential equation. Examples of problems from biology and chemistry are covered. Problem sets, quizzes, and a model exam are included, and answers to all items are provided. The…

  11. Testing the Item-Order Account of Design Effects Using the Production Effect

    Science.gov (United States)

    Jonker, Tanya R.; Levene, Merrick; MacLeod, Colin M.

    2014-01-01

    A number of memory phenomena evident in recall in within-subject, mixed-lists designs are reduced or eliminated in between-subject, pure-list designs. The item-order account (McDaniel & Bugg, 2008) proposes that differential retention of order information might underlie this pattern. According to this account, order information may be encoded…

  12. Distinguishing Differential Testlet Functioning from Differential Bundle Functioning Using the Multilevel Measurement Model

    Science.gov (United States)

    Beretvas, S. Natasha; Walker, Cindy M.

    2012-01-01

    This study extends the multilevel measurement model to handle testlet-based dependencies. A flexible two-level testlet response model (the MMMT-2 model) for dichotomous items is introduced that permits assessment of differential testlet functioning (DTLF). A distinction is made between this study's conceptualization of DTLF and that of…

  13. Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

    Science.gov (United States)

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

    2018-03-12

    Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.

  14. Lawton IADL scale in dementia: can item response theory make it more informative?

    Science.gov (United States)

    McGrory, Sarah; Shenkin, Susan D; Austin, Elizabeth J; Starr, John M

    2014-07-01

    Impairment of functional abilities represents a crucial component of dementia diagnosis. Current functional measures rely on the traditional aggregate method of summing raw scores. While this summary score provides a quick representation of a person's ability, it disregards useful information on the item level. The objective was to use item response theory (IRT) methods to increase the interpretive power of the Lawton Instrumental Activities of Daily Living (IADL) scale by establishing a hierarchy of item 'difficulty' and 'discrimination'. This cross-sectional study applied IRT methods to the analysis of IADL outcomes. Participants were 202 members of the Scottish Dementia Research Interest Register (mean age = 76.39, range = 56-93, SD = 7.89 years) with complete itemised data available. A Mokken scale with good reliability (Molenaar Sijtsma statistic 0.79) was obtained, satisfying the IRT assumption that the items comprise a single unidimensional scale. The eight items in the scale could be placed on a hierarchy of 'difficulty' (H coefficient = 0.55), with 'Shopping' being the most 'difficult' item and 'Telephone use' being the least 'difficult' item. 'Shopping' was the most discriminatory item, differentiating well between patients of different levels of ability. IRT methods are capable of providing more information about functional impairment than a summed score. 'Shopping' and 'Telephone use' were identified as items that reveal key information about a patient's level of ability, and could be useful screening questions for clinicians. © The Author 2013. Published by Oxford University Press on behalf of the British Geriatrics Society. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  16. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

    NARCIS (Netherlands)

    Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

    2017-01-01

    Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study

  17. The role of attention in item-item binding in visual working memory.

    Science.gov (United States)

    Peterson, Dwight J; Naveh-Benjamin, Moshe

    2017-09-01

    An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Automated Item Generation with Recurrent Neural Networks.

    Science.gov (United States)

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google brain and Amazon Alexa use for language processing and generation.

  19. Item-level factor analysis of the Self-Efficacy Scale.

    Science.gov (United States)

    Bunketorp Käll, Lina

    2014-03-01

    This study explores the internal structure of the Self-Efficacy Scale (SES) using item response analysis. The SES was previously translated into Swedish and modified to encompass all types of pain, not exclusively back pain. Data on perceived self-efficacy in 47 patients with subacute whiplash-associated disorders were derived from a previously conducted randomized-controlled trial. The item-level factor analysis was carried out using a six-step procedure. To further study the item inter-relationships and to determine the underlying structure empirically, the 20 items of the SES were also subjected to principal component analysis with varimax rotation. The analyses showed two underlying factors, named 'social activities' and 'physical activities', with seven items loading on each factor. The remaining six items of the SES appeared to measure somewhat different constructs and need to be analysed further.

  20. Negative affect impairs associative memory but not item memory.

    Science.gov (United States)

    Bisby, James A; Burgess, Neil

    2013-12-17

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 demonstrated that item memory was facilitated by emotional affect, whereas memory for an associated context was reduced. In Experiment 2, arousal was manipulated independently of the memoranda, by a threat of shock, whereby encoding trials occurred under conditions of threat or safety. Memory for context was equally impaired by the presence of negative affect, whether induced by threat of shock or a negative item, relative to retrieval of the context of a neutral item in safety. In Experiment 3, participants were presented with neutral and negative items as paired associates, including all combinations of neutral and negative items. The results showed both above effects: compared to a neutral item, memory for the associate of a negative item (a second item here, context in Experiments 1 and 2) is impaired, whereas retrieval of the item itself is enhanced. Our findings suggest that negative affect impairs associative memory while recognition of a negative item is enhanced. They support dual-processing models in which negative affect or stress impairs hippocampal-dependent associative memory while the storage of negative sensory/perceptual representations is spared or even strengthened.

  1. Using item response theory to address vulnerabilities in FFQ.

    Science.gov (United States)

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  2. Building an Evaluation Scale using Item Response Theory.

    Science.gov (United States)

    Lalor, John P; Wu, Hao; Yu, Hong

    2016-11-01

    Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
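
    The closing claim above, that equal accuracy need not mean equal IRT score, can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model and a simple grid-search maximum-likelihood estimate of ability; the item parameters in ITEMS and all function names are hypothetical and are not taken from the paper.

```python
import math

def p_correct(theta, a, b):
    """2PL model: probability of a correct response for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at a given ability."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def estimate_theta(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood estimate of ability on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (discrimination, difficulty) pairs, ordered easy to hard.
ITEMS = [(1.0, -1.5), (1.0, -0.5), (1.5, 0.5), (2.0, 1.5)]

# Two response patterns with the same accuracy (2 of 4 correct).
easy_pattern = [1, 1, 0, 0]  # only the two easiest items answered correctly
hard_pattern = [0, 0, 1, 1]  # only the two hardest items answered correctly

theta_easy = estimate_theta(ITEMS, easy_pattern)
theta_hard = estimate_theta(ITEMS, hard_pattern)
print(f"accuracy 0.5 -> theta {theta_easy:.2f} (easy items correct)")
print(f"accuracy 0.5 -> theta {theta_hard:.2f} (hard items correct)")
```

    Under these assumed parameters both patterns score 50% accuracy, yet the pattern that answers the harder, more discriminating items correctly receives the higher ability estimate.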

  3. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    Science.gov (United States)

    Lynch, Mervin D.; Chaves, John

    Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs, idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  4. Better assessment of physical function: item improvement is neglected but essential.

    Science.gov (United States)

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models

  5. The Protective Behavioral Strategies for Marijuana Scale: Further examination using item response theory.

    Science.gov (United States)

    Pedersen, Eric R; Huang, Wenjing; Dvorak, Robert D; Prince, Mark A; Hummer, Justin F

    2017-08-01

    Given recent state legislation legalizing marijuana for recreational purposes and majority popular opinion favoring these laws, we developed the Protective Behavioral Strategies for Marijuana scale (PBSM) to identify strategies that may mitigate the harms related to marijuana use among those young people who choose to use the drug. In the current study, we expand on the initial exploratory study of the PBSM to further validate the measure with a large and geographically diverse sample (N = 2,117; 60% women, 30% non-White) of college students from 11 different universities across the United States. We sought to develop a psychometrically sound item bank for the PBSM and to create a short assessment form that minimizes respondent burden and time. Quantitative item analyses, including exploratory and confirmatory factor analyses with item response theory (IRT) and evaluation of differential item functioning (DIF), revealed an item bank of 36 items that was examined for unidimensionality and good content coverage, as well as a short form of 17 items that is free of bias in terms of gender (men vs. women), race (White vs. non-White), ethnicity (Hispanic vs. non-Hispanic), and recreational marijuana use legal status (state recreational marijuana was legal for 25.5% of participants). We also provide a scoring table for easy transformation from sum scores to IRT scale scores. The PBSM item bank and short form associated strongly and negatively with past month marijuana use and consequences. The measure may be useful to researchers and clinicians conducting intervention and prevention programs with young adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  6. The Role of Medial Temporal Lobe Regions in Incidental and Intentional Retrieval of Item and Relational Information in Aging.

    Science.gov (United States)

    Wang, Wei-Chun; Giovanello, Kelly S

    2016-06-01

    Considerable neuropsychological and neuroimaging work indicates that the medial temporal lobes are critical for both item and relational memory retrieval. However, there remain outstanding issues in the literature, namely the extent to which medial temporal lobe regions are differentially recruited during incidental and intentional retrieval of item and relational information, and the extent to which aging may affect these neural substrates. The current fMRI study sought to address these questions; participants incidentally encoded word pairs embedded in sentences and incidental item and relational retrieval were assessed through speeded reading of intact, rearranged, and new word-pair sentences, while intentional item and relational retrieval were assessed through old/new associative recognition of a separate set of intact, rearranged, and new word pairs. Results indicated that, in both younger and older adults, anterior hippocampus and perirhinal cortex indexed incidental and intentional item retrieval in the same manner. In contrast, posterior hippocampus supported incidental and intentional relational retrieval in both age groups and an adjacent cluster in posterior hippocampus was recruited during both forms of relational retrieval for older, but not younger, adults. Our findings suggest that while medial temporal lobe regions do not differentiate between incidental and intentional forms of retrieval, there are distinct roles for anterior and posterior medial temporal lobe regions during retrieval of item and relational information, respectively, and further indicate that posterior regions may, under certain conditions, be over-recruited in healthy aging. © 2016 Wiley Periodicals, Inc.

  7. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  8. Maintenance of item and order information in verbal working memory.

    Science.gov (United States)

    Camos, Valérie; Lagner, Prune; Loaiza, Vanessa M

    2017-09-01

    Although verbal recall of item and order information is well-researched in short-term memory paradigms, there is relatively little research concerning item and order recall from working memory. The following study examined whether manipulating the opportunity for attentional refreshing and articulatory rehearsal in a complex span task differently affected the recall of item- and order-specific information of the memoranda. Five experiments varied the opportunity for articulatory rehearsal and attentional refreshing in a complex span task, but the type of recall was manipulated between experiments (item and order, order only, and item only recall). The results showed that impairing attentional refreshing and articulatory rehearsal similarly affected recall regardless of whether the scoring procedure (Experiments 1 and 4) or recall requirements (Experiments 2, 3, and 5) reflected item- or order-specific recall. This implies that both mechanisms sustain the maintenance of item and order information, and suggests that the common cumulative functioning of these two mechanisms to maintain items could be at the root of order maintenance.

  9. Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)

    Science.gov (United States)

    Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn

    2018-01-01

    The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…

  10. The emotion dysregulation inventory: Psychometric properties and item response theory calibration in an autism spectrum disorder sample.

    Science.gov (United States)

    Mazefsky, Carla A; Yu, Lan; White, Susan W; Siegel, Matthew; Pilkonis, Paul A

    2018-04-06

    Individuals with autism spectrum disorder (ASD) often present with prominent emotion dysregulation that requires treatment but can be difficult to measure. The Emotion Dysregulation Inventory (EDI) was created using methods developed by the Patient-Reported Outcomes Measurement Information System (PROMIS ® ) to capture observable indicators of poor emotion regulation. Caregivers of 1,755 youth with ASD completed 66 candidate EDI items, and the final 30 items were selected based on classical test theory and item response theory (IRT) analyses. The analyses identified two factors: (a) Reactivity, characterized by intense, rapidly escalating, sustained, and poorly regulated negative emotional reactions, and (b) Dysphoria, characterized by anhedonia, sadness, and nervousness. The final items did not show differential item functioning (DIF) based on gender, age, intellectual ability, or verbal ability. Because the final items were calibrated using IRT, even a small number of items offers high precision, minimizing respondent burden. IRT co-calibration of the EDI with related measures demonstrated its superiority in assessing the severity of emotion dysregulation with as few as seven items. Validity of the EDI was supported by expert review, its association with related constructs (e.g., anxiety and depression symptoms, aggression), higher scores in psychiatric inpatients with ASD compared to a community ASD sample, and demonstration of test-retest stability and sensitivity to change. In sum, the EDI provides an efficient and sensitive method to measure emotion dysregulation for clinical assessment, monitoring, and research in youth with ASD of any level of cognitive or verbal ability. Autism Res 2018. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. This paper describes a new measure of poor emotional control called the Emotion Dysregulation Inventory (EDI). 
Caregivers of 1,755 youth with ASD completed candidate items, and advanced statistical

  11. Attention restores discrete items to visual short-term memory.

    Science.gov (United States)

    Murray, Alexandra M; Nobre, Anna C; Clark, Ian A; Cravo, André M; Stokes, Mark G

    2013-04-01

    When a memory is forgotten, is it lost forever? Our study shows that selective attention can restore forgotten items to visual short-term memory (VSTM). In our two experiments, all stimuli presented in a memory array were designed to be equally task relevant during encoding. During the retention interval, however, participants were sometimes given a cue predicting which of the memory items would be probed at the end of the delay. This shift in task relevance improved recall for that item. We found that this type of cuing improved recall for items that otherwise would have been irretrievable, providing critical evidence that attention can restore forgotten information to VSTM. Psychophysical modeling of memory performance has confirmed that restoration of information in VSTM increases the probability that the cued item is available for recall but does not improve the representational quality of the memory. We further suggest that attention can restore discrete items to VSTM.

  12. Validation of the Spanish versions of the long (26 items) and short (12 items) forms of the Self-Compassion Scale (SCS).

    Science.gov (United States)

    Garcia-Campayo, Javier; Navarro-Gil, Mayte; Andrés, Eva; Montero-Marin, Jesús; López-Artal, Lorena; Demarzo, Marcelo Marcos Piva

    2014-01-10

    Self-compassion is a key psychological construct for assessing clinical outcomes in mindfulness-based interventions. The aim of this study was to validate the Spanish versions of the long (26 item) and short (12 item) forms of the Self-Compassion Scale (SCS). The translated Spanish versions of both subscales were administered to two independent samples: Sample 1 was comprised of university students (n = 268) who were recruited to validate the long form, and Sample 2 was comprised of Aragon Health Service workers (n = 271) who were recruited to validate the short form. In addition to SCS, the Mindful Attention Awareness Scale (MAAS), the State-Trait Anxiety Inventory-Trait (STAI-T), the Beck Depression Inventory (BDI) and the Perceived Stress Questionnaire (PSQ) were administered. Construct validity, internal consistency, test-retest reliability and convergent validity were tested. The Confirmatory Factor Analysis (CFA) of the long and short forms of the SCS confirmed the original six-factor model in both scales, showing goodness of fit. Cronbach's α for the 26 item SCS was 0.87 (95% CI = 0.85-0.90) and ranged between 0.72 and 0.79 for the 6 subscales. Cronbach's α for the 12-item SCS was 0.85 (95% CI = 0.81-0.88) and ranged between 0.71 and 0.77 for the 6 subscales. The long (26-item) form of the SCS showed a test-retest coefficient of 0.92 (95% CI = 0.89-0.94). The Intraclass Correlation (ICC) for the 6 subscales ranged from 0.84 to 0.93. The short (12-item) form of the SCS showed a test-retest coefficient of 0.89 (95% CI: 0.87-0.93). The ICC for the 6 subscales ranged from 0.79 to 0.91. The long and short forms of the SCS exhibited a significant negative correlation with the BDI, the STAI and the PSQ, and a significant positive correlation with the MAAS. The correlation between the total score of the long and short SCS form was r = 0.92. The Spanish versions of the long (26-item) and short (12-item) forms of the SCS are valid and

  13. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  14. A photographic method to measure food item intake. Validation in geriatric institutions.

    Science.gov (United States)

    Pouyet, Virginie; Cuvelier, Gérard; Benattar, Linda; Giboreau, Agnès

    2015-01-01

    From both a clinical and research perspective, measuring food intake is an important issue in geriatric institutions. However, weighing food in this context can be complex, particularly when the items remaining on a plate (side dish, meat or fish and sauce) need to be weighed separately following consumption. A method based on photography that involves taking photographs after a meal to determine food intake consequently seems to be a good alternative. This method enables the storage of raw data so that unhurried analyses can be performed to distinguish the food items present in the images. Therefore, the aim of this paper was to validate a photographic method to measure food intake in terms of differentiating food item intake in the context of a geriatric institution. Sixty-six elderly residents took part in this study, which was performed in four French nursing homes. Four dishes of standardized portions were offered to the residents during 16 different lunchtimes. Three non-trained assessors then independently estimated both the total and specific food item intakes of the participants using images of their plates taken after the meal (photographic method) and a reference image of one plate taken before the meal. Total food intakes were also recorded by weighing the food. To test the reliability of the photographic method, agreements between different assessors and agreements among various estimates made by the same assessor were evaluated. To test the accuracy and specificity of this method, food intake estimates for the four dishes were compared with the food intakes determined using the weighed food method. To illustrate the added value of the photographic method, food consumption differences between the dishes were explained by investigating the intakes of specific food items. 
Although the assessors were not specifically trained for this purpose, the results demonstrated that their estimates agreed between assessors and among various estimates made by the same

  15. The Longer We Have to Forget the More We Remember: The Ironic Effect of Postcue Duration in Item-Based Directed Forgetting

    Science.gov (United States)

    Bancroft, Tyler D.; Hockley, William E.; Farquhar, Riley

    2013-01-01

    The effects of the duration of remember and forget cues were examined to test the differential rehearsal account of item-based directed forgetting. In Experiments 1 and 2, cues were shown for 300, 600, or 900 ms, and a directed forgetting effect (better recognition of remember than forget items) was found at each duration. In addition, recognition…

  16. Efficient Algorithms for Segmentation of Item-Set Time Series

    Science.gov (United States)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
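
    The dynamic-programming scheme outlined above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not specify the measure functions or the exact form of the segment difference, so the union measure function and the symmetric-difference count used here are assumptions, as are all function names.

```python
def union_measure(points):
    """Assumed measure function: the segment's item set is the union
    of the item sets at its time points."""
    result = set()
    for item_set in points:
        result |= item_set
    return result

def segment_difference(points, measure=union_measure):
    """Assumed segment difference: total size of the symmetric difference
    between the segment's item set and each time point's item set."""
    segment_set = measure(points)
    return sum(len(segment_set ^ item_set) for item_set in points)

def optimal_segmentation(series, k, measure=union_measure):
    """Split series into k contiguous segments minimizing the summed
    segment difference. Returns (total_cost, [(start, end), ...]) with
    half-open index ranges."""
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(series)")
    # cost[i][j] is the difference of the segment series[i:j].
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i][j] = segment_difference(series[i:j], measure)
    inf = float("inf")
    # best[m][j]: minimal cost of covering series[:j] with m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + cost[i][j]
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    back[m][j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds

# Hypothetical item-set time series with an obvious change after index 2.
series = [{"a", "b"}, {"a", "b"}, {"a"}, {"x", "y"}, {"x", "y"}, {"y"}]
total, segments = optimal_segmentation(series, k=2)
print(total, segments)
```

    Precomputing every segment cost keeps the sketch short at the price of cubic work; the paper's contribution is described as computing these difference values efficiently, which this sketch does not attempt.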

  17. Assessing Psychopathy Among Justice Involved Adolescents with the PCL: YV: An Item Response Theory Examination Across Gender

    Science.gov (United States)

    Tsang, Siny; Schmidt, Karen M.; Vincent, Gina M.; Salekin, Randall T.; Moretti, Marlene M.; Odgers, Candice L.

    2014-01-01

    This study used an item response theory (IRT) model and a large adolescent sample of justice involved youth (N = 1,007, 38% female) to examine the item functioning of the Psychopathy Checklist – Youth Version (PCL: YV). Items that were most discriminating (or most sensitive to changes) of the latent trait (thought to be psychopathy) among adolescents included “Glibness/superficial charm”, “Lack of remorse”, and “Need for stimulation”, whereas items that were least discriminating included “Pathological lying”, “Failure to accept responsibility”, and “Lacks goals.” The items “Impulsivity” and “Irresponsibility” were the most likely to be rated high among adolescents, whereas “Parasitic lifestyle”, and “Glibness/superficial charm” were the most likely to be rated low. Evidence of differential item functioning (DIF) on four of the 13 items was found between boys and girls. “Failure to accept responsibility” and “Impulsivity” were endorsed more frequently to describe adolescent girls than boys at similar levels of the latent trait, and vice versa for “Grandiose sense of self-worth” and “Lacks goals.” The DIF findings suggest that four PCL: YV items function differently between boys and girls. PMID:25580672

  18. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  19. Teoria da Resposta ao Item Teoria de la respuesta al item Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is long-standing, and many studies and methods have been developed toward this goal. Prominent among them is Item Response Theory (IRT), which initially emerged to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it considers each item individually rather than relying on total scores; the conclusions therefore depend not only on the test or questionnaire but on each item that composes it. This article presents this theory, which revolutionized measurement theory.

  20. Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form.

    Science.gov (United States)

    Kisala, Pamela A; Victorson, David; Pace, Natalie; Heinemann, Allen W; Choi, Seung W; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. A total of 716 individuals with SCI completed the trauma items. The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available.

  1. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  2. Analyzing force concept inventory with item response theory

    Science.gov (United States)

    Wang, Jing; Bao, Lei

    2010-10-01

    Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.
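
    The three per-question parameters named above (difficulty, discrimination, and probability of a correct guess) are the parameters of the standard three-parameter logistic (3PL) model. The sketch below shows that item characteristic curve under the assumption that the study used this form; the parameter values are hypothetical and are not FCI estimates.

```python
import math

def icc_3pl(theta, a, b, c):
    """3PL item characteristic curve: probability of a correct answer for
    ability theta, discrimination a, difficulty b, and guessing floor c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical parameters for one multiple-choice question.
a, b, c = 1.2, 0.5, 0.2

for theta in (-3.0, -1.0, 0.5, 2.0, 4.0):
    print(f"theta={theta:+.1f}  P(correct)={icc_3pl(theta, a, b, c):.3f}")
```

    Two properties are worth noting: the curve never drops below the guessing floor c, and at theta equal to the difficulty b the probability is (1 + c) / 2 rather than 0.5.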

  3. Are great apes able to reason from multi-item samples to populations of food items?

    Science.gov (United States)

    Eckert, Johanna; Rakoczy, Hannes; Call, Josep

    2017-10-01

    Inductive learning from limited observations is a cognitive capacity of fundamental importance. In humans, it is underwritten by our intuitive statistics, the ability to draw systematic inferences from populations to randomly drawn samples and vice versa. According to recent research in cognitive development, human intuitive statistics develops early in infancy. Recent work in comparative psychology has produced first evidence for analogous cognitive capacities in great apes who flexibly drew inferences from populations to samples. In the present study, we investigated whether great apes (Pongo abelii, Pan troglodytes, Pan paniscus, Gorilla gorilla) also draw inductive inferences in the opposite direction, from samples to populations. In two experiments, apes saw an experimenter randomly drawing one multi-item sample from each of two populations of food items. The populations differed in their proportion of preferred to neutral items (24:6 vs. 6:24) but apes saw only the distribution of food items in the samples that reflected the distribution of the respective populations (e.g., 4:1 vs. 1:4). Based on this observation they were then allowed to choose between the two populations. Results show that apes seemed to make inferences from samples to populations and thus chose the population from which the more favorable (4:1) sample was drawn in Experiment 1. In this experiment, the more attractive sample not only contained proportionally but also absolutely more preferred food items than the less attractive sample. Experiment 2, however, revealed that when absolute and relative frequencies were disentangled, apes performed at chance level. Whether these limitations in apes' performance reflect true limits of cognitive competence or merely performance limitations due to accessory task demands is still an open question. © 2017 Wiley Periodicals, Inc.

  4. Nickel and cobalt release from jewellery and metal clothing items in Korea.

    Science.gov (United States)

    Cheong, Seung Hyun; Choi, You Won; Choi, Hae Young; Byun, Ji Yeon

    2014-01-01

    In Korea, the prevalence of nickel allergy has shown a sharply increasing trend. Cobalt contact allergy is often associated with concomitant reactions to nickel, and is more common in Korea than in western countries. The aim of the present study was to investigate the prevalence of items that release nickel and cobalt on the Korean market. A total of 471 items that included 193 branded jewellery, 202 non-branded jewellery and 76 metal clothing items were sampled and studied with a dimethylglyoxime (DMG) test and a cobalt spot test to detect nickel and cobalt release, respectively. Nickel release was detected in 47.8% of the tested items. The positive rates in the DMG test were 12.4% for the branded jewellery, 70.8% for the non-branded jewellery, and 76.3% for the metal clothing items. Cobalt release was found in 6.2% of items. Among the types of jewellery, belts and hair pins showed higher positive rates in both the DMG test and the cobalt spot test. Our study shows that the prevalence of items that release nickel or cobalt among jewellery and metal clothing items is high in Korea. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. A Differential Item Functioning (DIF) Analysis of the Communicative Participation Item Bank (CPIB): Comparing Individuals with Parkinson's Disease from the United States and New Zealand

    Science.gov (United States)

    Baylor, Carolyn; McAuliffe, Megan J.; Hughes, Louise E.; Yorkston, Kathryn; Anderson, Tim; Jiseon, Kim; Amtmann, Dagmar

    2014-01-01

    Purpose: To examine the cross-cultural applicability of the Communicative Participation Item Bank (CPIB) through a comparison of respondents with Parkinson's disease (PD) from the United States and New Zealand. Method: A total of 428 respondents--218 from the United States and 210 from New Zealand--completed the self-report CPIB and a series of…

  6. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  7. Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

    Science.gov (United States)

    Cher Wong, Cheow

    2015-01-01

    Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

  8. Dissociation between source and item memory in Parkinson's disease

    Institute of Scientific and Technical Information of China (English)

    Hu Panpan; Li Youhai; Ma Huijuan; Xi Chunhua; Chen Xianwen; Wang Kai

    2014-01-01

    Background: Episodic memory includes information about item memory and source memory. Many studies support the hypothesis that these two memory systems are implemented by different brain structures. The aim of this study was to investigate the characteristics of item memory and source memory processing in patients with Parkinson's disease (PD), and to further verify the hypothesis of a dual-process model of source and item memory. Methods: We established a neuropsychological battery to measure the performance of item memory and source memory. In total, 35 PD individuals and 35 matched healthy controls (HC) were administered the battery. The item memory task consisted of the learning and recognition of high-frequency national Chinese characters; the source memory task consisted of the learning and recognition of three modes (character, picture, and image) of objects. Results: Compared with the controls, the idiopathic PD patients were impaired in source memory (PD vs. HC: 0.65±0.06 vs. 0.72±0.09, P=0.001), but not in item memory (PD vs. HC: 0.65±0.07 vs. 0.67±0.08, P=0.240). Conclusions: The present experiment provides evidence for dissociation between item and source memory in PD patients, thereby strengthening the claim that item and source memory rely on different brain structures. PD patients show poor source memory, in which dopamine plays a critical role.

  9. Adaptive screening for depression--recalibration of an item bank for the assessment of depression in persons with mental and somatic diseases and evaluation in a simulated computer-adaptive test environment.

    Science.gov (United States)

    Forkmann, Thomas; Kroehne, Ulf; Wirtz, Markus; Norra, Christine; Baumeister, Harald; Gauggel, Siegfried; Elhan, Atilla Halil; Tennant, Alan; Boecker, Maren

    2013-11-01

    This study conducted a simulation study for computer-adaptive testing based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to computer-adaptive test simulation, the ADIB was newly calibrated. Recalibration was performed in a sample of 161 patients treated for a depressive syndrome, 103 patients from cardiology, and 103 patients from otorhinolaryngology (mean age 44.1, SD=14.0; 44.7% female) and was cross-validated in a sample of 117 patients undergoing rehabilitation for cardiac diseases (mean age 58.4, SD=10.5; 24.8% women). Unidimensionality of the item bank was checked and a Rasch analysis was performed that evaluated local dependency (LD), differential item functioning (DIF), item fit and reliability. CAT simulation was conducted with the total sample and additional simulated data. Recalibration resulted in a strictly unidimensional item bank with 36 items, showing good Rasch model fit (item fit residualsLD. CAT simulation revealed that 13 items on average were necessary to estimate depression in the range of -2 to +2 logits when terminating at SE≤0.32 and 4 items if using SE≤0.50. Receiver Operating Characteristics analysis showed that θ estimates based on the CAT algorithm have good criterion validity with regard to depression diagnoses (Area Under the Curve≥.78 for all cut-off criteria). The recalibration of the ADIB succeeded and the simulation studies conducted suggest that it has good screening performance in the samples investigated and that it may reasonably add to the improvement of depression assessment. © 2013.

  10. MMPI-2 Item Endorsements in Dissociative Identity Disorder vs. Simulators.

    Science.gov (United States)

    Brand, Bethany L; Chasson, Gregory S; Palermo, Cori A; Donato, Frank M; Rhodes, Kyle P; Voorhees, Emily F

    2016-03-01

    Elevated scores on some MMPI-2 (Minnesota Multiphasic Personality Inventory-2) validity scales are common among patients with dissociative identity disorder (DID), which raises questions about the validity of their responses. Such patients show elevated scores on atypical answers (F), F-psychopathology (Fp), atypical answers in the second half of the test (FB), schizophrenia (Sc), and depression (D) scales, with Fp showing the greatest utility in distinguishing them from coached and uncoached DID simulators. In the current study, we investigated the items on the MMPI-2 F, Fp, FB, Sc, and D scales that were most and least commonly endorsed by participants with DID in our 2014 study and compared these responses with those of coached and uncoached DID simulators. The comparisons revealed that patients with DID most frequently endorsed items related to dissociation, trauma, depression, fearfulness, conflict within family, and self-destructiveness. The coached group more successfully imitated item endorsements of the DID group than did the uncoached group. However, both simulating groups, especially the uncoached group, frequently endorsed items that were uncommonly endorsed by the DID group. The uncoached group endorsed items consistent with popular media portrayals of people with DID being violent, delusional, and unlawful. These results suggest that item endorsement patterns can provide useful information to clinicians making determinations about whether an individual is presenting with DID or feigning. © 2016 American Academy of Psychiatry and the Law.

  11. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  12. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  13. An empirical comparison of Item Response Theory and Classical Test Theory

    Directory of Open Access Journals (Sweden)

    Špela Progar

    2008-11-01

    Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regard to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.

  14. An item-oriented recommendation algorithm on cold-start problem

    Science.gov (United States)

    Qiu, Tian; Chen, Guang; Zhang, Zi-Ke; Zhou, Tao

    2011-09-01

    Based on a hybrid algorithm incorporating the heat conduction and probability spreading processes (Proc. Natl. Acad. Sci. U.S.A., 107 (2010) 4511), in this letter, we propose an improved method by introducing an item-oriented function, focusing on solving the dilemma of the recommendation accuracy between the cold and popular items. Differently from previous works, the present algorithm does not require any additional information (e.g., tags). Further experimental results obtained in three real datasets, RYM, Netflix and MovieLens, show that, compared with the original hybrid method, the proposed algorithm significantly enhances the recommendation accuracy of the cold items, while it keeps the recommendation accuracy of the overall and the popular items. This work might shed some light on both understanding and designing effective methods for long-tailed online applications of recommender systems.

  15. Improved Approximation Algorithms for Item Pricing with Bounded Degree and Valuation

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya

    When a store sells items to customers, the store wishes to decide the prices of the items to maximize its profit. If the store sells the items with low (resp. high) prices, the customers buy more (resp. fewer) items, which provides less profit to the store. It would be hard for the store to decide the prices of items. Assume that a store has a set V of n items and there is a set C of m customers who wish to buy those items. The goal of the store is to decide the price of each item to maximize its profit. We refer to this maximization problem as an item pricing problem. We classify the item pricing problems according to how many items the store can sell or how the customers value the items. If the store can sell every item i with unlimited (resp. limited) amount, we refer to this as unlimited supply (resp. limited supply). We say that the item pricing problem is single-minded if each customer j ∈ C wishes to buy a set e_j ⊆ V of items and assigns valuation w(e_j) ≥ 0. For the single-minded item pricing problems (in unlimited supply), Balcan and Blum regarded them as weighted k-hypergraphs and gave several approximation algorithms. In this paper, we focus on the (pseudo) degree of k-hypergraphs and the valuation ratio, i.e., the ratio between the smallest and the largest valuations. Then for the single-minded item pricing problems (in unlimited supply), we show improved approximation algorithms (for k-hypergraphs, general graphs, bipartite graphs, etc.) with respect to the maximum (pseudo) degree and the valuation ratio.
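    The objective of the single-minded, unlimited-supply problem described in this abstract can be illustrated with a short sketch. It only evaluates the profit of a given price vector; it is not one of the approximation algorithms from the paper. The function name, the example items and the rule that a customer buys exactly when the bundle price does not exceed the valuation are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# A customer is modelled as (bundle e_j, valuation w(e_j)). Hypothetical type alias.
Customer = Tuple[FrozenSet[str], float]


def profit(prices: Dict[str, float], customers: List[Customer]) -> float:
    """Profit of a price vector under unlimited supply.

    Assumption: a single-minded customer buys the whole bundle if and only if
    the sum of its item prices is at most the valuation; otherwise buys nothing.
    """
    total = 0.0
    for bundle, valuation in customers:
        cost = sum(prices[item] for item in bundle)
        if cost <= valuation:
            total += cost  # the store earns the bundle price
    return total


# Illustrative data only (not from the paper).
customers = [
    (frozenset({"a", "b"}), 10.0),
    (frozenset({"b", "c"}), 6.0),
    (frozenset({"a"}), 4.0),
]

low = profit({"a": 1.0, "b": 1.0, "c": 1.0}, customers)   # everyone buys, little profit
mid = profit({"a": 4.0, "b": 3.0, "c": 3.0}, customers)   # everyone still buys
high = profit({"a": 9.0, "b": 9.0, "c": 9.0}, customers)  # nobody buys
```

    With these toy numbers both very low and very high prices earn less than the intermediate price vector, which is the trade-off the abstract describes.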

  16. Changes in the expression of collagen genes show two stages in chondrocyte differentiation in vitro

    OpenAIRE

    1988-01-01

    This report deals with the quantitation of both mRNA and transcription activity of type I collagen gene and of three cartilage-specific collagens (types II, IX, and X) during in vitro differentiation of chick chondrocytes. Differentiation was obtained by transferal to suspension culture of dedifferentiated cells passaged for 3 wk as adherent cells. The type I collagen mRNA, highly represented in the dedifferentiated cells, rapidly decreased during chondrocyte differentiation. On the contrary,...

  17. Human colon cancer profiles show differential microRNA expression depending on mismatch repair status and are characteristic of undifferentiated proliferative states

    International Nuclear Information System (INIS)

    Sarver, Aaron L; Cunningham, Julie M; Subramanian, Subbaya; Wang, Liang; Smyrk, Tom C; Rodrigues, Cecilia MP; Thibodeau, Stephen N; Steer, Clifford J; French, Amy J; Borralho, Pedro M; Thayanithy, Venugopal; Oberg, Ann L; Silverstein, Kevin AT; Morlan, Bruce W; Riska, Shaun M; Boardman, Lisa A

    2009-01-01

    Colon cancer arises from the accumulation of multiple genetic and epigenetic alterations to normal colonic tissue. microRNAs (miRNAs) are small, non-coding regulatory RNAs that post-transcriptionally regulate gene expression. Differential miRNA expression in cancer versus normal tissue is a common event and may be pivotal for tumor onset and progression. To identify miRNAs that are differentially expressed in tumors and tumor subtypes, we carried out highly sensitive expression profiling of 735 miRNAs on samples obtained from a statistically powerful set of tumors (n = 80) and normal colon tissue (n = 28) and validated a subset of this data by qRT-PCR. Tumor specimens showed highly significant and large fold change differential expression of the levels of 39 miRNAs including miR-135b, miR-96, miR-182, miR-183, miR-1, and miR-133a, relative to normal colon tissue. Significant differences were also seen in 6 miRNAs including miR-31 and miR-592, in the direct comparison of tumors that were deficient or proficient for mismatch repair. Examination of the genomic regions containing differentially expressed miRNAs revealed that they were also differentially methylated in colon cancer at a far greater rate than would be expected by chance. A network of interactions between these miRNAs and genes associated with colon cancer provided evidence for the role of these miRNAs as oncogenes by attenuation of tumor suppressor genes. Colon tumors show differential expression of miRNAs depending on mismatch repair status. miRNA expression in colon tumors has an epigenetic component and altered expression that may reflect a reversion to regulatory programs characteristic of undifferentiated proliferative developmental states

  18. Optimization approach of background value and initial item for improving prediction precision of GM(1,1) model

    Institute of Scientific and Technical Information of China (English)

    Yuhong Wang; Qin Liu; Jianrong Tang; Wenbin Cao; Xiaozhong Li

    2014-01-01

    A combination method of optimization of the background value and optimization of the initial item is proposed. The sequences of the unbiased exponential distribution are simulated and predicted through the optimization of the background value in grey differential equations. The principle of the new information priority in the grey system theory and the rationality of the initial item in the original GM(1,1) model are fully expressed through the improvement of the initial item in the proposed time response function. A numerical example is employed to illustrate that the proposed method is able to simulate and predict sequences of raw data with the unbiased exponential distribution and has better simulation performance and prediction precision than the original GM(1,1) model.
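    For orientation, the original GM(1,1) model that this abstract uses as its baseline can be sketched as follows. This is the textbook formulation (accumulated sequence, mean background value, least-squares estimate of a and b, exponential time response with the first observation as initial item), not the optimized method proposed by the authors. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def fit_gm11(x0: List[float]) -> Tuple[float, float]:
    """Estimate (a, b) of the grey equation x0(k) + a*z(k) = b by least squares."""
    # Accumulated generating sequence x1(k) = x0(0) + ... + x0(k).
    x1, running = [], 0.0
    for value in x0:
        running += value
        x1.append(running)
    # Background value of the original model: mean of consecutive x1 terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    # Simple linear regression y = slope*z + b, where slope = -a.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def predict_gm11(x0_first: float, a: float, b: float, steps: int) -> List[float]:
    """Restore x0 estimates from the time response of the accumulated sequence."""
    def x1_hat(k: int) -> float:
        return (x0_first - b / a) * math.exp(-a * k) + b / a

    return [x0_first] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, steps)]


# Illustrative exponential series (not data from the paper).
series = [2.0 * 1.1 ** k for k in range(6)]
a, b = fit_gm11(series)
fitted = predict_gm11(series[0], a, b, len(series))
```

    On a purely exponential series the fitted values are close to, but not exactly equal to, the data. A small bias of this kind in the original model is what optimizing the background value and the initial item is meant to reduce.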

  19. HIV/AIDS knowledge among men who have sex with men: applying the item response theory.

    Science.gov (United States)

    Gomes, Raquel Regina de Freitas Magalhães; Batista, José Rodrigues; Ceccato, Maria das Graças Braga; Kerr, Lígia Regina Franco Sansigolo; Guimarães, Mark Drew Crosland

    2014-04-01

    To evaluate the level of HIV/AIDS knowledge among men who have sex with men in Brazil using the latent trait model estimated by Item Response Theory. Multicenter, cross-sectional study, carried out in ten Brazilian cities between 2008 and 2009. Adult men who have sex with men were recruited (n = 3,746) through Respondent Driven Sampling. HIV/AIDS knowledge was ascertained through ten statements by face-to-face interview and latent scores were obtained through two-parameter logistic modeling (difficulty and discrimination) using Item Response Theory. Differential item functioning was used to examine each item characteristic curve by age and schooling. Overall, the HIV/AIDS knowledge scores using Item Response Theory did not exceed 6.0 (scale 0-10), with mean and median values of 5.0 (SD = 0.9) and 5.3, respectively, with 40.7% of the sample with knowledge levels below the average. Some beliefs still exist in this population regarding the transmission of the virus by insect bites, by using public restrooms, and by sharing utensils during meals. With regard to the difficulty and discrimination parameters, eight items were located below the mean of the scale and were considered very easy, and four items presented very low discrimination parameter (items contributed to the inaccuracy of the measurement of knowledge among those with median level and above. Item Response Theory analysis, which focuses on the individual properties of each item, allows measures to be obtained that do not vary or depend on the questionnaire, which provides better ascertainment and accuracy of knowledge scores. Valid and reliable scales are essential for monitoring HIV/AIDS knowledge among the men who have sex with men population over time and in different geographic regions, and this psychometric model brings this advantage.
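    The two-parameter logistic model mentioned in this abstract can be illustrated with a minimal sketch. It shows the standard item characteristic curve with a discrimination parameter a and a difficulty parameter b, plus the standard 2PL item information function. The parameter values below are invented for illustration and are not estimates from the study.

```python
import math


def icc_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical items: an easy item (difficulty below the scale mean of 0)
# and an item with very low discrimination.
p_easy = icc_2pl(0.0, a=1.2, b=-1.5)      # average respondent, easy item
info_sharp = item_information(0.0, a=2.0, b=0.0)
info_flat = item_information(0.0, a=0.3, b=0.0)
```

    An item with a low discrimination parameter yields little information at every ability level, which is one way to read the abstract's remark that such items contribute to measurement inaccuracy.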

  20. [Social anxiety and self-esteem: Hungarian validation of the "Brief Fear of Negative Evaluation Scale - Straightforward Items"].

    Science.gov (United States)

    Perczel-Forintos, Dóra; Kresznerits, Szilvia

    2017-06-01

    Although social anxiety disorder (SAD) is the third most frequent emotional disorder with 13-15% prevalence rate, it remains unrecognized very often. Social phobia is associated with low self-esteem, high self-criticism and fear of negative evaluation by others. It shows high comorbidity with depression, alcoholism, drug addiction and eating disorders. To adapt the widely used "Fear of Negative Evaluation" (FNE) social phobia questionnaire. Anxiety and mood disorder patients (n = 255) completed the Fear of Negative Evaluation Scale (30, 12 and 8 item-versions) as well as social cognition, anxiety and self-esteem questionnaires. All the three versions of the FNE have strong internal validity (α>0.83) and moderate significant correlation with low self-esteem, negative social cognitions and anxiety. The short 8-item BFNE-S has the strongest discriminative value in differentiating patients with social phobia and with other emotional disorders. The Hungarian version of the BFNE-S is an effective tool for the quick recognition of social phobia. Orv Hetil. 2017; 158(22): 843-850.

  1. Association of breast cancer risk with genetic variants showing differential allelic expression

    DEFF Research Database (Denmark)

    Hamdi, Yosr; Soucy, Penny; Adoue, Véronique

    2016-01-01

    There are significant inter-individual differences in the levels of gene expression. Through modulation of gene expression, cis-acting variants represent an important source of phenotypic variation. Consequently, cis-regulatory SNPs associated with differential allelic expression are functional...

  2. Separating relational from item load effects in paired recognition: temporoparietal and middle frontal gyral activity with increased associates, but not items during encoding and retention.

    Science.gov (United States)

    Phillips, Steven; Niki, Kazuhisa

    2002-10-01

    Working memory is affected by items stored and the relations between them. However, separating these factors has been difficult, because increased items usually accompany increased associations/relations. Hence, some have argued, relational effects are reducible to item effects. We overcome this problem by manipulating index length: the fewest number of item positions at which there is a unique item, or tuple of items (if length >1), for every instance in the relational (memory) set. Longer indexes imply greater similarity (number of shared items) between instances and higher load on encoding processes. Subjects were given lists of study pairs and asked to make a recognition judgement. The number of unique items and index length in the three list conditions were: (1) AB, CD: four/one; (2) AB, CD, EF: six/one; and (3) AB, AD, CB: four/two, respectively. Japanese letters were used in Experiments 1 (kanji-ideograms) and 2 (hiragana-phonograms); numbers in Experiment 3; and shapes generated from Fourier descriptors in Experiment 4. Across all materials, right dominant temporoparietal and middle frontal gyral activity was found with increased index length, but not items during study. In Experiment 5, a longer delay was used to isolate retention effects in the absence of visual stimuli. Increased left hemispheric activity was observed in the precuneus, middle frontal gyrus, and superior temporal gyrus with increased index length for the delay period. These results show that relational load is not reducible to item load.
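    The index length manipulation can be made concrete with a small sketch. It assumes one reading of the definition in this abstract: the index length is the smallest number of item positions whose contents, taken together, are unique for every instance in the set. The function name and this interpretation are assumptions, not taken from the paper.

```python
from itertools import combinations
from typing import Sequence, Tuple


def index_length(instances: Sequence[Tuple[str, ...]]) -> int:
    """Smallest number of positions that uniquely identifies every instance."""
    arity = len(instances[0])
    for size in range(1, arity + 1):
        for positions in combinations(range(arity), size):
            # Project each instance onto the chosen positions.
            keys = {tuple(inst[p] for p in positions) for inst in instances}
            if len(keys) == len(instances):
                return size
    raise ValueError("instances are not all distinct")


def unique_items(instances: Sequence[Tuple[str, ...]]) -> int:
    """Number of distinct items appearing anywhere in the set."""
    return len({item for inst in instances for item in inst})


# The three list conditions named in the abstract.
cond1 = [("A", "B"), ("C", "D")]
cond2 = [("A", "B"), ("C", "D"), ("E", "F")]
cond3 = [("A", "B"), ("A", "D"), ("C", "B")]
```

    Under this reading, conditions 1 and 2 differ in item count but share an index length of one, while condition 3 matches condition 1 in item count but needs both positions to tell its pairs apart.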

  3. Effect of study context on item recollection.

    Science.gov (United States)

    Skinner, Erin I; Fernandes, Myra A

    2010-07-01

    We examined how visual context information provided during encoding, and unrelated to the target word, affected later recollection for words presented alone using a remember-know paradigm. Experiments 1A and 1B showed that participants had better overall memory-specifically, recollection-for words studied with pictures of intact faces than for words studied with pictures of scrambled or inverted faces. Experiment 2 replicated these results and showed that recollection was higher for words studied with pictures of faces than when no image accompanied the study word. In Experiment 3 participants showed equivalent memory for words studied with unique faces as for those studied with a repeatedly presented face. Results suggest that recollection benefits when visual context information high in meaningful content accompanies study words and that this benefit is not related to the uniqueness of the context. We suggest that participants use elaborative processes to integrate item and meaningful contexts into ensemble information, improving subsequent item recollection.

  4. Complex versus Simple Modeling for DIF Detection: When the Intraclass Correlation Coefficient (ρ) of the Studied Item Is Less Than the ρ of the Total Score

    Science.gov (United States)

    Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon

    2014-01-01

    Previous research has demonstrated that differential item functioning (DIF) methods that do not account for multilevel data structure could result in too frequent rejection of the null hypothesis (i.e., no DIF) when the intraclass correlation coefficient (ρ) of the studied item was the same as the ρ of the total score. The current study extended…

  5. Brief Sensation Seeking Scale: Latent structure of 8-item and 4-item versions in Peruvian adolescents.

    Science.gov (United States)

    Merino-Soto, Cesar; Salas Blas, Edwin

    2018-01-01

    This research aimed to validate two brief sensation seeking scales with Peruvian adolescents: the eight-item scale (BSSS8; Hoyle, Stephenson, Palmgreen, Lorch, & Donohew, 2002) and the four-item scale (BSSS4; Stephenson, Hoyle, Slater, & Palmgreen, 2003). Questionnaires were administered to 618 voluntary participants, with an average age of 13.6 years, from different levels of high school in state and private schools in a district in the south of Lima. The internal structure of both short versions was analyzed using three models: a) unidimensional (M1), b) oblique or related dimensions (M2), and c) the bifactor model (M3). Results show that for both instruments a single dimension best represents the variability of the items, a fact that can be explained both by the complexity of the concept and by the small number of items representing each factor, which is more noticeable in the BSSS4. Reliability is within levels found by previous studies: alpha was .745 for the BSSS8 and .643 for the BSSS4; the omega coefficient was .747 for the BSSS8 and .651 for the BSSS4. These are considered suitable for the type of instruments studied. Based on the correlation between the two instruments, it was found that there are satisfactory levels of equivalence between the BSSS8 and BSSS4. However, it is recommended that the BSSS4 be used mainly for research and for the purpose of describing populations.

  6. Erasing and blurring memories: The differential impact of interference on separate aspects of forgetting.

    Science.gov (United States)

    Sun, Sol Z; Fidalgo, Celia; Barense, Morgan D; Lee, Andy C H; Cant, Jonathan S; Ferber, Susanne

    2017-11-01

    Interference disrupts information processing across many timescales, from immediate perception to memory over short and long durations. The widely held similarity assumption states that as similarity between interfering information and memory contents increases, so too does the degree of impairment. However, information is lost from memory in different ways. For instance, studied content might be erased in an all-or-nothing manner. Alternatively, information may be retained but the precision might be degraded or blurred. Here, we asked whether the similarity of interfering information to memory contents might differentially impact these 2 aspects of forgetting. Observers studied colored images of real-world objects, each followed by a stream of interfering objects. Across 4 experiments, we manipulated the similarity between the studied object and the interfering objects in circular color space. After interference, memory for object color was tested continuously on a color wheel, which in combination with mixture modeling, allowed for estimation of how erasing and blurring differentially contribute to forgetting. In contrast to the similarity assumption, we show that highly dissimilar interfering items caused the greatest increase in random guess responses, suggesting a greater frequency of memory erasure (Experiments 1-3). Moreover, we found that observers were generally able to resist interference from highly similar items, perhaps through surround suppression (Experiments 1 and 4). Finally, we report that interference from items of intermediate similarity tended to blur or decrease memory precision (Experiments 3 and 4). These results reveal that the nature of visual similarity can differentially alter how information is lost from memory. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  7. A Case of Carcinoma Showing Thymus-Like Differentiation with a Rapidly Lethal Course

    Directory of Open Access Journals (Sweden)

    Tomohiro Nogami

    2014-12-01

    A 55-year-old woman underwent a total thyroidectomy for carcinoma showing thymus-like differentiation (CASTLE). The patient was referred to our hospital after the tumor was found to have directly invaded the cervical esophagus and the entire circumference of the trachea. A total thyroidectomy was performed, followed by end-to-end anastomosis of the trachea, suprahyoid release and dissection of bilateral pulmonary ligaments. No major complications, including anastomotic dehiscence or stenosis, were observed. The patient experienced some swallowing disturbances and hoarseness during the perioperative period but fully recovered. Radiotherapy to the neck was performed as an adjuvant therapy. Eleven months after surgery, lower back pain and right leg numbness developed and led to gait inability. Multiple lung and bone recurrences were observed, but no local recurrence. Palliative radiotherapy to the bone metastasis was performed. The patient died of pleural metastasis 14 months after the initial diagnosis of CASTLE.

  8. Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form.

    Science.gov (United States)

    Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  9. An Anthropologist among the Psychometricians: Assessment Events, Ethnography, and Differential Item Functioning in the Mongolian Gobi

    Science.gov (United States)

    Maddox, Bryan; Zumbo, Bruno D.; Tay-Lim, Brenda; Qu, Demin

    2015-01-01

    This article explores the potential for ethnographic observations to inform the analysis of test item performance. In 2010, a standardized, large-scale adult literacy assessment took place in Mongolia as part of the United Nations Educational, Scientific and Cultural Organization Literacy Assessment and Monitoring Programme (LAMP). In a novel form…

  10. Measuring performance at trade shows

    DEFF Research Database (Denmark)

    Hansen, Kåre

    2004-01-01

    Trade shows are an increasingly important marketing activity for many companies, but current measures of trade show performance do not adequately capture dimensions important to exhibitors. Based on the marketing literature's outcome and behavior-based control system taxonomy, a model is built...... that captures an outcome-based sales dimension and four behavior-based dimensions (i.e. information-gathering, relationship building, image building, and motivation activities). A 16-item instrument is developed for assessing exhibitors' perceptions of their trade show performance. The paper presents evidence...

  11. The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

    Science.gov (United States)

    Siskind, Theresa G.; Anderson, Lorin W.

    The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…

  12. Rats Remember Items in Context Using Episodic Memory.

    Science.gov (United States)

    Panoz-Brown, Danielle; Corbin, Hannah E; Dalecki, Stefan J; Gentry, Meredith; Brotheridge, Sydney; Sluka, Christina M; Wu, Jie-En; Crystal, Jonathon D

    2016-10-24

    Vivid episodic memories in people have been characterized as the replay of unique events in sequential order [1-3]. Animal models of episodic memory have successfully documented episodic memory of a single event (e.g., [4-8]). However, a fundamental feature of episodic memory in people is that it involves multiple events, and notably, episodic memory impairments in human diseases are not limited to a single event. Critically, it is not known whether animals remember many unique events using episodic memory. Here, we show that rats remember many unique events and the contexts in which the events occurred using episodic memory. We used an olfactory memory assessment in which new (but not old) odors were rewarded using 32 items. Rats were presented with 16 odors in one context and the same odors in a second context. To attain high accuracy, the rats needed to remember item in context because each odor was rewarded as a new item in each context. The demands on item-in-context memory were varied by assessing memory with 2, 3, 5, or 15 unpredictable transitions between contexts, and item-in-context memory survived a 45 min retention interval challenge. When the memory of item in context was put in conflict with non-episodic familiarity cues, rats relied on item in context using episodic memory. Our findings suggest that rats remember multiple unique events and the contexts in which these events occurred using episodic memory and support the view that rats may be used to model fundamental aspects of human cognition. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Endometrial Cancer Side-Population Cells Show Prominent Migration and Have a Potential to Differentiate into the Mesenchymal Cell Lineage

    Science.gov (United States)

    Kato, Kiyoko; Takao, Tomoka; Kuboyama, Ayumi; Tanaka, Yoshihiro; Ohgami, Tatsuhiro; Yamaguchi, Shinichiro; Adachi, Sawako; Yoneda, Tomoko; Ueoka, Yousuke; Kato, Keiji; Hayashi, Shinichi; Asanoma, Kazuo; Wake, Norio

    2010-01-01

    Cancer stem-like cell subpopulations, referred to as “side-population” (SP) cells, have been identified in several tumors based on their ability to efflux the fluorescent dye Hoechst 33342. Although SP cells have been identified in the normal human endometrium and endometrial cancer, little is known about their characteristics. In this study, we isolated and characterized the SP cells in human endometrial cancer cells and in rat endometrial cells expressing oncogenic human K-Ras protein. These SP cells showed i) reduction in the expression levels of differentiation markers; ii) long-term proliferative capacity of the cell cultures; iii) self-renewal capacity in vitro; iv) enhancement of migration, lamellipodia, and uropodia formation; and v) enhanced tumorigenicity. In nude mice, SP cells formed large, invasive tumors, which were composed of both tumor cells and stromal-like cells with enriched extracellular matrix. The expression levels of vimentin, α-smooth muscle actin, and collagen III were enhanced in SP tumors compared with the levels in non-SP tumors. In addition, analysis of microdissected samples and fluorescence in situ hybridization of Hec1-SP-tumors showed that the stromal-like cells with enriched extracellular matrix contained human DNA, confirming that the stromal-like cells were derived from the inoculated cells. Moreover, in a Matrigel assay, SP cells differentiated into α-smooth muscle actin-expressing cells. These findings demonstrate that SP cells have cancer stem-like cell features, including the potential to differentiate into the mesenchymal cell lineage. PMID:20008133

  14. Win, Place, Show

    Science.gov (United States)

    Fetros, John G.

    1972-01-01

    Items considered to be a basic reference collection in the area of thoroughbred breeding and racing are given along with annotations which identify the content, usage and value of the particular item in the complete collection. (7 references) (Author/NH)

  15. Optimal lot sizing in screening processes with returnable defective items

    Science.gov (United States)

    Vishkaei, Behzad Maleki; Niaki, S. T. A.; Farhangi, Milad; Rashti, Mehdi Ebrahimnezhad Moghadam

    2014-07-01

    This paper is an extension of Hsu and Hsu (Int J Ind Eng Comput 3(5):939-948, 2012) aiming to determine the optimal order quantity of product batches that contain defective items with percentage nonconforming following a known probability density function. The orders are subject to a 100% screening process at a rate higher than the demand rate. Shortage is backordered, and defective items in each ordering cycle are stored in a warehouse to be returned to the supplier when a new order is received. Although the retailer does not sell defective items at a lower price and only trades perfect items (to avoid loss), a higher holding cost is incurred to store defective items. Using the renewal-reward theorem, the optimal order and shortage quantities are determined. Some numerical examples are solved at the end to clarify the applicability of the proposed model and to compare the new policy to an existing one. The results show that the new policy provides better expected profit per time.

  16. Evaluation of the box and blocks test, stereognosis and item banks of activity and upper extremity function in youths with brachial plexus birth palsy.

    Science.gov (United States)

    Mulcahey, Mary Jane; Kozin, Scott; Merenda, Lisa; Gaughan, John; Tian, Feng; Gogola, Gloria; James, Michelle A; Ni, Pengsheng

    2012-09-01

    One of the greatest limitations to measuring outcomes in pediatric orthopaedics is the lack of effective instruments. Computer adaptive testing, which uses large item banks, selects only items that are relevant to a child's function based on a previous response and filters out items that are too easy or too hard or simply not relevant to the child. In this way, computer adaptive testing provides a meaningful, efficient, and precise method to evaluate patient-reported outcomes. Banks of items that assess activity and upper extremity (UE) function have been developed for children with cerebral palsy and have enabled computer adaptive tests that showed strong reliability, strong validity, and broader content range when compared with traditional instruments. Because of the void in instruments for children with brachial plexus birth palsy (BPBP) and the importance of having a UE and activity scale, we were interested in how well these items worked in this population. A cross-sectional, multicenter study involving 200 children with BPBP was conducted. The box and block test (BBT) and stereognosis tests were administered and patient reports of UE function and activity were obtained with the cerebral palsy item banks. Differential item functioning (DIF) was examined. Predictive ability of the BBT and stereognosis was evaluated with a proportional odds logistic regression model. Spearman correlation coefficients (rs) were calculated to examine correlation between stereognosis and the BBT and between individual stereognosis items and the total stereognosis score. Six of the 86 items showed DIF, indicating that the activity and UE item banks may be useful for computer adaptive tests for children with BPBP. The penny and the button were the strongest predictors of impairment level (odds ratio=0.34 to 0.40). There was a good positive relationship between total stereognosis and BBT scores (rs=0.60). The BBT had a good negative (rs=-0.55) and good positive (rs=0.55) relationship with

  17. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a

  18. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples...... are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules...... additive in costs....

  19. Generalizability theory and item response theory

    OpenAIRE

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...

  20. The randomly renewed general item and the randomly inspected item with exponential life distribution

    International Nuclear Information System (INIS)

    Schneeweiss, W.G.

    1979-01-01

    For a randomly renewed item the probability distributions of the time to failure and of the duration of down time and the expectations of these random variables are determined. Moreover, it is shown that the same theory applies to randomly checked items with exponential probability distribution of life such as electronic items. The case of periodic renewals is treated as an example. (orig.)

  1. Calcium imaging shows differential sensitivity to cooling and communication in luminous transgenic plants.

    Science.gov (United States)

    Campbell, A K; Trewavas, A J; Knight, M R

    1996-03-01

    Imaging of a recombinant bioluminescent Ca2+ indicator, aequorin, in an entire organism showed three novel features of Ca2+ signals in plants. First, cooling the plant from 25 degrees C to 2 degrees C demonstrated differential sensitivities between organs, the roots firing a Ca2+ signal at some 8-10 degrees C higher than the cotyledons. Secondly, prolonged cooling provoked Ca2+ oscillations, but only in the cotyledons. These oscillations occurred with a frequency of 100 s and damped down within 800 s. Thirdly, cooling the roots of mature plants triggered a Ca2+ signal in the leaves, as a result of organ-organ communication. However, warming and then recooling the roots did not generate a second Ca2+ signal in these leaves. This desensitisation was not due to down-regulation in the leaf since this was able to generate a Ca2+ signal of its own when cooled directly. Thus a combination of a recombinant bioluminescent indicator with photon counting imaging reveals startling new aspects of signalling in intact organs and whole organisms.

  2. An Analysis of the Connectedness to Nature Scale Based on Item Response Theory.

    Science.gov (United States)

    Pasca, Laura; Aragonés, Juan I; Coello, María T

    2017-01-01

    The Connectedness to Nature Scale (CNS) is used as a measure of the subjective cognitive connection between individuals and nature. However, to date, it has not been analyzed at the item level to confirm its quality. In the present study, we conduct such an analysis based on Item Response Theory. We employed data from previous studies using the Spanish-language version of the CNS, analyzing a sample of 1008 participants. The results show that seven items presented appropriate indices of discrimination and difficulty, in addition to a good fit. The remaining six have inadequate discrimination indices and do not present a good fit. A second study with 321 participants shows that the seven-item scale has adequate levels of reliability and validity. Therefore, it would be appropriate to use a reduced version of the scale after eliminating the items that display inappropriate behavior, since they may interfere with research results on connectedness to nature.

  3. Using the Item Response Theory (IRT) for Educational Evaluation through Games

    Science.gov (United States)

    Euzébio Batista, Marcelo Henrique; Victória Barbosa, Jorge Luis; da Rosa Tavares, João Elison; Hackenhaar, Jonathan Luis

    2013-01-01

    This article shows the application of Item Response Theory (IRT) for educational evaluation using games. The article proposes a computational model to create user profiles, called Psychometric Profile Generator (PPG). PPG uses the IRT mathematical model for exploring the levels of skills and behaviors in the form of items and/or stimuli. The model…

  4. Dispersal in the sub-Antarctic: king penguins show remarkably little population genetic differentiation across their range.

    Science.gov (United States)

    Clucas, Gemma V; Younger, Jane L; Kao, Damian; Rogers, Alex D; Handley, Jonathan; Miller, Gary D; Jouventin, Pierre; Nolan, Paul; Gharbi, Karim; Miller, Karen J; Hart, Tom

    2016-10-13

    Seabirds are important components of marine ecosystems, both as predators and as indicators of ecological change, being conspicuous and sensitive to changes in prey abundance. To determine whether fluctuations in population sizes are localised or indicative of large-scale ecosystem change, we must first understand population structure and dispersal. King penguins are long-lived seabirds that occupy a niche across the sub-Antarctic zone close to the Polar Front. Colonies have very different histories of exploitation, population recovery, and expansion. We investigated the genetic population structure and patterns of colonisation of king penguins across their current range using a dataset of 5154 unlinked, high-coverage single nucleotide polymorphisms generated via restriction site associated DNA sequencing (RADSeq). Despite breeding at a small number of discrete, geographically separate sites, we find only very slight genetic differentiation among colonies separated by thousands of kilometers of open-ocean, suggesting migration among islands and archipelagos may be common. Our results show that the South Georgia population is slightly differentiated from all other colonies and suggest that the recently founded Falkland Island colony is likely to have been established by migrants from the distant Crozet Islands rather than nearby colonies on South Georgia, possibly as a result of density-dependent processes. The observed subtle differentiation among king penguin colonies must be considered in future conservation planning and monitoring of the species, and demographic models that attempt to forecast extinction risk in response to large-scale climate change must take into account migration. It is possible that migration could buffer king penguins against some of the impacts of climate change where colonies appear panmictic, although it is unlikely to protect them completely given the widespread physical changes projected for their Southern Ocean foraging grounds

  5. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    Science.gov (United States)

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  6. The effect of differential motivation on IRT linking

    NARCIS (Netherlands)

    Mittelhaëuser, M.A.; Béguin, A.A.; Sijtsma, K.

    2015-01-01

    The purpose of this study was to investigate whether simulated differential motivation between the stakes for operational tests and anchor items produces an invalid linking result if the Rasch model is used to link the operational tests. This was done for an external anchor design and a variation of

  7. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    Science.gov (United States)

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  8. Induced pluripotent stem cells with NOTCH1 gene mutation show impaired differentiation into smooth muscle and endothelial cells: Implications for bicuspid aortic valve-related aortopathy.

    Science.gov (United States)

    Jiao, Jiao; Tian, Weihua; Qiu, Ping; Norton, Elizabeth L; Wang, Michael M; Chen, Y Eugene; Yang, Bo

    2018-03-12

    The NOTCH1 gene mutation has been identified in bicuspid aortic valve patients. We developed an in vitro model with human induced pluripotent stem cells (iPSCs) to evaluate the role of NOTCH1 in smooth muscle and endothelial cell (EC) differentiation. The iPSCs were derived from a patient with a normal tricuspid aortic valve and aorta. The NOTCH1 gene was targeted in iPSCs with the Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated protein 9 nuclease (Cas9) system. The NOTCH1 -/- (NOTCH1 homozygous knockout) and isogenic control iPSCs (wild type) were differentiated into neural crest stem cells (NCSCs) and into cardiovascular progenitor cells (CVPCs). The NCSCs were differentiated into smooth muscle cells (SMCs). The CVPCs were differentiated into ECs. The differentiations of SMCs and ECs were compared between NOTCH1 -/- and wild type cells. The expression of NCSC markers (SRY-related HMG-box 10 and transcription factor AP-2 alpha) was significantly lower in NOTCH1 -/- NCSCs than in wild type NCSCs. The SMCs derived from NOTCH1 -/- NCSCs showed immature morphology with smaller size and decreased expression of all SMC-specific contractile proteins. In NOTCH1 -/- CVPCs, the expression of ISL1, NKX2.5, and MYOCD was significantly lower than that in isogenic control CVPCs, indicating impaired differentiation from iPSCs to CVPCs. The NOTCH1 -/- ECs derived from CVPCs showed significantly lower expression of cluster of differentiation 105 and cluster of differentiation 31 mRNA and protein, indicating a defective differentiation process. NOTCH1 is critical in SMC and EC differentiation of iPSCs through NCSCs and CVPCs, respectively. NOTCH1 gene mutations might potentially contribute to the development of thoracic aortic aneurysms by affecting SMC differentiation in some patients with bicuspid aortic valve. Copyright © 2018 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.

  9. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for a handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere, based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.

  10. Item Modeling Concept Based on Multimedia Authoring

    Directory of Open Access Journals (Sweden)

    Janez Stergar

    2008-09-01

    In this paper a modern item design framework for computer-based assessment based on the Flash authoring environment will be introduced. Question design will be discussed, and the multimedia authoring environment used for item modeling will be emphasized. Item type templates are a structured means of collecting and storing item information that can be used to improve the efficiency and security of the innovative item design process. Templates can modernize the item design, and enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The introduced item design template is based on a taxonomy of innovative items which have great potential for expanding the content areas and construct coverage of an assessment. The presented item design approach is based on two GUIs: one for question design based on implemented item design templates and one for user interaction tracking/retrieval. The concept of user interfaces based on Flash technology will be discussed as well as implementation of the innovative approach of the item design forms with multimedia authoring. An innovative method for user interaction storage/retrieval based on PHP extending Flash capabilities in the proposed framework will also be introduced.

  11. The medial temporal lobes distinguish between within-item and item-context relations during autobiographical memory retrieval.

    Science.gov (United States)

    Sheldon, Signy; Levine, Brian

    2015-12-01

    During autobiographical memory retrieval, the medial temporal lobes (MTL) relate together multiple event elements, including object (within-item relations) and context (item-context relations) information, to create a cohesive memory. There is consistent support for a functional specialization within the MTL according to these relational processes, much of which comes from recognition memory experiments. In this study, we compared brain activation patterns associated with retrieving within-item relations (i.e., associating conceptual and sensory-perceptual object features) and item-context relations (i.e., spatial relations among objects) with respect to naturalistic autobiographical retrieval. We developed a novel paradigm that cued participants to retrieve information about past autobiographical events, non-episodic within-item relations, and non-episodic item-context relations with the perceptuomotor aspects of retrieval equated across these conditions. We used multivariate analysis techniques to extract common and distinct patterns of activity among these conditions within the MTL and across the whole brain, both in terms of spatial and temporal patterns of activity. The anterior MTL (perirhinal cortex and anterior hippocampus) was preferentially recruited for generating within-item relations later in retrieval whereas the posterior MTL (posterior parahippocampal cortex and posterior hippocampus) was preferentially recruited for generating item-context relations across the retrieval phase. These findings provide novel evidence for functional specialization within the MTL with respect to naturalistic memory retrieval. © 2015 Wiley Periodicals, Inc.

  12. A strategy for optimizing item-pool management

    NARCIS (Netherlands)

    Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

    2006-01-01

    Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item

  13. [Instrument to measure adherence in hypertensive patients: contribution of Item Response Theory].

    Science.gov (United States)

    Rodrigues, Malvina Thaís Pacheco; Moreira, Thereza Maria Magalhaes; Vasconcelos, Alexandre Meira de; Andrade, Dalton Francisco de; Silva, Daniele Braz da; Barbetta, Pedro Alberto

    2013-06-01

    To analyze, by means of "Item Response Theory", an instrument to measure adherence to treatment for hypertension. Analytical study with 406 hypertensive patients with associated complications seen in primary care in Fortaleza, CE, Northeastern Brazil, 2011 using "Item Response Theory". The stages were: dimensionality test, calibrating the items, processing data and creating a scale, analyzed using the gradual response model. A study of the dimensionality of the instrument was conducted by analyzing the polychoric correlation matrix and factor analysis of complete information. Multilog software was used to calibrate items and estimate the scores. Items relating to drug therapy are the most directly related to adherence while those relating to drug-free therapy need to be reworked because they have less psychometric information and low discrimination. The independence of items, the small number of levels in the scale and low explained variance in the adjustment of the models show the main weaknesses of the instrument analyzed. The "Item Response Theory" proved to be a relevant analysis technique because it evaluated respondents for adherence to treatment for hypertension, the level of difficulty of the items and their ability to discriminate between individuals with different levels of adherence, which generates a greater amount of information. The instrument analyzed is limited in measuring adherence to hypertension treatment, by analyzing the "Item Response Theory" of the item, and needs adjustment. The proper formulation of the items is important in order to accurately measure the desired latent trait.

  14. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    The Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (Van der Linden & Hambleton, 1997). As stated before, the Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
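
    The logistic models with one, two and three parameters mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard textbook form of the three-parameter logistic model; the function name `irt_probability` and the parameter names `theta`, `a`, `b`, `c` are illustrative assumptions and are not taken from the paper.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: respondent ability (latent trait)
    b:     item difficulty
    a:     item discrimination (fixed at 1.0 for the one-parameter model)
    c:     pseudo-guessing lower asymptote (0.0 for the 1PL and 2PL models)

    All names are hypothetical; this is the usual textbook 3PL formula.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# One-parameter model: ability equal to difficulty gives probability 0.5.
p_1pl = irt_probability(theta=0.0, b=0.0)

# Three-parameter model: guessing raises the lower asymptote.
p_3pl = irt_probability(theta=0.0, b=0.0, a=2.0, c=0.2)
```

    Setting `a=1.0` and `c=0.0` recovers the one-parameter model, and leaving only `c=0.0` fixed gives the two-parameter model, so a single function covers all three cases.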

  15. Multi-Robot Item Delivery and Foraging: Two Sides of a Coin

    Directory of Open Access Journals (Sweden)

    Somchaya Liemhetcharat

    2015-09-01

    Multi-robot foraging has been widely studied in the literature, and the general assumption is that the robots are simple, i.e., with limited processing and carrying capacity. We previously studied continuous foraging with slightly more capable robots, and in this article, we are interested in using similar robots for item delivery. Interestingly, item delivery and foraging are two sides of the same coin: foraging an item from a location is similar to satisfying a demand. We formally define the multi-robot item delivery problem and show that the continuous foraging problem is a special case of it. We contribute distributed multi-robot algorithms that solve the item delivery and foraging problems and describe how our shared world model is synchronized across the multi-robot team. We performed extensive experiments on simulated robots using a Java simulator, and we present our results to demonstrate that we outperform benchmark algorithms from multi-robot foraging.

  16. Redefining diagnostic symptoms of depression using Rasch analysis: testing an item bank suitable for DSM-V and computer adaptive testing.

    Science.gov (United States)

    Mitchell, Alex J; Smith, Adam B; Al-salihy, Zerak; Rahim, Twana A; Mahmud, Mahmud Q; Muhyaldin, Asma S

    2011-10-01

    We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V. Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects) living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank, and differential item functioning (DIF) was explored for known subgroups. A symptom-level Rasch analysis reduced the original 45 items to 24 after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items, accuracy was still high (AUC = 0.976; sensitivity 93%, specificity 96%). An item bank of 17 items may be useful in computer adaptive testing and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.
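
    The sensitivity and specificity figures reported in this abstract follow the usual definitions for a binary screening decision against a reference diagnosis. The sketch below shows only that arithmetic; the function name and the toy labels are made up for illustration and are not the study's data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels.

    truth:     reference diagnosis per subject (1 = depressed, 0 = not)
    predicted: screening result per subject   (1 = positive,  0 = negative)
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, invented for this example only.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
```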

  17. Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults

    Directory of Open Access Journals (Sweden)

    García-Campayo Javier

    2011-08-01

    Background: The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Findings: Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. Conclusions: The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.
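
    The internal-consistency statistic named in this abstract, Cronbach's α, can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `cronbach_alpha` and the toy score matrix are assumptions made up for the example, and the study's own data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each holding k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*scores)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents answering 3 items.
toy_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]
print(round(cronbach_alpha(toy_scores), 3))  # 0.944
```

    Alpha rises when the items covary strongly relative to their individual variances, which is why it is read as a measure of internal consistency.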

  18. Gender Differences in Scientific Literacy of HKPISA 2006: A Multidimensional Differential Item Functioning and Multilevel Mediation Study

    Science.gov (United States)

    Wong, Kwan Yin

    The aim of this study is to investigate the effect of gender differences of 15-year-old students on scientific literacy and their impacts on students’ motivation to pursue science education and careers (Future-oriented Science Motivation) in Hong Kong. The data for this study were collected from the Program for International Student Assessment in Hong Kong (HKPISA), which was carried out in 2006. A total of 4,645 students were randomly selected from 146 secondary schools, including government, aided and private schools, by a two-stage stratified sampling method for the assessment. HKPISA 2006, like most other large-scale international assessments, presents its assessment frameworks in multidimensional subscales. To fulfill the requirements of this multidimensional assessment framework, this study deployed new approaches to model and investigate gender differences in cognitive and affective latent traits of scientific literacy by using multidimensional differential item functioning (MDIF) and multilevel mediation (MLM). Compared with a mean score difference t-test, MDIF improves the precision of each subscale’s measure at item level, and the gender differences in science performance can be accurately estimated. In the light of Eccles et al.’s (1983) Expectancy-value Model of Achievement-related Choices (Eccles’ Model), MLM examines the pattern of gender effects on Future-oriented Science Motivation mediated through cognitive and affective factors. As for the MLM investigation, Single-Group Confirmatory Factor Analysis (Single-Group CFA) was used to confirm the applicability and validity of six affective factors which were originally prepared by the OECD. These six factors are Science Self-concept, Personal Value of Science, Interest in Science Learning, Enjoyment of Science Learning, Instrumental Motivation to Learn Science and Future-oriented Science Motivation. Then, Multiple Group CFA was used to verify measurement invariance of these factors across gender groups. The results of

  19. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index.

    Science.gov (United States)

    Roelen, Corné A M; van Rhenen, Willem; Groothoff, Johan W; van der Klink, Jac J L; Twisk, Jos W R; Heymans, Martijn W

    2014-07-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. This prospective cohort study comprised 11 537 male construction workers, who completed the WAI at baseline and reported DP after a mean 2.3 years of follow-up. WAS and WAI were calibrated for DP risk predictions with the Hosmer-Lemeshow (H-L) test and their ability to discriminate between high- and low-risk construction workers was investigated with the area under the receiver operating characteristic curve (AUC). At follow-up, 336 (3%) construction workers reported DP. Both WAS [odds ratio (OR) 0.72, 95% confidence interval (95% CI) 0.66-0.78] and WAI (OR 0.57, 95% CI 0.52-0.63) scores were associated with DP at follow-up. The WAS showed miscalibration (H-L model χ²=10.60; df=3; P=0.01) and poorly discriminated between high- and low-risk construction workers (AUC 0.67, 95% CI 0.64-0.70). In contrast, calibration (H-L model χ²=8.20; df=8; P=0.41) and discrimination (AUC 0.78, 95% CI 0.75-0.80) were both adequate for the WAI. Although associated with the risk of future DP, the single-item WAS poorly identified male construction workers at risk of DP. We recommend using the multi-item WAI to screen for risk of DP in occupational health practice.
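
    The two Hosmer-Lemeshow P values quoted in this abstract follow from the reported chi-square statistics and degrees of freedom. The sketch below recomputes them with the closed-form upper-tail probability for integer degrees of freedom, using only the Python standard library. The helper name `chi2_sf` is an assumption for this example; in practice a statistics package would normally supply this function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable.

    Closed form for integer df >= 1 (finite series; no numerical integration).
    """
    if x < 0 or df < 1 or int(df) != df:
        raise ValueError("x must be >= 0 and df a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum x^(k+1/2) / (1*3*...*(2k+1))
    term, total = math.sqrt(x), 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 / math.pi) * math.exp(-half) * total


# Statistics as reported in the abstract.
print(round(chi2_sf(10.60, 3), 2))  # WAS: 0.01
print(round(chi2_sf(8.20, 8), 2))   # WAI: 0.41
```

    A small P value in the Hosmer-Lemeshow test signals that predicted and observed risks disagree, which is why the WAS result is read as miscalibration and the WAI result as adequate calibration.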

  20. Teaching children with autism spectrum disorders to mand for the removal of stimuli that prevent access to preferred items.

    Science.gov (United States)

    Shillingsburg, M Alice; Powell, Nicole M; Bowen, Crystal N

    2013-01-01

    Mand training is often a primary focus in early language instruction and typically includes mands that are positively reinforced. However, mands maintained by negative reinforcement are also important skills to teach. These include mands to escape aversive demands or unwanted items. Another type of negatively reinforced mand important to teach involves the removal of a stimulus that prevents access to a preferred activity. We taught 5 participants diagnosed with autism spectrum disorders to mand for the removal of a stimulus in order to access a preferred item that had been blocked. An evaluation was conducted to determine if participants responded differentially when the establishing operations for the preferred item were present versus absent. All participants learned to mand for the removal of the stimulus exclusively under conditions when the establishing operation was present.

  1. Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

    Science.gov (United States)

    Arce-Ferrer, Alvaro J.; Bulut, Okan

    2017-01-01

    This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…

  2. Development of an item bank for computerized adaptive test (CAT) measurement of pain

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Aaronson, Neil K; Chie, Wei-Chu

    2016-01-01

    PURPOSE: Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured...... were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements with 15-25 % compared to using the QLQ-C30 pain scale....... CONCLUSIONS: We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ...

  3. Service differentiation in spare parts supply through dedicated stocks

    NARCIS (Netherlands)

    Alvarez, Elisa; van der Heijden, Matthijs C.; Zijm, Willem H.M.

    2012-01-01

    We investigate keeping dedicated stocks at customer sites in addition to stock kept at some central location as a tool for applying service differentiation in spare parts supply. We study the resulting two-echelon system in a multi-item setting, both under backordering and under emergency shipments

  4. Time-limited effects of emotional arousal on item and source memory.

    Science.gov (United States)

    Wang, Bo; Sun, Bukuan

    2015-01-01

    Two experiments investigated the time-limited effects of emotional arousal on consolidation of item and source memory. In Experiment 1, participants memorized words (items) and the corresponding speakers (sources) and then took an immediate free recall test. Then they watched a neutral, positive, or negative video 5, 35, or 50 min after learning, and 24 hours later they took surprise memory tests. Experiment 2 was similar to Experiment 1 except that (a) a reality monitoring task was used; (b) elicitation delays of 5, 30, and 45 min were used; and (c) delayed memory tests were given 60 min after learning. Both experiments showed that, regardless of elicitation delay, emotional arousal did not enhance item recall memory. Second, both experiments showed that negative arousal enhanced delayed item recognition memory only at the medium elicitation delay, but not in the shorter or longer delays. Positive arousal enhanced performance only in Experiment 1. Third, regardless of elicitation delay, emotional arousal had little effect on source memory. These findings have implications for theories of emotion and memory, suggesting that emotion effects are contingent upon the nature of the memory task and elicitation delay.

  5. Murdock free recall data: The initial recall search identifies the context by the location of the least remembered item and produces only better remembered items in proportion to the total recall difference.

    OpenAIRE

    Tarnow, Dr. Eugen

    2009-01-01

    The curious free recall data of Murdock (1962) shows an additional surprise that seems to have gone undetected until now: the probability of guessing an item in the initial recall is not identical to the overall free recall curve. Initial recall of an item is well correlated with the total recall of that item using a straight line but with an unexpected offset. The offset varies with the presentation rate and the total number of list items but in each case it is the same as the total recall ...

  6. 76 FR 60474 - Commercial Item Handbook

    Science.gov (United States)

    2011-09-29

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System Commercial Item Handbook AGENCY.... SUMMARY: DoD has updated its Commercial Item Handbook. The purpose of the Handbook is to help acquisition personnel develop sound business strategies for procuring commercial items. DoD is seeking industry input on...

  7. Item response modeling: a psychometric assessment of the children's fruit, vegetable, water, and physical activity self-efficacy scales among Chinese children.

    Science.gov (United States)

    Wang, Jing-Jing; Chen, Tzu-An; Baranowski, Tom; Lau, Patrick W C

    2017-09-16

    This study aimed to evaluate the psychometric properties of four self-efficacy scales (i.e., self-efficacy for fruit (FSE), vegetable (VSE), and water (WSE) intakes, and physical activity (PASE)) and to investigate their differences in item functioning across sex, age, and body weight status groups using item response modeling (IRM) and differential item functioning (DIF). Four self-efficacy scales were administered to 763 Hong Kong Chinese children (55.2% boys) aged 8-13 years. Classical test theory (CTT) was used to examine the reliability and factorial validity of scales. IRM was conducted and DIF analyses were performed to assess the characteristics of item parameter estimates on the basis of children's sex, age and body weight status. All self-efficacy scales demonstrated adequate to excellent internal consistency reliability (Cronbach's α: 0.79-0.91). One FSE misfit item and one PASE misfit item were detected. Small DIF was found for all the scale items across children's age groups. Items with medium to large DIF were detected in different sex and body weight status groups, which will require modification. A Wright map revealed that items covered the range of the distribution of participants' self-efficacy for each scale except VSE. Several self-efficacy scales' items functioned differently by children's sex and body weight status. Additional research is required to modify the four self-efficacy scales to minimize these moderating influences for application.

  8. Gender and Socioeconomic Status DIF on The WISC-IV Turkish Form Items: A Comparison of DIF Detection Techniques

    Directory of Open Access Journals (Sweden)

    Elif Bengi ÜNSAL ÖZBERK

    2017-03-01

    The purpose of this study is to investigate potential gender and socio-economic status bias in the Wechsler Intelligence Scale for Children: Fourth Edition (WISC-4) by using several differential item functioning detection techniques. In this study, WISC-4 Turkish standardization test pilot data including 817 children were used. In accordance with the purpose of the study, 315 items were used both in polytomously scored subtests such as Block Design, Similarities, Digit Span, Vocabulary, Letter-Number Sequencing, Comprehension, and dichotomously scored subtests such as Picture Concepts, Matrix Reasoning, Picture Completion, Information, Arithmetic, and Word Reasoning. While Rasch Model, Mantel-Haenszel, and SIBTEST DIF detection techniques were used for dichotomously scored items, Partial Credit Model, Mantel, and Poly-SIBTEST techniques were used for polytomously scored items. In terms of DIF techniques, Mantel-Haenszel, SIBTEST and Mantel Test, Poly-SIBTEST analyses provided similar results when DIF based on gender was investigated. In addition, Mantel-Haenszel, Rasch estimations and Partial Credit Model, Mantel Test results were similar while investigating DIF according to socioeconomic status.

  9. Spare Items validation

    International Nuclear Information System (INIS)

    Fernandez Carratala, L.

    1998-01-01

    There is an increasing difficulty in purchasing safety related spare items with certifications by manufacturers for maintaining the original qualifications of the equipment of destination. The main reasons are, on top of the logical evolution of technology applied to the newly manufactured components, the quitting of nuclear specific production lines and the evolution of manufacturers' quality systems, originally based on nuclear codes and standards, to conventional industry standards. To face this problem, for many years different Dedication processes have been implemented to verify whether a commercial grade element is acceptable to be used in safety related applications. In the same way, due to our particular position regarding the spare part supplies, mainly from markets other than the American, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are to be performed on the spare item itself, its design control, its fabrication and its supply for allowing its use in destinations with specific requirements. The scope of application is not only focussed on safety related items, but also on complex design, high cost or plant reliability related components. The implementation in C.N. Trillo has been mainly carried out by merging, modifying and making the most of processes and activities which were already being performed in the company. (Author)

  10. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

    Science.gov (United States)

    Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

    2016-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity

  11. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

    2017-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods: DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results: The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and

  12. Item Effects in Recognition Memory for Words

    Science.gov (United States)

    Freeman, Emily; Heathcote, Andrew; Chalmers, Kerry; Hockley, William

    2010-01-01

    We investigate the effects of word characteristics on episodic recognition memory using analyses that avoid Clark's (1973) "language-as-a-fixed-effect" fallacy. Our results demonstrate the importance of modeling word variability and show that episodic memory for words is strongly affected by item noise (Criss & Shiffrin, 2004), as measured by the…

  13. Measurement equivalence of the KINDL questionnaire across child self-reports and parent proxy-reports: a comparison between item response theory and ordinal logistic regression.

    Science.gov (United States)

    Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara

    2014-06-01

    Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50 %. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
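
    The 50 % agreement rate in this abstract can be reproduced from the reported counts: 7 items flagged by both methods and 5 flagged by neither, out of 24. The sketch below rebuilds the 2x2 table and also computes Cohen's kappa as a chance-corrected companion figure. Kappa is not reported in the abstract and is added here only as an illustration; the function name and argument names are assumptions.

```python
def agreement_and_kappa(both, only_a, only_b, neither):
    """Raw agreement and Cohen's kappa for two binary flagging methods."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n
    a_yes = (both + only_a) / n   # share of items flagged by method A
    b_yes = (both + only_b) / n   # share of items flagged by method B
    # Agreement expected by chance if the two methods flagged independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1 - expected)


# Counts derived from the abstract: GRM flagged 12, OLR flagged 14,
# 7 flagged by both, 5 flagged by neither (24 items in total).
grm_only = 12 - 7
olr_only = 14 - 7
observed, kappa = agreement_and_kappa(7, grm_only, olr_only, 5)
print(observed, kappa)
```

    With these counts the raw agreement is 12/24, matching the reported 50 %, while the chance-expected agreement is also one half, so kappa works out to zero under this reconstruction.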

  14. Developing and testing items for the South African Personality Inventory (SAPI)

    Directory of Open Access Journals (Sweden)

    Carin Hill

    2013-11-01

    Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for the study: The study intended to develop an indigenous and psychometrically sound personality instrument that adheres to the requirements of South African legislation and excludes cultural bias. Research design, approach and method: The authors used a cross-sectional design. They measured the nine SAPI clusters identified in the qualitative stage of the SAPI project in 11 separate quantitative studies. Convenience sampling yielded 6735 participants. Statistical analysis focused on the construct validity and reliability of items. The authors eliminated items that showed poor performance, based on common psychometric criteria, and selected the best performing items to form part of the final version of the SAPI. Main findings: The authors developed 2573 items from the nine SAPI clusters. Of these, 2268 items were valid and reliable representations of the SAPI facets. Practical/managerial implications: The authors developed a large item pool that measures personality in South Africa. Researchers can refine it for the SAPI. Furthermore, the project illustrates an approach that researchers can use in projects that aim to develop culturally-informed psychological measures. Contribution/value-add: Personality assessment is important for recruiting, selecting and developing employees. This study contributes to the current knowledge about the early processes researchers follow when they develop a personality instrument that measures personality fairly in different cultural groups, as the SAPI does.

  15. The Effects of Goal Relevance and Perceptual Features on Emotional Items and Associative Memory.

    Science.gov (United States)

    Mao, Wei B; An, Shu; Yang, Xiao F

    2017-01-01

    Showing an emotional item in a neutral background scene often leads to enhanced memory for the emotional item and impaired associative memory for background details. Meanwhile, both top-down goal relevance and bottom-up perceptual features play important roles in memory binding. We conducted two experiments to further examine the effects of goal relevance and perceptual features on emotional items and associative memory. By manipulating goal relevance (asking participants to categorize only each item image as living or non-living, or to categorize each whole composite picture consisting of item image and background scene as natural scene or manufactured scene) and perceptual features (controlling visual contrast and visual familiarity) in two experiments, we found that both high goal relevance and salient perceptual features (high salience of items vs. high familiarity of items) could promote emotional item memory, but they had different effects on associative memory for emotional items and neutral backgrounds. Specifically, high goal relevance and high perceptual salience of items could jointly impair the associative memory for emotional items and neutral backgrounds, while the effect of item familiarity on associative memory for emotional items would be modulated by goal relevance. High familiarity of items could increase associative memory for negative items and neutral backgrounds only in the low goal relevance condition. These findings suggest the effect of emotion on associative memory is not only related to attentional capture elicited by emotion, but can also be affected by goal relevance and perceptual features of the stimulus.

  16. Internal consistency of a five-item form of the Francis Scale of Attitude Toward Christianity among adolescent students.

    Science.gov (United States)

    Campo-Arias, Adalberto; Oviedo, Heidi Celina; Cogollo, Zuleima

    2009-04-01

    The short form of the Francis Scale of Attitude Toward Christianity (L. J. Francis, 1992) is a 7-item Likert-type scale that shows high homogeneity among adolescents. The psychometric performance of a shorter version of this scale has not been explored. The authors aimed to determine the internal consistency of a 5-item form of the Francis Scale of Attitude Toward Christianity among 405 students from a school in Cartagena, Colombia. The authors computed the Cronbach's alpha coefficient for the 5 items with the highest corrected item-total score correlations. The version without Items 2 and 7 showed internal consistency of .87. The 5-item version of the Francis Scale of Attitude Toward Christianity exhibited higher internal consistency than did the 7-item version. Future researchers should corroborate this finding.

  17. Item Analysis in Introductory Economics Testing.

    Science.gov (United States)

    Tinari, Frank D.

    1979-01-01

    Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)

  18. Perception that "everything requires a lot of effort": transcultural SCL-25 item validation.

    Science.gov (United States)

    Moreau, Nicolas; Hassan, Ghayda; Rousseau, Cécile; Chenguiti, Khalid

    2009-09-01

    This brief report illustrates how the migration context can affect specific item validity of mental health measures. The SCL-25 was administered to 432 recently settled immigrants (220 Haitian and 212 Arabs). We performed descriptive analyses, as well as Infit and Outfit statistics analyses using WINSTEPS Rasch Measurement Software based on Item Response Theory. The participants' comments about the item You feel everything requires a lot of effort in the SCL-25 were also qualitatively analyzed. Results revealed that the item You feel everything requires a lot of effort is an outlier and does not adjust in an expected and valid fashion with its cluster items, as it is over-endorsed by Haitian and Arab healthy participants. Our study thus shows that, in transcultural mental health research, the cultural and migratory contexts may interact and significantly influence the meaning of some symptom items and consequently, the validity of symptom scales.

  19. A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

    Science.gov (United States)

    Khawaja, Nigar G.; Yu, Lai Ngo Heidi

    2010-01-01

    The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…

  20. More is not Always Better: The Relation between Item Response and Item Response Time in Raven’s Matrices

    Directory of Open Access Journals (Sweden)

    Frank Goldhammer

    2015-03-01

    The role of response time in completing an item can have very different interpretations. Responding more slowly could be positively related to success as the item is answered more carefully. However, the association may be negative if working faster indicates higher ability. The objective of this study was to clarify the validity of each assumption for reasoning items considering the mode of processing. A total of 230 persons completed a computerized version of Raven’s Advanced Progressive Matrices test. Results revealed that response time overall had a negative effect. However, this effect was moderated by items and persons. For easy items and able persons the effect was strongly negative, for difficult items and less able persons it was less negative or even positive. The number of rules involved in a matrix problem proved to explain item difficulty significantly. Most importantly, a positive interaction effect between the number of rules and item response time indicated that the response time effect became less negative with an increasing number of rules. Moreover, exploratory analyses suggested that the error type influenced the response time effect.

  1. Item response theory analysis applied to the Spanish version of the Personal Outcomes Scale.

    Science.gov (United States)

    Guàrdia-Olmos, J; Carbó-Carreté, M; Peró-Cebollero, M; Giné, C

    2017-11-01

    The study of measurements of quality of life (QoL) is one of the great challenges of modern psychology and psychometric approaches. This issue has greater importance when examining QoL in populations that were historically treated on the basis of their deficiency, and recently, the focus has shifted to what each person values and desires in their life, as in cases of people with intellectual disability (ID). Many studies of QoL scales applied in this area have attempted to improve the validity and reliability of their components by incorporating various sources of information to achieve consistency in the data obtained. The adaptation of the Personal Outcomes Scale (POS) in Spanish has shown excellent psychometric attributes, and its administration has three sources of information: self-assessment, practitioner and family. The study of possible congruence or incongruence of observed distributions of each item between sources is therefore essential to ensure a correct interpretation of the measure. The aim of this paper was to analyse the observed distribution of items and dimensions from the three Spanish POS information sources cited earlier, using the item response theory. We studied a sample of 529 people with ID and their respective practitioners and family member, and in each case, we analysed items and factors using Samejima's model of polytomic ordinal scales. The results indicated an important number of items with differential effects regarding sources, and in some cases, they indicated significant differences in the distribution of items, factors and sources of information. As a result of this analysis, we must affirm that the administration of the POS, considering three sources of information, was adequate overall, but a correct interpretation of the results requires that it obtain much more information to consider, as well as some specific items in specific dimensions. The overall ratings, if these comments are considered, could result in bias. © 2017

  2. Poisson and negative binomial item count techniques for surveys with sensitive question.

    Science.gov (United States)

    Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin

    2017-04-01

    Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.
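
    As a rough illustration of the Poisson design described in this record, the sketch below assumes a control group that reports a single Poisson count and a treatment group that reports the same kind of count plus 1 if the respondent has the sensitive characteristic. The function name `poisson_ict_estimate`, the simulation settings, and the simple difference-in-means estimator are illustrative assumptions, not the authors' estimator; in particular, this moment estimate is not constrained to [0, 1].

```python
import numpy as np

def poisson_ict_estimate(control_counts, treatment_counts):
    """Difference-in-means sketch for a Poisson item count design.

    Assumed design (not taken from the paper): control reports X ~ Poisson,
    treatment reports X + Z, where Z = 1 for carriers of the sensitive trait.
    Returns the estimated sensitive proportion and its standard error.
    """
    control = np.asarray(control_counts, dtype=float)
    treatment = np.asarray(treatment_counts, dtype=float)
    pi_hat = treatment.mean() - control.mean()
    # Standard error of a difference between two independent sample means.
    se = np.sqrt(treatment.var(ddof=1) / treatment.size
                 + control.var(ddof=1) / control.size)
    return pi_hat, se

# Simulated survey with hypothetical settings (rate 2.0, true proportion 0.3).
rng = np.random.default_rng(0)
n = 5000
control = rng.poisson(2.0, n)
treatment = rng.poisson(2.0, n) + rng.binomial(1, 0.3, n)
pi_hat, se = poisson_ict_estimate(control, treatment)
print(f"estimated proportion: {pi_hat:.3f} (SE {se:.3f})")
```

    Because each respondent reports only a total count, an individual answer does not reveal the sensitive indicator, which is the privacy argument the abstract makes.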

  3. Negative effects of item repetition on source memory.

    Science.gov (United States)

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L; Johnson, Marcia K

    2012-08-01

    In the present study, we explored how item repetition affects source memory for new item-feature associations (picture-location or picture-color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item repetition also had a negative effect on source memory when different source dimensions were used in Phases 1 and 2 (Experiment 3) and when participants were explicitly instructed to learn source information in Phase 2 (Experiments 4 and 5). Importantly, when the order between Phases 1 and 2 was reversed, such that item repetition occurred after the encoding of critical item-source combinations, item repetition no longer affected source memory (Experiment 6). Overall, our findings did not support predictions based on item predifferentiation, within-dimension source interference, or general interference from multiple traces of an item. Rather, the findings were consistent with the idea that prior item repetition reduces attention to subsequent presentations of the item, decreasing the likelihood that critical item-source associations will be encoded.

  4. Effects of language dominance on item and order memory in free recall, serial recall and order reconstruction.

    Science.gov (United States)

    Francis, Wendy S; Baca, Yuzeth

    2014-01-01

    Spanish-English bilinguals (N = 144) performed free recall, serial recall and order reconstruction tasks in both English and Spanish. Long-term memory for both item and order information was worse in the less fluent language (L2) than in the more fluent language (L1). Item scores exhibited a stronger disadvantage for the L2 in serial recall than in free recall. Relative order scores were lower in the L2 for all three tasks, but adjusted scores for free and serial recall were equivalent across languages. Performance of English-speaking monolinguals (N = 72) was comparable to bilingual performance in the L1, except that monolinguals had higher adjusted order scores in free recall. Bilingual performance patterns in the L2 were consistent with the established effects of concurrent task performance on these memory tests, suggesting that the cognitive resources required for processing words in the L2 encroach on resources needed to commit item and order information to memory. These findings are also consistent with a model in which item memory is connected to the language system, order information is processed by separate mechanisms and attention can be allocated differentially to these two systems.

  5. Using Reversed MFCC and IT-EM for Automatic Speaker Verification

    Directory of Open Access Journals (Sweden)

    Sheeraz Memon

    2012-01-01

    This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Coefficients) and IT-EM (Information Theoretic Expectation Maximization). To perform speaker verification, feature extraction using the Mel scale has been widely applied and has established better results. The IMFCC is based on the inverse Mel scale. The IMFCC effectively captures information available at the high-frequency formants, which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at input level is proposed. GMMs (Gaussian Mixture Models) based on EM (Expectation Maximization) have been widely used for classification in text-independent verification. However, EM suffers from convergence issues. In this paper we use our proposed IT-EM, which has faster convergence, to train speaker models. IT-EM uses information theory principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the weights, means and covariances, like EM. However, the IT-EM process is not performed on feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM process at once reduces the divergence measure between PDE estimates of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and IT-EM are tested for the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC has better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector size is also much smaller than the delta MFCC, thus reducing the computational burden as well. The IT-EM method also showed faster convergence than the conventional EM method, and thus it leads to higher speaker recognition scores.

  6. Validation of the Italian version of the 15-item Myasthenia Gravis Quality-of-Life questionnaire.

    Science.gov (United States)

    Raggi, Alberto; Leonardi, Matilde; Ayadi, Roberta; Antozzi, Carlo; Maggi, Lorenzo; Baggi, Fulvio; Mantegazza, Renato

    2017-10-01

    In this study we assess the Italian version of the 15-item Myasthenia Gravis Quality-of-Life questionnaire (MG-QOL15). The validation protocol included the MG-QOL15, the 36-item Short Form (SF-36), the Besta Neurological Institute Rating Scale for Myasthenia Gravis, and the MG-Composite. We used the Cronbach α to test reliability, the Spearman correlation to test short-term test-retest, the Kruskal-Wallis test to assess differences in MG-QOL15 between patients with different disease severity, and the Wilcoxon signed-rank test to assess sensitivity to change. Seventy-two patients were enrolled in the study. The mean MG-QOL15 score was 15.2 ± 12.2, with α = 0.93 and test-retest correlation = 0.93. Compared with the SF-36, the MG-QOL15 was superior in differentiating patients with different MG types (P = 0.041) and severity (P = 0.004), showed higher sensitivity to change (P = 0.003 for improved and P = 0.024 for worsened patients), and had higher correlations with the MG-Composite (rho = 0.367 vs. -0.213 and -0.154). The Italian version of the MG-QOL15 is valid, reliable, stable, and sensitive to changes. Muscle Nerve 56: 716-720, 2017. © 2016 Wiley Periodicals, Inc.

  7. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    Science.gov (United States)

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  8. Identify, Organize, and Retrieve Items Using Zotero

    Science.gov (United States)

    Clark, Brian; Stierman, John

    2009-01-01

    Librarians build collections. To do this they use tools that help them identify, organize, and retrieve items for the collection. Zotero (zoh-TAIR-oh) is such a tool that helps the user build a library of useful books, articles, web sites, blogs, etc., discovered while surfing online. A visit to Zotero's homepage, www.zotero.org, shows a number of…

  9. Loglinear multidimensional IRT models for polytomously scored Items

    NARCIS (Netherlands)

    Kelderman, Henk

    1988-01-01

    A loglinear item response theory (IRT) model is proposed that relates polytomously scored item responses to a multidimensional latent space. Each item may have a different response function where each item response may be explained by one or more latent traits. Item response functions may follow a

  10. The Effects of Goal Relevance and Perceptual Features on Emotional Items and Associative Memory

    Directory of Open Access Journals (Sweden)

    Wei B. Mao

    2017-07-01

    Showing an emotional item in a neutral background scene often leads to enhanced memory for the emotional item and impaired associative memory for background details. Meanwhile, both top–down goal relevance and bottom–up perceptual features played important roles in memory binding. We conducted two experiments and aimed to further examine the effects of goal relevance and perceptual features on emotional items and associative memory. By manipulating goal relevance (asking participants to categorize only each item image as living or non-living, or to categorize each whole composite picture consisting of item image and background scene as natural scene or manufactured scene) and perceptual features (controlling visual contrast and visual familiarity) in two experiments, we found that both high goal relevance and salient perceptual features (high salience of items vs. high familiarity of items) could promote emotional item memory, but they had different effects on associative memory for emotional items and neutral backgrounds. Specifically, high goal relevance and high perceptual salience of items could jointly impair the associative memory for emotional items and neutral backgrounds, while the effect of item familiarity on associative memory for emotional items would be modulated by goal relevance. High familiarity of items could increase associative memory for negative items and neutral backgrounds only in the low goal relevance condition. These findings suggest that the effect of emotion on associative memory is not only related to attentional capture elicited by emotion, but can also be affected by goal relevance and perceptual features of the stimulus.

  11. Estimating reliability coefficients with heterogeneous item weightings using Stata: A factor based approach

    NARCIS (Netherlands)

    Boermans, M.A.; Kattenberg, M.A.C.

    2011-01-01

    We show how to estimate a Cronbach's alpha reliability coefficient in Stata after running a principal component or factor analysis. Alpha evaluates to what extent items measure the same underlying content when the items are combined into a scale or used for latent variable. Stata allows for testing
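
    The record above concerns Stata; purely as an illustration of the coefficient itself, the following Python sketch computes the standard (equally weighted) Cronbach's alpha from a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy data are assumptions made for this example, and the heterogeneous item weighting discussed by the authors is not reproduced here.

```python
import numpy as np

def cronbach_alpha(scores):
    """Standard Cronbach's alpha for an (n_respondents, k_items) matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Sum of the individual item variances.
    item_variance_sum = scores.var(axis=0, ddof=1).sum()
    # Variance of the total (sum) score across respondents.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)

# Toy data: four respondents answering three items (hypothetical values).
toy_scores = [[2, 3, 3],
              [4, 4, 5],
              [1, 2, 1],
              [5, 4, 5]]
print(round(cronbach_alpha(toy_scores), 3))
```

    When all items are perfectly correlated with equal variances the function returns 1, and it falls toward 0 as the items share less common variance.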

  12. 48 CFR 852.214-72 - Alternate item(s).

    Science.gov (United States)

    2010-10-01

    ... AND FORMS SOLICITATION PROVISIONS AND CONTRACT CLAUSES Texts of Provisions and Clauses 852.214-72... 2008) Bids on []* will be given equal consideration along with bids on []** and any such bids received... [].** * Contracting officer will insert an alternate item that is considered acceptable. ** Contracting officer will...

  13. Macrostructural Treatment of Multi-word Lexical Items

    Directory of Open Access Journals (Sweden)

    Alenka Vrbinc

    2011-05-01

    The paper discusses the macrostructural treatment of multi-word lexical items in mono- and bilingual dictionaries. First, the classification of multi-word lexical items is presented, and special attention is paid to the discussion of compounds – a specific group of multi-word lexical items that is most commonly afforded headword status but whose inclusion in the headword list may also depend on spelling. Then the inclusion of multi-word lexical items in monolingual dictionaries is dealt with in greater detail, while the results of a short survey on the inclusion of five randomly chosen multi-word lexical items in seven English monolingual dictionaries are presented. The proposals as to how to treat these five multi-word lexical items in bilingual dictionaries are presented in the section about the inclusion of multi-word lexical items in bilingual dictionaries. The conclusion is that it is most important to take the users’ needs into consideration and to make any dictionary as user friendly as possible.

  14. Losing Items in the Psychogeriatric Nursing Home

    Directory of Open Access Journals (Sweden)

    J. van Hoof PhD

    2016-09-01

    Introduction: Losing items is a time-consuming occurrence in nursing homes that is ill described. An explorative study was conducted to investigate which items got lost by nursing home residents, and how this affects the residents and family caregivers. Method: Semi-structured interviews and card sorting tasks were conducted with 12 residents with early-stage dementia and 12 family caregivers. Thematic analysis was applied to the outcomes of the sessions. Results: The participants stated that numerous personal items and assistive devices get lost in the nursing home environment, which had various emotional, practical, and financial implications. Significant amounts of time are spent on trying to find items, varying from 1 hr up to a couple of weeks. Numerous potential solutions were identified by the interviewees. Discussion: Losing items often goes together with limitations to the participation of residents. Many family caregivers are reluctant to replace lost items, as these items may get lost again.

  15. Further Investigating Method Effects Associated with Negatively Worded Items on Self-Report Surveys

    Science.gov (United States)

    DiStefano, Christine; Motl, Robert W.

    2006-01-01

    This article used multitrait-multimethod methodology and covariance modeling for an investigation of the presence and correlates of method effects associated with negatively worded items on the Rosenberg Self-Esteem (RSE) scale (Rosenberg, 1989) using a sample of 757 adults. Results showed that method effects associated with negative item phrasing…

  16. ‘Forget me (not)?’ – Remembering forget-items versus un-cued items in directed forgetting

    Directory of Open Access Journals (Sweden)

    Bastian Zwissler

    2015-11-01

    Humans need to be able to selectively control their memories. Here, we investigate the underlying processes in item-method directed forgetting and compare the classic active memory cues in this paradigm with a passive instruction. Typically, individual items are presented and each is followed by either a forget- or remember-instruction. On a surprise test of all items, memory is then worse for to-be-forgotten items (TBF) compared to to-be-remembered items (TBR). This is thought to result from selective rehearsal of TBR, or from active inhibition of TBF, or from both. However, evidence suggests that if a forget instruction initiates active processing, paradoxical effects may also arise. To investigate the underlying mechanisms, four experiments were conducted where un-cued items (UI) were introduced and recognition performance was compared between TBR, TBF and UI stimuli. Accuracy was encouraged via a performance-dependent monetary bonus. Across all experiments, including perceptually fully matched variants, memory accuracy for TBF was reduced compared to TBR, but better than for UI. Moreover, participants used a more conservative response criterion when responding to TBF stimuli. Thus, ironically, the F cue results in active processing, but this does not have inhibitory effects that would impair recognition memory beyond an un-cued baseline condition. This casts doubts on inhibitory accounts of item-method directed forgetting and is also difficult to reconcile with pure selective rehearsal of TBR. While the F-cue does induce active processing, this does not result in particularly successful forgetting. The pattern seems most consistent with the notion of ironic processing.

  17. Sex Differential Item Functioning in the Inventory of Early Development III Social-Emotional Skills

    Science.gov (United States)

    Beaver, Jessica L.; French, Brian F.; Finch, W. Holmes; Ullrich-French, Sarah C.

    2014-01-01

    Social-emotional (SE) skills in the early developmental years of children influence outcomes in psychological, behavioral, and learning domains. The adult ratings of a child's SE skills can be influenced by sex stereotypes. These rating differences could lead to differential conclusions about developmental progress or risk. To ensure that…

  18. A New Functional Health Literacy Scale for Japanese Young Adults Based on Item Response Theory.

    Science.gov (United States)

    Tsubakita, Takashi; Kawazoe, Nobuo; Kasano, Eri

    2017-03-01

    Health literacy predicts health outcomes. Despite concerns surrounding the health of Japanese young adults, to date there has been no objective assessment of health literacy in this population. This study aimed to develop a Functional Health Literacy Scale for Young Adults (funHLS-YA) based on item response theory. Each item in the scale requires participants to choose the most relevant term from 3 choices in relation to a target item, thus assessing objective rather than perceived health literacy. The 20-item scale was administered to 1816 university students and 1751 responded. Cronbach's α coefficient was .73. Difficulty and discrimination parameters of each item were estimated, resulting in the exclusion of 1 item. Some items showed different difficulty parameters for male and female participants, reflecting that some aspects of health literacy may differ by gender. The current 19-item version of funHLS-YA can reliably assess the objective health literacy of Japanese young adults.

  19. Behavioral decoding of working memory items inside and outside the focus of attention.

    Science.gov (United States)

    Mallett, Remington; Lewis-Peacock, Jarrod A

    2018-03-31

    How we attend to our thoughts affects how we attend to our environment. Holding information in working memory can automatically bias visual attention toward matching information. By observing attentional biases on reaction times to visual search during a memory delay, it is possible to reconstruct the source of that bias using machine learning techniques and thereby behaviorally decode the content of working memory. Can this be done when more than one item is held in working memory? There is some evidence that multiple items can simultaneously bias attention, but the effects have been inconsistent. One explanation may be that items are stored in different states depending on the current task demands. Recent models propose functionally distinct states of representation for items inside versus outside the focus of attention. Here, we use behavioral decoding to evaluate whether multiple memory items-including temporarily irrelevant items outside the focus of attention-exert biases on visual attention. Only the single item in the focus of attention was decodable. The other item showed a brief attentional bias that dissipated until it returned to the focus of attention. These results support the idea of dynamic, flexible states of working memory across time and priority. © 2018 New York Academy of Sciences.

  20. Statistical power as a function of Cronbach alpha of instrument questionnaire items.

    Science.gov (United States)

    Heo, Moonseong; Kim, Namhee; Faith, Myles S

    2015-10-14

    In countless clinical trials, outcome measurements rely on instrument questionnaire items, which often suffer from measurement error that in turn affects the statistical power of study designs. The Cronbach alpha or coefficient alpha, here denoted by C(α), can be used as a measure of internal consistency of parallel instrument items that are developed to measure a target unidimensional outcome construct. The scale score for the target construct is often represented by the sum of the item scores. However, power functions based on C(α) have been lacking for various study designs. We formulate a statistical model for parallel items to derive power functions as a function of C(α) under several study designs. To this end, we adopt a fixed true score variance assumption as opposed to the usual fixed total variance assumption. That assumption is critical and practically relevant to show that smaller measurement errors are inversely associated with higher inter-item correlations, and thus that greater C(α) is associated with greater statistical power. We compare the derived theoretical statistical power with empirical power obtained through Monte Carlo simulations for the following comparisons: one-sample comparison of pre- and post-treatment mean differences, two-sample comparison of pre-post mean differences between groups, and two-sample comparison of mean differences between groups. It is shown that C(α) is the same as a test-retest correlation of the scale scores of parallel items, which enables testing significance of C(α). Closed-form power functions and sample size determination formulas are derived in terms of C(α) for all of the aforementioned comparisons. Power functions are shown to be an increasing function of C(α), regardless of the comparison of interest. The derived power functions are well validated by simulation studies that show that the magnitudes of theoretical power are virtually identical to those of the empirical power. Regardless

  1. The Effect of Feedback Delay on Perceptual Category Learning and Item Memory: Further Limits of Multiple Systems.

    Science.gov (United States)

    Stephens, Rachel G; Kalish, Michael L

    2018-02-01

    Delayed feedback during categorization training has been hypothesized to differentially affect 2 systems that underlie learning for rule-based (RB) or information-integration (II) structures. We tested an alternative possibility: that II learning requires more precise item representations than RB learning, and so is harmed more by a delay interval filled with a confusable mask. Experiments 1 and 2 examined the effect of feedback delay on memory for RB and II exemplars, both without and with concurrent categorization training. Without the training, II items were indeed more difficult to recognize than RB items, but there was no detectable effect of delay on item memory. In contrast, with concurrent categorization training, there were effects of both category structure and delayed feedback on item memory, which were related to corresponding changes in category learning. However, we did not observe the critical selective impact of delay on II classification performance that has been shown previously. Our own results were also confirmed in a follow-up study (Experiment 3) involving only categorization training. The selective influence of feedback delay on II learning appears to be contingent on the relative size of subgroups of high-performing participants, and in fact does not support that RB and II category learning are qualitatively different. We conclude that a key part of successfully solving perceptual categorization problems is developing more precise item representations, which can be impaired by delayed feedback during training. More important, the evidence for multiple systems of category learning is even weaker than previously proposed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  2. The co-occurrence of PTSD and dissociation: differentiating severe PTSD from dissociative-PTSD

    DEFF Research Database (Denmark)

    Armour, C.; Karstoft, K. I.; Richardson, J. D.

    2014-01-01

    A dissociative-posttraumatic stress disorder (PTSD) subtype has been included in the DSM-5. However, it is not yet clear whether certain socio-demographic characteristics or psychological/clinical constructs such as comorbid psychopathology differentiate between severe PTSD and dissociative-PTSD. The current study investigated the existence of a dissociative-PTSD subtype and explored whether a number of trauma and clinical covariates could differentiate between severe PTSD alone and dissociative-PTSD. The current study utilized a sample of 432 treatment seeking Canadian military veterans. Participants were assessed with the Clinician Administered PTSD Scale (CAPS) and self-report measures of traumatic life events, depression, and anxiety. CAPS severity scores were created reflecting the sum of the frequency and intensity items from each of the 17 PTSD and 3 dissociation items. The CAPS severity...

  3. The Effect of Error in Item Parameter Estimates on the Test Response Function Method of Linking.

    Science.gov (United States)

    Kaskowitz, Gary S.; De Ayala, R. J.

    2001-01-01

    Studied the effect of item parameter estimation for computation of linking coefficients for the test response function (TRF) linking/equating method. Simulation results showed that linking was more accurate when there was less error in the parameter estimates, and that 15 or 25 common items provided better results than 5 common items under both…

  4. A unified factor-analytic approach to the detection of item and test bias: Illustration with the effect of providing calculators to students with dyscalculia

    Directory of Open Access Journals (Sweden)

    Lee, M. K.

    2016-01-01

    An absence of measurement bias against distinct groups is a prerequisite for the use of a given psychological instrument in scientific research or high-stakes assessment. Factor analysis is the framework explicitly adopted for the identification of such bias when the instrument consists of a multi-test battery, whereas item response theory is employed when the focus narrows to a single test composed of discrete items. Item response theory can be treated as a mild nonlinearization of the standard factor model, and thus the essential unity of bias detection at the two levels merits greater recognition. Here we illustrate the benefits of a unified approach with a real-data example, which comes from a statewide test of mathematics achievement where examinees diagnosed with dyscalculia were accommodated with calculators. We found that items that can be solved by explicit arithmetical computation became easier for the accommodated examinees, but the quantitative magnitude of this differential item functioning (measurement bias) was small.

  5. On random age and remaining lifetime for populations of items

    DEFF Research Database (Denmark)

    Finkelstein, M.; Vaupel, J.

    2015-01-01

    We consider items that are incepted into operation having already a random (initial) age and define the corresponding remaining lifetime. We show that these lifetimes are identically distributed when the age distribution is equal to the equilibrium distribution of the renewal theory. Then we develop the population studies approach to the problem and generalize the setting in terms of stationary and stable populations of items. We obtain new stochastic comparisons for the corresponding population ages and remaining lifetimes that can be useful in applications. Copyright (c) 2014 John Wiley...

  6. Tag-Driven Online Novel Recommendation with Collaborative Item Modeling

    Directory of Open Access Journals (Sweden)

    Fenghuan Li

    2018-04-01

    Online novel recommendation recommends attractive novels according to the preferences and characteristics of users or novels and is increasingly touted as an indispensable service of many online stores and websites. The interests of the majority of users remain stable over a certain period. However, there are broad categories in the initial recommendation list achieved by collaborative filtering (CF). That is to say, it is very possible that there are many inappropriately recommended novels. Meanwhile, most algorithms assume that users can provide an explicit preference. However, this assumption does not always hold, especially in online novel reading. To solve these issues, a tag-driven algorithm with collaborative item modeling (TDCIM) is proposed for online novel recommendation. Online novel reading is different from traditional book marketing and lacks preference rating. In addition, collaborative filtering frequently suffers from the Matthew effect, leading to ignored personalized recommendations and serious long tail problems. Therefore, item-based CF is improved by latent preference rating with a punishment mechanism based on novel popularity. Consequently, a tag-driven algorithm is constructed by means of collaborative item modeling and tag extension. Experimental results show that online novel recommendation is improved greatly by a tag-driven algorithm with collaborative item modeling.
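
    To make the idea of item-based collaborative filtering with a popularity penalty concrete, here is a minimal sketch on implicit (read / not read) data. It is not the TDCIM algorithm: the similarity formula, the `penalty` exponent, and the function names `item_similarity` and `recommend` are assumptions chosen for illustration, and the tag extension step is omitted.

```python
import numpy as np

def item_similarity(interactions, penalty=0.5):
    """Co-occurrence similarity with a popularity penalty on the candidate.

    sim[i, j] = co[i, j] / (pop[i]**(1 - penalty) * pop[j]**penalty).
    penalty = 0.5 gives cosine similarity; larger values push down
    popular candidate items j (a hypothetical punishment mechanism).
    """
    R = np.asarray(interactions, dtype=float)  # users x items, entries 0/1
    co = R.T @ R                               # item-item co-occurrence counts
    pop = np.diag(co).copy()                   # number of readers per item
    denom = np.outer(pop ** (1.0 - penalty), pop ** penalty)
    with np.errstate(divide="ignore", invalid="ignore"):
        sim = np.where(denom > 0, co / denom, 0.0)
    np.fill_diagonal(sim, 0.0)
    return sim

def recommend(interactions, user, top_n=2, penalty=0.5):
    """Rank unread items for one user by summed similarity to read items."""
    R = np.asarray(interactions, dtype=float)
    scores = R[user] @ item_similarity(R, penalty)
    scores[R[user] > 0] = -np.inf              # never re-recommend read items
    return [int(i) for i in np.argsort(-scores)[:top_n]]

# Four users, three novels (toy data); user 3 has read only novel 2.
reads = [[1, 1, 0],
         [1, 1, 0],
         [1, 0, 1],
         [0, 0, 1]]
print(recommend(reads, user=3, top_n=1))
```

    Raising `penalty` above 0.5 lowers the scores of widely read novels relative to niche ones, which is one simple way to counter the Matthew effect mentioned in the abstract.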

  7. Piecewise Polynomial Fitting with Trend Item Removal and Its Application in a Cab Vibration Test

    Directory of Open Access Journals (Sweden)

    Wu Ren

    2018-01-01

    The trend item of a long-term vibration signal is difficult to remove. This paper proposes a piecewise integration method to remove trend items. Examples of direct integration without trend item removal, global integration after piecewise polynomial fitting with trend item removal, and direct integration after piecewise polynomial fitting with trend item removal were simulated. The results showed that direct integration of the fitted piecewise polynomial provided greater acceleration and displacement precision than the other two integration methods. A vibration test was then performed on a special equipment cab. The results indicated that direct integration by piecewise polynomial fitting with trend item removal was highly consistent with the measured signal data. However, the direct integration method without trend item removal resulted in signal distortion. The proposed method can help with frequency domain analysis of vibration signals and modal parameter identification for such equipment.
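
    The trend removal step can be sketched as follows: split the signal into segments, fit a low-order polynomial to each segment by least squares, and subtract the fit. This is only a generic illustration; the segment count, polynomial degree, and the function name `piecewise_detrend` are assumptions, and the paper's integration step from acceleration to velocity and displacement is not shown.

```python
import numpy as np

def piecewise_detrend(signal, n_segments=4, degree=2):
    """Remove a slowly varying trend by per-segment polynomial fitting."""
    signal = np.asarray(signal, dtype=float)
    detrended = np.empty_like(signal)
    for idx in np.array_split(np.arange(signal.size), n_segments):
        # Rescale the segment's time axis to [-1, 1] for a well-conditioned fit.
        x = np.linspace(-1.0, 1.0, idx.size)
        coeffs = np.polyfit(x, signal[idx], degree)
        detrended[idx] = signal[idx] - np.polyval(coeffs, x)
    return detrended

# Synthetic example: a 50 Hz vibration riding on a quadratic drift.
t = np.linspace(0.0, 1.0, 2000)
drift = 3.0 * t**2 + 2.0 * t + 1.0
vibration = 0.1 * np.sin(2.0 * np.pi * 50.0 * t)
cleaned = piecewise_detrend(drift + vibration)
print(float(np.abs(cleaned).max()))
```

    Each segment must contain more samples than the polynomial degree. Fitting segments independently can leave small discontinuities at the segment boundaries, so the segment length is a trade-off between tracking the trend and preserving the vibration content.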

  8. Item selection via Bayesian IRT models.

    Science.gov (United States)

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
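
    For readers unfamiliar with the graded response model named above, here is a minimal sketch of its category probabilities for a single item: the cumulative probability of responding in category k or higher is a logistic function of discrimination times (latent trait minus threshold), and category probabilities are differences of adjacent cumulative probabilities. The parameter values are made up, and the mixture prior on discrimination and difficulty described in the abstract is not modelled here.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait of the respondent.
    a: item discrimination.
    thresholds: ordered difficulties b_1 < ... < b_{K-1} for K categories.
    """
    b = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(Y >= k | theta) for k = 1..K-1.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(Y >= 0) = 1 and P(Y >= K) = 0, then take differences.
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]

# Illustrative values only: a four-category item with symmetric thresholds.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```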

  9. Post-encoding emotional arousal enhances consolidation of item memory, but not reality-monitoring source memory.

    Science.gov (United States)

    Wang, Bo; Sun, Bukuan

    2017-03-01

    The current study examined whether the effect of post-encoding emotional arousal on item memory extends to reality-monitoring source memory and, if so, whether the effect depends on emotionality of learning stimuli and testing format. In Experiment 1, participants encoded neutral words and imagined or viewed their corresponding object pictures. Then they watched a neutral, positive, or negative video. The 24-hour delayed test showed that emotional arousal had little effect on both item memory and reality-monitoring source memory. Experiment 2 was similar except that participants encoded neutral, positive, and negative words and imagined or viewed their corresponding object pictures. The results showed that positive and negative emotional arousal induced after encoding enhanced consolidation of item memory, but not reality-monitoring source memory, regardless of emotionality of learning stimuli. Experiment 3, identical to Experiment 2 except that participants were tested only on source memory for all the encoded items, still showed that post-encoding emotional arousal had little effect on consolidation of reality-monitoring source memory. Taken together, regardless of emotionality of learning stimuli and regardless of testing format of source memory (conjunction test vs. independent test), the facilitatory effect of post-encoding emotional arousal on item memory does not generalize to reality-monitoring source memory.

  10. Item-level psychometrics of the ADL instrument of the Korean National Survey on persons with physical disabilities.

    Science.gov (United States)

    Hong, Ickpyo; Lee, Mi Jung; Kim, Moon Young; Park, Hae Yean

    2017-10-01

    The aim of this study is to investigate the psychometrics of the 12 items of an instrument assessing activities of daily living (ADL) using an item response theory model. A total of 648 adults with physical disabilities and having difficulties in ADLs were retrieved from the 2014 Korean National Survey on People with Disabilities. The psychometric testing included factor analysis, internal consistency, precision, and differential item functioning (DIF) across categories including sex, older age, marital status, and physical impairment area. The sample had a mean age of 69.7 years old (SD = 13.7). The majority of the sample had lower extremity impairments (62.0%) and had at least 2.1 chronic conditions. The instrument demonstrated unidimensional construct and good internal consistency (Cronbach's alpha = 0.95). The instrument precisely estimated person measures within a wide range of theta values (-2.22 logits  5.0%). Our findings indicate that the dressing item would need to be modified to improve its psychometrics. Overall, the ADL instrument demonstrates good psychometrics, and thus, it may be used as a standardized instrument for measuring disability in rehabilitation contexts. However, the findings are limited to adults with physical disabilities. Future studies should replicate psychometric testing for survey respondents with other disorders and for children.

  11. Software Note: Using BILOG for Fixed-Anchor Item Calibration

    Science.gov (United States)

    DeMars, Christine E.; Jurich, Daniel P.

    2012-01-01

    The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…

  12. Inventions on presenting textual items in Graphical User Interface

    OpenAIRE

    Mishra, Umakant

    2014-01-01

    Although a GUI largely replaces textual descriptions by graphical icons, the textual items are not completely removed. The textual items are inevitably used in window titles, message boxes, help items, menu items and popup items. Textual items are necessary for communicating messages that are beyond the limitation of graphical messages. However, it is necessary to harness the textual items on the graphical interface in such a way that they complement each other to produce the best effect. One...

  13. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  14. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  15. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  16. Feed mechanism and method for feeding minute items

    Science.gov (United States)

    Stringer, Timothy Kent [Bucyrus, KS]; Yerganian, Simon Scott [Lee's Summit, MO]

    2009-10-20

    A feeding mechanism and method for feeding minute items, such as capacitors, resistors, or solder preforms. The mechanism is adapted to receive a plurality of the randomly-positioned and randomly-oriented extremely small or minute items, and to isolate, orient, and position one or more of the items in a specific repeatable pickup location wherefrom they may be removed for use by, for example, a computer-controlled automated assembly machine. The mechanism comprises a sliding shelf adapted to receive and support the items; a wiper arm adapted to achieve a single even layer of the items; and a pushing arm adapted to push the items into the pickup location. The mechanism can be adapted for providing the items with a more exact orientation, and can also be adapted for use in a liquid environment.

  17. Memory for Items and Relationships among Items Embedded in Realistic Scenes: Disproportionate Relational Memory Impairments in Amnesia

    Science.gov (United States)

    Hannula, Deborah E.; Tranel, Daniel; Allen, John S.; Kirchhoff, Brenda A.; Nickel, Allison E.; Cohen, Neal J.

    2014-01-01

    Objective: The objective of this study was to examine the dependence of item memory and relational memory on medial temporal lobe (MTL) structures. Patients with amnesia, who either had extensive MTL damage or damage that was relatively restricted to the hippocampus, were tested, as was a matched comparison group. Disproportionate relational memory impairments were predicted for both patient groups, and those with extensive MTL damage were also expected to have impaired item memory. Method: Participants studied scenes, and were tested with interleaved two-alternative forced-choice probe trials. Probe trials were either presented immediately after the corresponding study trial (lag 1), five trials later (lag 5), or nine trials later (lag 9) and consisted of the studied scene along with a manipulated version of that scene in which one item was replaced with a different exemplar (item memory test) or was moved to a new location (relational memory test). Participants were to identify the exact match of the studied scene. Results: As predicted, patients were disproportionately impaired on the test of relational memory. Item memory performance was marginally poorer among patients with extensive MTL damage, but both groups were impaired relative to matched comparison participants. Impaired performance was evident at all lags, including the shortest possible lag (lag 1). Conclusions: The results are consistent with the proposed role of the hippocampus in relational memory binding and representation, even at short delays, and suggest that the hippocampus may also contribute to successful item memory when items are embedded in complex scenes. PMID:25068665

  18. Applying Hierarchical Model Calibration to Automatically Generated Items.

    Science.gov (United States)

    Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I.

    This study explored the application of hierarchical model calibration as a means of reducing, if not eliminating, the need for pretesting of automatically generated items from a common item model prior to operational use. Ultimately the successful development of automatic item generation (AIG) systems capable of producing items with highly similar…

  19. 41 CFR 101-27.404 - Review of items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Review of items. 101-27.404 Section 101-27.404 Public Contracts and Property Management Federal Property Management...-Elimination of Items From Inventory § 101-27.404 Review of items. Except for standby or reserve stocks, items...

  20. Towards an authoring system for item construction

    NARCIS (Netherlands)

    Rikers, Jos H.A.N.

    1988-01-01

    The process of writing test items is analyzed, and a blueprint is presented for an authoring system for test item writing to reduce invalidity and to structure the process of item writing. The developmental methodology is introduced, and the first steps in the process are reported. A historical

  1. Not saying I am happy does not mean I am not: cultural influences on responses to positive affect items in the CES-D.

    Science.gov (United States)

    Jang, Yuri; Kwag, Kyung Hwa; Chiriboga, David A

    2010-11-01

    Given the emphasis on modesty and self-effacement in Asian societies, the present study explored differential item responses for 2 positive affect items (5 = Hopeful and 8 = Happy) on a short form of the Center for Epidemiologic Studies-Depression scale. The samples consisted of elderly non-Hispanic Whites (n = 450), Korean Americans (n = 519), and Koreans (n = 2,030). Multiple Indicator Multiple Cause models were estimated to identify the impact of group membership on responses to the positive affect items while controlling for the latent trait of depressive symptoms. The data revealed that Koreans and Korean Americans were less likely than non-Hispanic Whites to endorse the positive affect items. Compared with Korean Americans who were more acculturated to mainstream American culture, those who were less acculturated were less likely to endorse the positive affect items. Our findings support the notion that the way in which people endorse depressive symptoms is substantially influenced by cultural orientation. These findings call into question the common use of simple mean comparisons and a universal cutoff point across diverse cultural groups.

  2. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    Science.gov (United States)

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  3. Development of the Quantitative Reasoning Items on the National Survey of Student Engagement

    Directory of Open Access Journals (Sweden)

    Amber D. Dumford

    2015-01-01

    As society’s needs for quantitative skills become more prevalent, college graduates require quantitative skills regardless of their career choices. Therefore, it is important that institutions assess students’ engagement in quantitative activities during college. This study chronicles the process taken by the National Survey of Student Engagement (NSSE) to develop items that measure students’ participation in quantitative reasoning (QR) activities. On the whole, findings across the quantitative and qualitative analyses suggest good overall properties for the developed QR items. The items show great promise to explore and evaluate the frequency with which college students participate in QR-related activities. Each year, hundreds of institutions across the United States and Canada participate in NSSE, and, with the addition of these new items on the core survey, every participating institution will have information on this topic. Our hope is that these items will spur conversations on campuses about students’ use of quantitative reasoning activities.

  4. 10 CFR 835.605 - Labeling items and containers.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Labeling items and containers. 835.605 Section 835.605... items and containers. Except as provided at § 835.606, each item or container of radioactive material... information to permit individuals handling, using, or working in the vicinity of the items or containers to...

  5. Obtaining a Proportional Allocation by Deleting Items

    NARCIS (Netherlands)

    Dorn, B.; de Haan, R.; Schlotter, I.; Röthe, J.

    2017-01-01

    We consider the following control problem on fair allocation of indivisible goods. Given a set I of items and a set of agents, each having strict linear preference over the items, we ask for a minimum subset of the items whose deletion guarantees the existence of a proportional allocation in the

  6. Item-Based Top-N Recommendation Algorithms

    Science.gov (United States)

    2003-01-20

    basket of items, utilized by many e-commerce sites, cannot take advantage of pre-computed user-to-user similarities. Finally, even though the...not discriminate between items that are present in frequent itemsets and items that are not, while still maintaining the computational advantages of...453219 0.02% 7.74 ccard 42629 68793 398619 0.01% 9.35 ecommerce 6667 17491 91222 0.08% 13.68 em 8002 1648 769311 5.83% 96.14 ml 943 1682 100000 6.31

  7. Working memory for sequences of temporal durations reveals a volatile single-item store

    Directory of Open Access Journals (Sweden)

    Sanjay G Manohar

    2016-10-01

    When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered shows recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritising some items over others can be accounted for by a focus of attention, maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones, of varying durations (200 ms to 2 s). Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e. the smallest memory set size. Thus, when irrelevant information was present, the benefit of having only one item in memory is attenuated. Finally we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were

  8. A Review of Classical Methods of Item Analysis.

    Science.gov (United States)

    French, Christine L.

    Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…

  9. Platelet-Rich Plasma Preparation Types Show Impact on Chondrogenic Differentiation, Migration, and Proliferation of Human Subchondral Mesenchymal Progenitor Cells.

    Science.gov (United States)

    Kreuz, Peter Cornelius; Krüger, Jan Philipp; Metzlaff, Sebastian; Freymann, Undine; Endres, Michaela; Pruss, Axel; Petersen, Wolf; Kaps, Christian

    2015-10-01

    To evaluate the chondrogenic potential of platelet concentrates on human subchondral mesenchymal progenitor cells (MPCs) as assessed by histomorphometric analysis of proteoglycans and type II collagen. Furthermore, the migratory and proliferative effect of platelet concentrates were assessed. Platelet-rich plasma (PRP) was prepared using preparation kits (Autologous Conditioned Plasma [ACP] Kit [Arthrex, Naples, FL]; Regen ACR-C Kit [Regen Lab, Le Mont-Sur-Lausanne, Switzerland]; and Dr.PRP Kit [Rmedica, Seoul, Republic of Korea]) by apheresis (PRP-A) and by centrifugation (PRP-C). In contrast to clinical application, freeze-and-thaw cycles were subsequently performed to activate platelets and to prevent medium coagulation by residual fibrinogen in vitro. MPCs were harvested from the cortico-spongious bone of femoral heads. Chondrogenic differentiation of MPCs was induced in high-density pellet cultures and evaluated by histochemical staining of typical cartilage matrix components. Migration of MPCs was assessed using a chemotaxis assay, and proliferation activity was measured by DNA content. MPCs cultured in the presence of 5% ACP, Regen, or Dr.PRP formed fibrous tissue, whereas MPCs stimulated with 5% PRP-A or PRP-C developed compact and dense cartilaginous tissue rich in type II collagen and proteoglycans. All platelet concentrates significantly (ACP, P = .00041; Regen, P = .00029; Dr.PRP, P = .00051; PRP-A, P platelet concentrates but one (Dr.PRP, P = .63) showed a proliferative effect on MPCs, as shown by significant increases (ACP, P = .027; Regen, P = .0029; PRP-A, P = .00021; and PRP-C, P = .00069) in DNA content. Platelet concentrates obtained by different preparation methods exhibit different potentials to stimulate chondrogenic differentiation, migration, and proliferation of MPCs. Platelet concentrates obtained by commercially available preparation kits failed to induce chondrogenic differentiation of MPCs, whereas highly standardized PRP

  10. Electronics. Criterion-Referenced Test (CRT) Item Bank.

    Science.gov (United States)

    Davis, Diane, Ed.

    This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…

  11. Item response theory scoring and the detection of curvilinear relationships.

    Science.gov (United States)

    Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

    2017-03-01

    Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process (wherein respondents agree only to items that reflect their own standing on the measured variable) as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  12. 26 CFR 301.6501(o)-3 - Partnership items.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 18 2010-04-01 2010-04-01 false Partnership items. 301.6501(o)-3 Section 301... § 301.6501(o)-3 Partnership items. (a) Partnership item defined. For purposes of section 6501(o) (as it..., and § 301.6511(g)-1, the term “partnership item” means— (1) Any item required to be taken into account...

  13. Evaluation of five guidelines for option development in multiple-choice item-writing.

    Science.gov (United States)

    Martínez, Rafael J; Moreno, Rafael; Martín, Irene; Trigo, M Eva

    2009-05-01

    This paper evaluates certain guidelines for writing multiple-choice test items. The analysis of the responses of 5013 subjects to 630 items from 21 university classroom achievement tests suggests that an option should not differ in terms of heterogeneous content because such error has a slight but harmful effect on item discrimination. This also occurs with the "None of the above" option when it is the correct one. In contrast, results do not show the supposedly negative effects of a different-length option, the use of specific determiners, or the use of the "All of the above" option, which not only decreases difficulty but also improves discrimination when it is the correct option.

  14. A novel multi-item joint replenishment problem considering multiple type discounts.

    Directory of Open Access Journals (Sweden)

    Ligang Cui

    In business replenishment, discount offers of multi-item may either provide different discount schedules with a single discount type, or provide schedules with multiple discount types. The paper investigates the joint effects of multiple discount schemes on the decisions of multi-item joint replenishment. In this paper, a joint replenishment problem (JRP) model, considering three discount offers (all-unit discount, incremental discount, total volume discount) simultaneously, is constructed to determine the basic cycle time and joint replenishment frequencies of multi-item. To solve the proposed problem, a heuristic algorithm is proposed to find the optimal solutions and the corresponding total cost of the JRP model. A numerical experiment is performed to test the algorithm, and the computational results of JRPs under different discount combinations show different significance in the replenishment cost reduction.

  15. A Balance Sheet for Educational Item Banking.

    Science.gov (United States)

    Hiscox, Michael D.

    Educational item banking presents observers with a considerable paradox. The development of test items from scratch is viewed as wasteful, a luxury in times of declining resources. On the other hand, item banking has failed to become a mature technology despite large amounts of money and the efforts of talented professionals. The question of which…

  16. Item and test analysis to identify quality multiple choice questions (MCQs) from an assessment of medical students of Ahmedabad, Gujarat

    Directory of Open Access Journals (Sweden)

    Sanju Gajjar

    2014-01-01

    Background: Multiple choice questions (MCQs) are frequently used to assess students in different educational streams for their objectivity and wide reach of coverage in less time. However, the MCQs to be used must be of quality which depends upon its difficulty index (DIF I), discrimination index (DI) and distracter efficiency (DE). Objective: To evaluate MCQs or items and develop a pool of valid items by assessing with DIF I, DI and DE and also to revise/ store or discard items based on obtained results. Settings: Study was conducted in a medical school of Ahmedabad. Materials and Methods: An internal examination in Community Medicine was conducted after 40 hours teaching during 1st MBBS which was attended by 148 out of 150 students. Total 50 MCQs or items and 150 distractors were analyzed. Statistical Analysis: Data was entered and analyzed in MS Excel 2007 and simple proportions, mean, standard deviations, coefficient of variation were calculated and unpaired t test was applied. Results: Out of 50 items, 24 had "good to excellent" DIF I (31 - 60%) and 15 had "good to excellent" DI (> 0.25). Mean DE was 88.6% considered as ideal/ acceptable and non functional distractors (NFDs) were only 11.4%. Mean DI was 0.14. Poor DI (< 0.15) with negative DI in 10 items indicates poor preparedness of students and some issues with framing of at least some of the MCQs. Increased proportion of NFDs (incorrect alternatives selected by < 5% students) in an item decrease DE and makes it easier. There were 15 items with 17 NFDs, while rest items did not have any NFD with mean DE of 100%. Conclusion: Study emphasizes the selection of quality MCQs which truly assess the knowledge and are able to differentiate the students of different abilities in correct manner.
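
    The three indices discussed above can be sketched for a single item as follows. This is an illustrative computation under stated assumptions, not the authors' spreadsheet: the upper and lower groups are taken as the top and bottom 27% of students by total score (a common convention that the abstract does not confirm), the 5% cutoff for a non-functional distractor follows the abstract, and the function name and sample responses are made up.

```python
from collections import Counter

def item_indices(responses, key, totals, options="ABCD",
                 group_frac=0.27, nfd_cutoff=0.05):
    """Return (difficulty %, discrimination, NFD count, distractor efficiency %).

    responses: option chosen by each student for this item.
    key: the correct option.
    totals: each student's total test score, used to form the two groups.
    """
    n = len(responses)
    correct = [r == key for r in responses]
    difficulty = 100.0 * sum(correct) / n          # percent answering correctly

    # Rank students by total score and compare the top and bottom groups.
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    g = max(1, round(group_frac * n))
    upper, lower = order[:g], order[-g:]
    discrimination = (sum(correct[i] for i in upper)
                      - sum(correct[i] for i in lower)) / g

    # A distractor is non-functional if fewer than nfd_cutoff of students chose it.
    counts = Counter(responses)
    distractors = [o for o in options if o != key]
    nfd = sum(counts[o] / n < nfd_cutoff for o in distractors)
    efficiency = 100.0 * (1 - nfd / len(distractors))
    return difficulty, discrimination, nfd, efficiency

# Made-up data: 20 students listed from highest to lowest total score.
responses = list("BBBBBBBBABCBAACABCAC")
totals = list(range(20, 0, -1))
dif_i, di, nfd, de = item_indices(responses, key="B", totals=totals)
```

    Here option D is never chosen, so it counts as one non-functional distractor out of three, which is the same per-item logic behind the mean DE figures reported in the abstract.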

  17. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    Science.gov (United States)

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  18. Non-ignorable missingness item response theory models for choice effects in examinee-selected items.

    Science.gov (United States)

    Liu, Chen-Wei; Wang, Wen-Chung

    2017-11-01

    Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.

  19. Clinical Validation of the Nursing Outcome "Swallowing Status" in People with Stroke: Analysis According to the Classical and Item Response Theories.

    Science.gov (United States)

    Oliveira-Kumakura, Ana Railka de Souza; de Araujo, Thelma Leite; Costa, Alice Gabrielle de Sousa; Cavalcante, Tahissa Frota; Lopes, Marcos Venícios de Oliveira; Carvalho, Emilia Campos

    2017-09-19

    To clinically validate the nursing outcome "Swallowing status". The adjustment of the nursing outcome was investigated according to the Classical and Item Response Theories. The models were compared regarding information loss, goodness-of-fit, and differential item functioning. Stability and internal consistency were examined. The nursing outcome has the best fit in the generalized partial credit model with different discrimination parameters, which demonstrated the unidimensionality of the outcome. Strong correlations among the scores of each indicator were observed. There was no differential item functioning of the outcome indicators. The scale presented high internal consistency (Cronbach's α = .954) and stability (> .800). This study presents a valid nursing outcome. The relevance for clinical practice is more accurate monitoring of sensitivity to an intervention. © 2017 NANDA International, Inc.

  20. A Case Study on an Item Writing Process: Use of Test Specifications, Nature of Group Dynamics, and Individual Item Writers' Characteristics

    Science.gov (United States)

    Kim, Jiyoung; Chi, Youngshin; Huensch, Amanda; Jun, Heesung; Li, Hongli; Roullion, Vanessa

    2010-01-01

    This article discusses a case study of an item writing process, reflecting on our practical experience in an item development project. The purpose of the article is to share the lessons we learned from that experience, with the aim of demystifying the item writing process. The study investigated three issues that naturally emerged during the project: how item writers use…

  1. Psychological Literacy Weakly Differentiates Students by Discipline and Year of Enrolment

    Science.gov (United States)

    Heritage, Brody; Roberts, Lynne D.; Gasson, Natalie

    2016-01-01

    Psychological literacy, a construct developed to reflect the types of skills graduates of a psychology degree should possess and be capable of demonstrating, has recently been scrutinized in terms of its measurement adequacy. The recent development of a multi-item measure encompassing the facets of psychological literacy has provided the potential for improved validity in measuring the construct. We investigated the known-groups validity of this multi-item measure of psychological literacy to examine whether psychological literacy could predict (a) students’ course of enrolment and (b) students’ year of enrolment. Five hundred and fifteen undergraduate psychology students, 87 psychology/human resource management students, and 83 speech pathology students provided data. In the first year cohort, the reflective processes (RPs) factor significantly predicted psychology and psychology/human resource management course enrolment, although no facets significantly differentiated between psychology and speech pathology enrolment. Within the second year cohort, generic graduate attributes (GGAs) and RPs differentiated psychology and speech pathology course enrolment. GGAs differentiated first-year and second-year psychology students, with second-year students more likely to have higher scores on this factor. Due to weak support for known-groups validity, further measurement refinements are recommended to improve the construct’s utility. PMID:26909058

  2. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  3. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  4. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    the defined boundaries. This ensures the accuracy of the classification. Third, when item threshold parameters vary a bit, the scoring rubrics and the items need to be reviewed to make the threshold parameters similar across items. This is because one important design criterion of the learning progression-based items is that ideally, a student should be at the same level across items, which means that the item threshold parameters (d1, d2 and d3) should be similar across items. To design a learning progression-based science assessment, we need to understand whether the assessment measures a single construct or several constructs and how items are associated with the constructs being measured. Results from dimension analyses indicate that items of different carbon transforming processes measure different aspects of the carbon cycle construct. However, items of different practices assess the same construct. In general, there are high correlations among different processes or practices. It is not clear whether the strong correlations are due to the inherent links among these process/practice dimensions or due to the fact that the student sample does not show much variation in these process/practice dimensions. Future data are needed to examine the dimensionalities in terms of process/practice in detail. Finally, based on item characteristics analysis, recommendations are made to write more discriminative CR items and better OMC, MTF options. Item writers can follow these recommendations to write better learning progression-based items.

  5. Item-level informant discrepancies across obese-overweight children and their parents on the PedsQL™ 4.0 instrument: an iterative hybrid ordinal logistic regression.

    Science.gov (United States)

    Jafari, Peyman; Allahyari, Elahe; Salarzadeh, Mina; Bagheri, Zahra

    2016-01-01

    Child obesity has become a major health concern worldwide. In order to provide successful intervention strategies, it is necessary to understand how obese-overweight children and their parents perceive obesity and its consequences on child's health-related quality of life (HRQoL). This study aimed to assess measurement equivalence of the PedsQL™ 4.0 across obese-overweight children and their parents. The items in the PedsQL™ 4.0 were analysed for differential item functioning (DIF) across obese-overweight children and their parents using an iterative hybrid ordinal logistic regression/item response theory approach. The sample included 647 overweight-obese children and their parents, who completed child and parent reports of the PedsQL™ 4.0, respectively. Overall, 17 out of 23 (74%) items were flagged with DIF across two groups: eight items exhibited uniform DIF and nine items non-uniform DIF. In addition, parents of obese children rated the child's HRQoL significantly lower than their children in all domains of the PedsQL™ 4.0, and this finding did not change whether or not items with uniform DIF were included. Although obese-overweight children and their parents interpret items of the PedsQL™ 4.0 in a conceptually different manner, removing or retaining DIF items in the subscales had no significant effects on group differences. Accordingly, it appears that observed differences in HRQoL scores across child and parent reports are a true difference and not a reflection of measurement artefact.

  6. Does remembering emotional items impair recall of same-emotion items?

    Science.gov (United States)

    Sison, Jo Ann G; Mather, Mara

    2007-04-01

    In the part-set cuing effect, cuing a subset of previously studied items impairs recall of the remaining noncued items. This experiment reveals that cuing participants with previously-studied emotional pictures (e.g., fear-evoking pictures of people) can impair recall of pictures involving the same emotion but different content (e.g., fear-evoking pictures of animals). This indicates that new events can be organized in memory using emotion as a grouping function to create associations. However, whether new information is organized in memory along emotional or nonemotional lines appears to be a flexible process that depends on people's current focus. Mentioning in the instructions that the pictures were either amusement- or fear-related led to memory impairment for pictures with the same emotion as cued pictures, whereas mentioning that the pictures depicted either animals or people led to memory impairment for pictures with the same type of actor.

  7. Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

    Science.gov (United States)

    Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

    2012-01-01

    Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

  8. The short- and long-term fates of memory items retained outside the focus of attention.

    Science.gov (United States)

    LaRocque, Joshua J; Eichenbaum, Adam S; Starrett, Michael J; Rose, Nathan S; Emrich, Stephen M; Postle, Bradley R

    2015-04-01

    When a test of working memory (WM) requires the retention of multiple items, a subset of them can be prioritized. Recent studies have shown that, although prioritized (i.e., attended) items are associated with active neural representations, unprioritized (i.e., unattended) memory items can be retained in WM despite the absence of such active representations, and with no decrement in their recognition if they are cued later in the trial. These findings raise two intriguing questions about the nature of the short-term retention of information outside the focus of attention. First, when the focus of attention shifts from items in WM, is there a loss of fidelity for those unattended memory items? Second, could the retention of unattended memory items be accomplished by long-term memory mechanisms? We addressed the first question by comparing the precision of recall of attended versus unattended memory items, and found a significant decrease in precision for unattended memory items, reflecting a degradation in the quality of those representations. We addressed the second question by asking subjects to perform a WM task, followed by a surprise memory test for the items that they had seen in the WM task. Long-term memory for unattended memory items from the WM task was not better than memory for items that had remained selected by the focus of attention in the WM task. These results show that unattended WM representations are degraded in quality and are not preferentially represented in long-term memory, as compared to attended memory items.

  9. Item Information in the Rasch Model

    NARCIS (Netherlands)

    Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

    1988-01-01

    Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling

  10. Hes1-deficient mice show precocious differentiation of Paneth cells in the small intestine

    International Nuclear Information System (INIS)

    Suzuki, Katsumasa; Fukui, Hirokazu; Kayahara, Takahisa; Sawada, Mitsutaka; Seno, Hiroshi; Hiai, Hiroshi; Kageyama, Ryoichiro; Okano, Hideyuki; Chiba, Tsutomu

    2005-01-01

    We have previously shown that Hes1 is expressed both in putative epithelial stem cells just above Paneth cells and in the crypt base columnar cells between Paneth cells, while Hes1 is completely absent in Paneth cells. This study was undertaken to clarify the role of Hes1 in Paneth cell differentiation, using Hes1-knockout (KO) newborn (P0) mice. Electron microscopy revealed premature appearance of distinct cells containing cytoplasmic granules in the intervillous region in Hes1-KO P0 mice, whereas those cells were absent in wild-type (WT) P0 mice. In Hes1-KO P0 mice, the gene expressions of cryptdins, exclusively present in Paneth cells, were all enhanced compared with WT P0 mice. Immunohistochemistry demonstrated increased number of both lysozyme-positive and cryptdin-4-positive cells in the small intestinal epithelium of Hes1-KO P0 mice as compared to WT P0 mice. Thus, Hes1 appears to have an inhibitory role in Paneth cell differentiation in the small intestine

  11. A hierarchy of distress and invariant item ordering in the General Health Questionnaire-12.

    Science.gov (United States)

    Doyle, F; Watson, R; Morgan, K; McBride, O

    2012-06-01

    Invariant item ordering (IIO) is defined as the extent to which items have the same ordering (in terms of item difficulty/severity - i.e. demonstrating whether items are difficult [rare] or less difficult [common]) for each respondent who completes a scale. IIO is therefore crucial for establishing a scale hierarchy that is replicable across samples, but no research has demonstrated IIO in scales of psychological distress. We aimed to determine if a hierarchy of distress with IIO exists in a large general population sample who completed a scale measuring distress. Data from 4107 participants who completed the 12-item General Health Questionnaire (GHQ-12) from the Northern Ireland Health and Social Wellbeing Survey 2005-6 were analysed. Mokken scaling was used to determine the dimensionality and hierarchy of the GHQ-12, and items were investigated for IIO. All items of the GHQ-12 formed a single, strong unidimensional scale (H=0.58). IIO was found for six of the 12 items (H-trans=0.55), and these symptoms reflected the following hierarchy: anhedonia, concentration, participation, coping, decision-making and worthlessness. The cross-sectional analysis needs replication. The GHQ-12 showed a hierarchy of distress, but IIO is only demonstrated for six of the items, and the scale could therefore be shortened. Adopting brief, hierarchical scales with IIO may be beneficial in both clinical and research contexts. Copyright © 2011 Elsevier B.V. All rights reserved.
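
    The scalability coefficient H reported in the abstract above can be illustrated with a minimal sketch. It assumes dichotomously scored items (0/1), which is a simplification: the study analysed GHQ-12 responses and does not state the scoring used here. The function name `loevinger_h` and the toy data are hypothetical. H is computed as the sum of observed inter-item covariances divided by the sum of the maximum covariances attainable given the item marginals.

```python
def loevinger_h(rows):
    """Loevinger's H for a set of dichotomous (0/1) items.

    rows: list of respondents, each a list of 0/1 item scores.
    Assumes no item is endorsed by everyone or by no one, since the
    denominator would otherwise be zero.
    """
    n = len(rows)
    k = len(rows[0])
    # Item popularities (proportion endorsing each item).
    p = [sum(r[j] for r in rows) / n for j in range(k)]
    cov_sum = 0.0
    max_sum = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            # Observed proportion endorsing both items.
            p_both = sum(r[i] * r[j] for r in rows) / n
            cov_sum += p_both - p[i] * p[j]
            # Maximum covariance given the marginals: the rarer item
            # is endorsed only by people who endorse the more common one.
            lo, hi = min(p[i], p[j]), max(p[i], p[j])
            max_sum += lo * (1 - hi)
    return cov_sum / max_sum


# Toy data forming a perfect Guttman pattern, so H should equal 1.
guttman = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
h_value = loevinger_h(guttman)
```

    In Mokken scaling practice, a dedicated package would normally be used; this sketch only shows what the coefficient measures.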

  12. Work ability as prognostic risk marker of disability pension : Single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; Rhenen, van W.; Groothoff, J.W.; Klink, van der J.J.L.; Twisk, W.R.; Heymans, M.W.

    2014-01-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP.

  13. CERN Running Club – Sale of Items

    CERN Multimedia

    CERN Running club

    2018-01-01

    The CERN Running Club is organising a sale of items on 26 June from 11:30 to 13:00 in the entry area of Restaurant 2 (504 R-202). The items for sale are souvenir prizes from past Relay Races and comprise backpacks, thermoses, towels, gloves and caps, lamps, long-sleeve winter shirts, and windproof vests. All items will be sold at 5 CHF.

  14. An item response theory analysis of the Executive Interview and development of the EXIT8: A Project FRONTIER Study.

    Science.gov (United States)

    Jahn, Danielle R; Dressel, Jeffrey A; Gavett, Brandon E; O'Bryant, Sid E

    2015-01-01

    The Executive Interview (EXIT25) is an effective measure of executive dysfunction, but may be inefficient due to the time it takes to complete 25 interview-based items. The current study aimed to examine psychometric properties of the EXIT25, with a specific focus on determining whether a briefer version of the measure could comprehensively assess executive dysfunction. The current study applied a graded response model (a type of item response theory model for polytomous categorical data) to identify items that were most closely related to the underlying construct of executive functioning and best discriminated between varying levels of executive functioning. Participants were 660 adults ages 40 to 96 years living in West Texas, who were recruited through an ongoing epidemiological study of rural health and aging, called Project FRONTIER. The EXIT25 was the primary measure examined. Participants also completed the Trail Making Test and Controlled Oral Word Association Test, among other measures, to examine the convergent validity of a brief form of the EXIT25. Eight items were identified that provided the majority of the information about the underlying construct of executive functioning; total scores on these items were associated with total scores on other measures of executive functioning and were able to differentiate between cognitively healthy, mildly cognitively impaired, and demented participants. In addition, cutoff scores were recommended based on sensitivity and specificity of scores. A brief, eight-item version of the EXIT25 may be an effective and efficient screening for executive dysfunction among older adults.
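
    As a rough illustration of the graded response model mentioned in the abstract above, the following sketch computes the probability of each ordered response category for one item. The function name `grm_category_probs` and the parameter values are hypothetical and are not taken from the study; the logistic form with a discrimination parameter `a` and ordered thresholds is the standard Samejima formulation.

```python
from math import exp


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta: latent trait level of the respondent.
    a: item discrimination parameter.
    thresholds: increasing list b_1 < ... < b_{K-1} for an item
        with K ordered categories.
    """
    def logistic(x):
        return 1.0 / (1.0 + exp(-x))

    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical item with four categories, evaluated at theta = 0.
probs = grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])
```

    Items with larger `a` separate respondents more sharply around their thresholds, which is the sense in which the abstract speaks of items that "best discriminated" between levels of executive functioning.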

  15. Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; van Rhenen, W.; Groothoff, J.W.; van der Klink, J.J.L.; Twisk, J.W.R.; Heymans, M.W.

    2014-01-01

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  16. Work ability as prognostic risk marker of disability pension : single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, Corne A. M.; van Rhenen, Willem; Groothoff, Johan W.; van der Klink, Jac J. L.; Twisk, Jos W. R.; Heymans, Martijn W.

    Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP. Methods This

  17. Translation Fidelity of Psychological Scales: An Item Response Theory Analysis of an Individualism-Collectivism Scale.

    Science.gov (United States)

    Bontempo, Robert

    1993-01-01

    Describes a method for assessing the quality of translations based on item response theory (IRT). Results from the IRT technique with French and Chinese versions of a scale measuring individualism-collectivism for samples of 250 U.S., 357 French, and 290 Chinese undergraduates show how several biased items are detected. (SLD)

  18. Psychometric properties of the Chinese version of resilience scale specific to cancer: an item response theory analysis.

    Science.gov (United States)

    Ye, Zeng Jie; Liang, Mu Zi; Zhang, Hao Wei; Li, Peng Fei; Ouyang, Xue Ren; Yu, Yuan Liang; Liu, Mei Ling; Qiu, Hong Zhong

    2018-06-01

    Classical test theory has been used to develop and validate the 25-item Resilience Scale Specific to Cancer (RS-SC) in Chinese patients with cancer. This study was designed to provide additional information about the discriminative value of the individual items, tested with an item response theory analysis. A two-parameter graded response model was fitted to examine whether any of the items of the RS-SC exhibited problems with the ordering and steps of thresholds, as well as the ability of items to discriminate between patients with different resilience levels, using item characteristic curves. A sample of 214 Chinese patients with a cancer diagnosis was analyzed. The established three-dimension structure of the RS-SC was confirmed. Several items showed problematic thresholds or discrimination ability and require further revision. Some problematic items should be refined, and a short form of the RS-SC may be feasible in clinical settings in order to reduce the burden on patients. However, the generalizability of these findings warrants further investigation.

  19. Use of indicator items to monitor marine debris on a New Jersey beach from 1991 to 1996

    Science.gov (United States)

    Ribic, C.A.

    1998-01-01

    The US National Marine Debris Monitoring Program is using indicator items from beach surveys to identify whether amounts of marine debris are changing over time. Indicator items were selected through expert opinion and assumed to reflect the trend of all debris. We used monthly data from a 1991-1996 study of debris on a New Jersey beach to determine if indicator and non-indicator items showed similar trends. Total indicator debris levels did not change; this was true regardless of probable source. Non-indicator debris increased about 40% annually. Plastic non-indicator items increased regardless of whether items were whole items, cigarette filters, or pieces. Of the whole items, almost 50% were plastic lids, cups, and utensils, and about 25% were drug-related paraphernalia, tobacco-related products, plastic stirrers, pull rings, and fireworks. When indicator items are used in a monitoring programme to reflect total debris patterns, concordance of trends in indicator and non-indicator debris should be checked.

  20. Binomial test models and item difficulty

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1979-01-01

    In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally

  1. Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient and the use in practice is considered feasible with little administration time, offering standardized and routine patient monitoring. Before an item bank can be used as CAT, the psychometric properties of the item bank have to be examined. Therefore, the objective was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy. Cross-sectional study. 805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items). Unidimensionality was examined by Confirmatory Factor Analysis, and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed. The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, first factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameter range: -4.28 to 2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI. The psychometric properties of the DF-PROMIS-PF item bank are sufficient. The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of

  2. How did Danish students solve the PISA CBAS items?

    DEFF Research Database (Denmark)

    Sørensen, Helene; Andersen, Annemarie Møller

    2009-01-01

    ’ and boys’ answers. Twelve items were chosen for focus group interviews with two groups of students – three girls and three boys. The analysis shows that the students need other competencies than in the paper-and-pencil test and another problem solving strategy. In the Danish context this may be one...

  3. Vegetable parenting practices scale: Item response modeling analyses

    Science.gov (United States)

    Our objective was to evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We al...

  4. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    Science.gov (United States)

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  5. Sources of interference in item and associative recognition memory.

    Science.gov (United States)

    Osth, Adam F; Dennis, Simon

    2015-04-01

    A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved.

  6. Development and psychometric characteristics of the SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks and short forms and the SCI-QOL Bladder Complications scale.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Tate, Denise G; Spungen, Ann M; Kirshblum, Steven C

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Bladder Management Difficulties and Bowel Management Difficulties item banks and Bladder Complications scale. Using a mixed-methods design, a pool of items assessing bladder and bowel-related concerns were developed using focus groups with individuals with spinal cord injury (SCI) and SCI clinicians, cognitive interviews, and item response theory (IRT) analytic approaches, including tests of model fit and differential item functioning. Thirty-eight bladder items and 52 bowel items were tested at the University of Michigan, Kessler Foundation Research Center, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters VA Medical Center, Bronx, NY. Seven hundred fifty-seven adults with traumatic SCI. The final item banks demonstrated unidimensionality (Bladder Management Difficulties CFI=0.965; RMSEA=0.093; Bowel Management Difficulties CFI=0.955; RMSEA=0.078) and acceptable fit to a graded response IRT model. The final calibrated Bladder Management Difficulties bank includes 15 items, and the final Bowel Management Difficulties item bank consists of 26 items. Additionally, 5 items related to urinary tract infections (UTI) did not fit with the larger Bladder Management Difficulties item bank but performed relatively well independently (CFI=0.992, RMSEA=0.050) and were thus retained as a separate scale. The SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks are psychometrically robust and are available as computer adaptive tests or short forms. The SCI-QOL Bladder Complications scale is a brief, fixed-length outcomes instrument for individuals with a UTI.

  7. CTTITEM: SAS macro and SPSS syntax for classical item analysis.

    Science.gov (United States)

    Lei, Pui-Wa; Wu, Qiong

    2007-08-01

    This article describes the functions of a SAS macro and an SPSS syntax that produce common statistics for conventional item analysis including Cronbach's alpha, item difficulty index (p-value or item mean), and item discrimination indices (D-index, point biserial and biserial correlations for dichotomous items and item-total correlation for polytomous items). These programs represent an improvement over the existing SAS and SPSS item analysis routines in terms of completeness and user-friendliness. To promote routine evaluations of item qualities in instrument development of any scale, the programs are available at no charge for interested users. The program codes along with a brief user's manual that contains instructions and examples are downloadable from suen.ed.psu.edu/-pwlei/plei.htm.
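
    The statistics named in the abstract above can be sketched in a few lines. This is not the CTTITEM macro or its SPSS counterpart; it is a minimal, hypothetical illustration in Python of three of the listed quantities for dichotomous items: Cronbach's alpha, the item difficulty index (proportion correct), and a point-biserial discrimination index. The function names and toy data are assumptions, and the point-biserial is computed here against the total of the remaining items (a "corrected" item-total correlation), which is one common choice and may differ from what the programs report.

```python
from math import sqrt
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents' item-score lists."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_difficulty(rows, j):
    """Proportion of respondents answering dichotomous item j correctly."""
    return mean(r[j] for r in rows)


def point_biserial(rows, j):
    """Pearson correlation of item j with the total of the other items."""
    x = [r[j] for r in rows]
    y = [sum(r) - r[j] for r in rows]
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(pvariance(x) * pvariance(y))


# Toy data: four respondents, three dichotomous items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(scores)
difficulty_first = item_difficulty(scores, 0)
discrimination_first = point_biserial(scores, 0)
```

    Because alpha depends only on the ratio of variances, using population variances throughout (as here) gives the same value as using sample variances throughout.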

  8. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients

    DEFF Research Database (Denmark)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J. B.

    2017-01-01

    on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). METHODS: In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients...... model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study...... sample sizes with about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. CONCLUSION: A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient...

  9. Negative effects of item repetition on source memory

    OpenAIRE

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L.; Johnson, Marcia K.

    2012-01-01

    In the present study, we explored how item repetition affects source memory for new item–feature associations (picture–location or picture–color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item re...

  10. Development and validation of an item response theory-based Social Responsiveness Scale short form.

    Science.gov (United States)

    Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T

    2017-09-01

    Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.

  11. Three controversies over item disclosure in medical licensure examinations

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2015-09-01

    Full Text Available In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and their potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations – (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure – by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration.

  12. Method of data mining including determining multidimensional coordinates of each item using a predetermined scalar similarity value for each item pair

    Science.gov (United States)

    Meyers, Charles E.; Davidson, George S.; Johnson, David K.; Hendrickson, Bruce A.; Wylie, Brian N.

    1999-01-01

    A method of data mining represents related items in a multidimensional space. Distance between items in the multidimensional space corresponds to the extent of relationship between the items. The user can select portions of the space to perceive. The user also can interact with and control the communication of the space, focusing attention on aspects of the space of most interest. The multidimensional spatial representation allows more ready comprehension of the structure of the relationships among the items.

  13. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  14. 38 CFR 3.1606 - Transportation items.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Transportation items. 3... Burial Benefits § 3.1606 Transportation items. The transportation costs of those persons who come within... shipment. (6) Cost of transportation by common carrier including amounts paid as Federal taxes. (7) Cost of...

  15. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  16. The basics of item response theory using R

    CERN Document Server

    Baker, Frank B

    2017-01-01

    This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...

  17. Science Library of Test Items. Volume Twenty-One. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 2.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  18. Using an FSDS-R Item to Screen for Sexually Related Distress: A MsFLASH Analysis

    Science.gov (United States)

    Carpenter, Janet S; Reed, Susan D; Guthrie, Katherine A; Larson, Joseph C; Newton, Katherine M; Lau, R Jane; Learman, Lee A; Shifren, Jan L

    2015-01-01

    Introduction: The Female Sexual Distress Scale-Revised (FSDS-R) was created and validated to assess distress associated with impaired sexual function, but it is lengthy for use in clinical practice and research when assessing sexual function is not a primary objective. Aim: The study aims to evaluate whether a single item from the FSDS-R could be identified to use to screen midlife women for bothersome diminution in sexual function based on three criteria: (i) highly correlated with total scores; (ii) correlated with commonly assessed domains of female sexual functioning; and (iii) able to differentiate between women reporting high and low sexual concerns during the prior month. Methods: Data from 93 midlife women were collected by the Menopause Strategies Finding Lasting Answers to Symptoms and Health (MsFLASH) research network. Main Outcome Measures: Women completed the FSDS-R, Female Sexual Function Index (FSFI), and Menopausal Quality of Life Scale (MENQOL). Those who reported a change in the past month on the MENQOL sexual were categorized into a high sexual concerns group, while all others were categorized into a low sexual concerns group. Results: Women were an average of 54.6 years old (SD 3.1) and mostly Caucasian (77.4%), college educated (60.2%), married/living as married (64.5%), and postmenopausal (79.6%). The FSDS-R item number 1 “Distressed about sex life” was: (i) highly correlated with FSDS-R total scores (r = 0.90); (ii) moderately correlated with FSFI total scores (r = −0.38) and FSFI desire (r = −0.37) and satisfaction domains (r = −0.40); and (iii) showed one of the largest mean differences between high and low sexual concerns groups (P Guthrie KA, Larson JC, Newton KM, Lau RJ, Learman LA, and Shifren JL. Using an FSDS-R item to screen for sexually related distress: A MsFLASH analysis. Sex Med 2015;3:7–13. PMID:25844170

  19. Schwarzian conditions for linear differential operators with selected differential Galois groups

    International Nuclear Information System (INIS)

    Abdelaziz, Y; Maillard, J-M

    2017-01-01

    We show that non-linear Schwarzian differential equations emerging from covariance symmetry conditions imposed on linear differential operators with hypergeometric function solutions can be generalized to arbitrary order linear differential operators with polynomial coefficients having selected differential Galois groups. For order three and order four linear differential operators we show that this pullback invariance up to conjugation eventually reduces to symmetric powers of an underlying order-two operator. We give, precisely, the conditions to have modular correspondences solutions for such Schwarzian differential equations, which was an open question in a previous paper. We analyze in detail a pullbacked hypergeometric example generalizing modular forms, that ushers a pullback invariance up to operator homomorphisms. We finally consider the more general problem of the equivalence of two different order-four linear differential Calabi–Yau operators up to pullbacks and conjugation, and clarify the cases where they have the same Yukawa couplings. (paper)

  20. Schwarzian conditions for linear differential operators with selected differential Galois groups

    Science.gov (United States)

    Abdelaziz, Y.; Maillard, J.-M.

    2017-11-01

    We show that non-linear Schwarzian differential equations emerging from covariance symmetry conditions imposed on linear differential operators with hypergeometric function solutions can be generalized to arbitrary order linear differential operators with polynomial coefficients having selected differential Galois groups. For order three and order four linear differential operators we show that this pullback invariance up to conjugation eventually reduces to symmetric powers of an underlying order-two operator. We give, precisely, the conditions to have modular correspondences solutions for such Schwarzian differential equations, which was an open question in a previous paper. We analyze in detail a pullbacked hypergeometric example generalizing modular forms, that ushers a pullback invariance up to operator homomorphisms. We finally consider the more general problem of the equivalence of two different order-four linear differential Calabi-Yau operators up to pullbacks and conjugation, and clarify the cases where they have the same Yukawa couplings.

  1. AN EFFICIENT DATA MINING METHOD TO FIND FREQUENT ITEM SETS IN LARGE DATABASE USING TR- FCTM

    Directory of Open Access Journals (Sweden)

    Saravanan Suba

    2016-01-01

    Full Text Available Mining association rules in large databases is one of the most popular data mining techniques for business decision makers. Discovering frequent item sets is the core process in association rule mining. Numerous algorithms are available in the literature to find frequent patterns. Apriori and FP-tree are the most common methods for finding frequent items. Apriori finds significant frequent items using candidate generation, which requires a larger number of database scans. FP-tree uses two database scans to find significant frequent items without using candidate generation. The proposed TR-FCTM (Transaction Reduction-Frequency Count Table Method) discovers significant frequent items by generating the full set of candidates once to form a frequency count table with one database scan. Experimental results show that TR-FCTM outperforms Apriori and FP-tree.
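
    The general idea of building a frequency count table in a single database scan can be illustrated as follows. This is only a sketch of that idea and not the TR-FCTM algorithm itself: the transaction reduction step is not shown, and the function name `frequent_itemsets`, the `max_size` cap, and the sample baskets are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset up to max_size in one pass, then filter by support."""
    counts = Counter()
    for transaction in transactions:  # the single database scan
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            # Every subset of this transaction is a candidate; tally it.
            counts.update(combinations(items, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Hypothetical market-basket data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
frequent = frequent_itemsets(baskets, min_support=2)
```

    Enumerating all subsets grows exponentially with transaction length, which is why this sketch caps the itemset size; how the published method controls that cost is not stated in the abstract.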

  2. Calibration of Automatically Generated Items Using Bayesian Hierarchical Modeling.

    Science.gov (United States)

    Johnson, Matthew S.; Sinharay, Sandip

    For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…

  3. ACER Chemistry Test Item Collection. ACER Chemtic Year 12.

    Science.gov (United States)

    Australian Council for Educational Research, Hawthorn.

    The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…

  4. Bilinguals Show Weaker Lexical Access during Spoken Sentence Comprehension

    Science.gov (United States)

    Shook, Anthony; Goldrick, Matthew; Engstler, Caroline; Marian, Viorica

    2015-01-01

    When bilinguals process written language, they show delays in accessing lexical items relative to monolinguals. The present study investigated whether this effect extended to spoken language comprehension, examining the processing of sentences with either low or high semantic constraint in both first and second languages. English-German…

  5. A Fast Recommender System for Cold User Using Categorized Items

    Directory of Open Access Journals (Sweden)

    Hamid Jazayeriy

    2018-01-01

    Full Text Available In recent years, recommender systems (RSs) have provided considerable benefit to users. RSs reduce the time a user needs to reach the desired results. The main issue for RSs is the presence of cold users, who are less active and whose preferences are more difficult to detect. The aim of this study is to provide a new way to improve recall and precision in recommender systems for cold users. Based on the available item categories, the prioritization of the proposed items is improved before they are presented to the cold user. The results show that, in addition to increased processing speed, recall and precision improve acceptably.

  6. Counterfeit and Fraudulent Items - Mitigating the risk

    International Nuclear Information System (INIS)

    Tannenbaum, Marc

    2011-01-01

    This presentation (slides) provides an overview of the industry's challenges and activities. Firstly, it outlines the differences between counterfeit, fraudulent, suspect, and also substandard items. Notice is given that items could be found not to meet the standard, but the difference in the intent to deceive with counterfeit and fraudulent items is the critical element. Examples from other industries are used which also rely heavily on the assurance of quality for safety. It also informs that EPRI has just completed a report in October 2009 in coordination with other US government agencies and industry organizations; this report, entitled Counterfeit, Substandard and Fraudulent Items, number 1019163, is available for free on the EPRI web site. As a follow-up to this report, EPRI is developing a CFSI Database; any country interested in a collaborative agreement is invited to use and contribute to the database information. Finally, it stresses the importance of the oversight of contractors, training to raise the awareness of the employees and the inspectors, and having a response plan for identified items

  7. Construct validity of the items on the Stroke Specific Quality of Life (SS-QOL) questionnaire that evaluate the participation component of the International Classification of Functioning, Disability and Health.

    Science.gov (United States)

    Silva, Soraia Micaela; Corrêa, Fernanda Ishida; Pereira, Gabriela Santos; Faria, Christina Danielli Coelho de Morais; Corrêa, João Carlos Ferrari

    2018-01-01

    Analyze the construct validity and internal consistency of the Stroke Specific Quality of Life (SS-QOL) items that address the participation component of the ICF as well as analyze the ceiling and floor effects. One hundred subjects were analyzed: 85 community-dwelling and 15 institutionalized individuals. The analysis of construct validity was performed using classic psychometrics: (1) the comparison of known groups (individuals without restriction to participation vs. those with restriction to participation) using the Mann-Whitney test and (2) convergent validity - correlation between the scores on the SS-QOL items that address participation and the subscale scores of measures used to evaluate the similar constructs and concepts [the Short-Form Health Survey (SF-36), Functional Independence Measure (FIM) and grip strength test]. Spearman's correlation coefficients were calculated for this analysis. Cronbach's α was used for the analysis of internal consistency and both the ceiling and floor effects were analyzed. The level of significance for all analyses was α = 0.05. The a priori hypotheses regarding construct validity were partially demonstrated, as only five of the eight domains exhibited positive moderate to strong correlations (r > 0.40) with measures that address constructs similar to those addressed on the SS-QOL questionnaire. The items demonstrated adequate internal consistency and are capable of differentiating individuals with and without restriction to participation. The ceiling and floor effects were considered adequate for the total SS-QOL score, but beyond acceptable standards for some domains. The 26 items of the SS-QOL questionnaire measure a multidimensional construct and therefore do not only address participation. However, the items demonstrated adequate internal consistency and are capable of differentiating individuals with and without restriction to participation. Implications for rehabilitation The 26 items of the SS

  8. Changes in the nutritional quality of fast-food items marketed at restaurants, 2010 v. 2013.

    Science.gov (United States)

    Soo, Jackie; Harris, Jennifer L; Davison, Kirsten K; Williams, David R; Roberto, Christina A

    2018-03-27

    To examine the nutritional quality of menu items promoted in four (US) fast-food restaurant chains (McDonald's, Burger King, Wendy's, Taco Bell) in 2010 and 2013. Menu items pictured on signs and menu boards were recorded at 400 fast-food restaurants across the USA. The Nutrient Profile Index (NPI) was used to calculate overall nutrition scores for items (higher scores indicate greater nutritional quality) and was dichotomized to denote healthier v. less healthy items. Changes over time in NPI scores and energy of promoted foods and beverages were analysed using linear regression. Four hundred fast-food restaurants (McDonald's, Burger King, Wendy's, Taco Bell; 100 locations per chain). NPI of fast-food items marketed at fast-food restaurants. Promoted foods and beverages on general menu boards and signs remained below the 'healthier' cut-off at both time points. On general menu boards, pictured items became modestly healthier from 2010 to 2013, increasing (mean (se)) by 3·08 (0·16) NPI score points (Prestaurants showed limited improvements in nutritional quality in 2013 v. 2010.

  9. Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults

    Directory of Open Access Journals (Sweden)

    Shinichiro Tomitaka

    2017-02-01

    Full Text Available Background: Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods: Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The patterns of total score distribution and item responses were analyzed using graphical analysis and an exponential regression model. Results: The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while the “none of the time” response was not related to this exponential pattern. Discussion: The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
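
    An exponential pattern of this kind is often checked by regressing the logarithm of the frequencies on the score, since an exponential curve becomes a straight line on a log scale. The sketch below shows that approach under stated assumptions: the function name `fit_exponential` is hypothetical, the frequencies are synthetic and noise-free rather than MIDUS data, and the score range of 1 to 24 assumes each of the six items is scored 0 to 4, which the abstract does not state.

```python
import numpy as np

def fit_exponential(scores, frequencies):
    """Fit frequencies ~ a * exp(-rate * score) by least squares on log frequencies."""
    slope, intercept = np.polyfit(scores, np.log(frequencies), 1)
    return np.exp(intercept), -slope

# Synthetic, noise-free frequencies for illustration only (not MIDUS data).
scores = np.arange(1, 25)  # total scores above the lowest value
frequencies = 500.0 * np.exp(-0.25 * scores)
a, rate = fit_exponential(scores, frequencies)
```

    The lowest score is left out of the fit here because the abstract reports that the lower end of the distribution departs from the exponential pattern.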

  10. Utilizing Response Time Distributions for Item Selection in CAT

    Science.gov (United States)

    Fan, Zhewen; Wang, Chun; Chang, Hua-Hua; Douglas, Jeffrey

    2012-01-01

    Traditional methods for item selection in computerized adaptive testing only focus on item information without taking into consideration the time required to answer an item. As a result, some examinees may receive a set of items that take a very long time to finish, and information is not accrued as efficiently as possible. The authors propose two…

  11. Refreshing memory traces: thinking of an item improves retrieval from visual working memory.

    Science.gov (United States)

    Souza, Alessandra S; Rerko, Laura; Oberauer, Klaus

    2015-03-01

    This article provides evidence that refreshing, a hypothetical attention-based process operating in working memory (WM), improves the accessibility of visual representations for recall. "Thinking of" one of several concurrently active representations is assumed to refresh its trace in WM, protecting the representation from being forgotten. The link between refreshing and WM performance, however, has only been tenuously supported by empirical evidence. Here, we controlled which and how often individual items were refreshed in a color reconstruction task by presenting cues prompting participants to think of specific WM items during the retention interval. We show that the frequency with which an item is refreshed improves recall of this item from visual WM. Our study establishes a role of refreshing in recall from visual WM and provides a new method for studying the impact of refreshing on the amount of information we can keep accessible for ongoing cognition. © 2014 New York Academy of Sciences.

  12. Item analysis and evaluation in the examinations in the faculty of ...

    African Journals Online (AJOL)

    2014-11-05

    Nov 5, 2014 ... Key words: Classical test theory, item analysis, item difficulty, item discrimination, item response theory, reliability ... the probability of answering an item correctly or of attaining ..... A Monte Carlo comparison of item and person.

  13. Constraint Differentiation

    DEFF Research Database (Denmark)

    Mödersheim, Sebastian Alexander; Basin, David; Viganò, Luca

    2010-01-01

    We introduce constraint differentiation, a powerful technique for reducing search when model-checking security protocols using constraint-based methods. Constraint differentiation works by eliminating certain kinds of redundancies that arise in the search space when using constraints to represent...... results show that constraint differentiation substantially reduces search and considerably improves the performance of OFMC, enabling its application to a wider class of problems....

  14. Why are the Mathematics National Examination Items Difficult and What Is Teachers’ Strategy to Overcome It?

    Directory of Open Access Journals (Sweden)

    Heri Retnawati

    2017-07-01

    Full Text Available The quality of national examination items plays an enormous role in identifying students’ mastery of competencies and their difficulties. This study aims to identify the difficult items in the Junior High School Mathematics National Examination, to find the factors that cause students’ difficulty, and to reveal the strategies that the teachers and the students might implement in order to overcome them. The study is phenomenological research using mixed methods. The data were collected using documentation of students’ responses and focus group discussion (FGD) with teachers. The data analysis was conducted using the Miles & Huberman steps. The results of the study showed that 4 of the 40 test items were difficult for the students. The students’ difficulties were a lack of concept understanding, difficulties in calculating, difficulties in selecting information, being deceived by the distractors, being unaccustomed to completing complex and non-integer test items, and completing contextual test items that have been presented in the form of figures or narrative texts.

  15. An NCME Instructional Module on Polytomous Item Response Theory Models

    Science.gov (United States)

    Penfield, Randall David

    2014-01-01

    A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

  16. Examining the Effect of Reverse Worded Items on the Factor Structure of the Need for Cognition Scale.

    Directory of Open Access Journals (Sweden)

    Xijuan Zhang

    Full Text Available Reverse worded (RW) items are often used to reduce or eliminate acquiescence bias, but there is a rising concern about their harmful effects on the covariance structure of the scale. Therefore, results obtained via traditional covariance analyses may be distorted. This study examined the effect of the RW items on the factor structure of the abbreviated 18-item Need for Cognition (NFC) scale using confirmatory factor analysis. We modified the scale to create three revised versions, varying from no RW items to all RW items. We also manipulated the type of the RW items (polar opposite vs. negated). To each of the four scales, we fit four previously developed models. The four models included a 1-factor model, a 2-factor model distinguishing between positively worded (PW) items and RW items, and two 2-factor models, each with one substantive factor and one method factor. Results showed that the number and type of the RW items affected the factor structure of the NFC scale. Consistent with previous research findings, for the original NFC scale, which contains both PW and RW items, the 1-factor model did not have good fit. In contrast, for the revised scales that had no RW items or all RW items, the 1-factor model had reasonably good fit. In addition, for the scale with polar opposite and negated RW items, the factor model with a method factor among the polar opposite items had considerably better fit than the 1-factor model.

  17. 41 CFR 101-27.204 - Types of shelf-life items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Types of shelf-life items...-Management of Shelf-Life Materials § 101-27.204 Types of shelf-life items. Shelf-life items are classified as nonextendable (Type I) and extendable (Type II). Type I items have a definite storage life after which the item...

  18. Constructing the 32-item Fitness-to-Drive Screening Measure.

    Science.gov (United States)

    Medhizadah, Shabnam; Classen, Sherrilene; Johnson, Andrew M

    2018-04-01

    The Fitness-to-Drive Screening Measure © (FTDS) enables proxies to identify at-risk older drivers via 54 driving-related items, but may be too lengthy for widespread uptake. We reduced the number of items in the FTDS and validated the shorter measure, using 200 caregiver responses. Exploratory factor analysis and classical test theory techniques were used to determine the most interpretable factor model and the minimum number of items to be used for predicting fitness to drive. The extent to which the shorter FTDS predicted the results of the 54-item FTDS was evaluated through correlational analysis. A three-factor model best represented the empirical data. Classical test theory techniques led to the development of the 32-item FTDS. The 32-item FTDS was highly correlated (r = .99, p = .05) with the FTDS. The 32-item FTDS may provide raters with a faster and more efficient way to identify at-risk older drivers.

  19. Tailored Cloze: Improved with Classical Item Analysis Techniques.

    Science.gov (United States)

    Brown, James Dean

    1988-01-01

    The reliability and validity of a cloze procedure used as an English-as-a-second-language (ESL) test in China were improved by applying traditional item analysis and selection techniques. The 'best' test items were chosen on the basis of item facility and discrimination indices, and were administered as a 'tailored cloze.' 29 references listed.…

  20. The Large Margin Mechanism for Differentially Private Maximization

    OpenAIRE

    Chaudhuri, Kamalika; Hsu, Daniel; Song, Shuang

    2014-01-01

    A basic problem in the design of privacy-preserving algorithms is the private maximization problem: the goal is to pick an item from a universe that (approximately) maximizes a data-dependent function, all under the constraint of differential privacy. This problem has been used as a sub-routine in many privacy-preserving algorithms for statistics and machine-learning. Previous algorithms for this problem are either range-dependent---i.e., their utility diminishes with the size of the universe...
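For orientation, the private maximization problem can be sketched with the standard exponential mechanism, one of the earlier baseline approaches. This is not the Large Margin Mechanism of the record above. The function names, and the assumption that the score function has sensitivity 1 by default, are illustrative.

```python
import math
import random
from typing import List, Optional, Sequence


def exponential_mechanism_probs(
    scores: Sequence[float], epsilon: float, sensitivity: float = 1.0
) -> List[float]:
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def private_argmax(
    scores: Sequence[float],
    epsilon: float,
    sensitivity: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample the index of one item according to the exponential mechanism."""
    rng = rng or random.Random()
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical data-dependent scores for a universe of three items.
    print(exponential_mechanism_probs([10.0, 8.0, 1.0], epsilon=1.0))
    print(private_argmax([10.0, 8.0, 1.0], epsilon=1.0, rng=random.Random(0)))
```

Larger values of epsilon concentrate the probability on the highest-scoring item, at the cost of a weaker privacy guarantee.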

  1. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Directory of Open Access Journals (Sweden)

    Suttida Rakkapao

    2016-10-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test’s distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.

  2. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
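The three-parameter logistic model mentioned in this record has a standard closed form, sketched below. The symbol names (theta for ability, a for discrimination, b for difficulty, c for guessing) follow the usual textbook convention and are assumptions here; the sketch omits the optional scaling constant of 1.7 and does not reproduce the study's fitted values.

```python
import math


def three_pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model.

    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical item: discrimination 1.2, difficulty 0.5, guessing 0.2.
    for theta in (-3.0, -1.0, 0.5, 2.0, 4.0):
        print(theta, round(three_pl(theta, a=1.2, b=0.5, c=0.2), 3))
```

At theta equal to b the probability is halfway between c and 1, and for very low ability it approaches the guessing parameter c.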

  3. Cross-cultural validity of the Spanish version of PHQ-9 among pregnant Peruvian women: a Rasch item response theory analysis.

    Science.gov (United States)

    Zhong, Qiuyue; Gelaye, Bizu; Fann, Jesse R; Sanchez, Sixto E; Williams, Michelle A

    2014-04-01

    We sought to evaluate the validity of the Spanish language version of the patient health questionnaire-9 (PHQ-9) depression scale in a large sample of pregnant Peruvian women using Rasch item response theory (IRT) approaches. We further sought to examine the appropriateness of the response formats, reliability and potential differential item functioning (DIF) by maternal age, educational attainment and employment status. This cross-sectional study was conducted among 1520 pregnant women in Lima, Peru. A structured interview was used to collect information on demographic characteristics and PHQ-9 items. Data from the PHQ-9 were fitted to the Rasch IRT model and tested for appropriate category ordering, the assumptions of unidimensionality and local independence, item fit, reliability and presence of DIF. The Spanish language version of PHQ-9 demonstrated unidimensionality, local independence, and acceptable fit for the Rasch IRT model. However, we detected disordered response categories for the original four response categories. After collapsing "more than half the days" and "nearly every day", the response categories ordered properly and the PHQ-9 fit the Rasch IRT model. The PHQ-9 had moderate internal consistency (person separation index, PSI=0.72). Additionally, the items of PHQ-9 were free of DIF with regard to age, educational attainment, and employment status. The Spanish language version of the PHQ-9 was shown to have item properties of an effective screening instrument. Collapsing rating scale categories and reconstructing a three-point Likert scale for all items improved the fit of the instrument. Future studies are warranted to establish new cutoff scores and criterion validity of the three-point Likert scale response options for the Spanish language version of the PHQ-9.

  4. 41 CFR 101-26.605 - Items other than petroleum products and electronic items available from the Defense Logistics...

    Science.gov (United States)

    2010-07-01

    ... petroleum products and electronic items available from the Defense Logistics Agency. 101-26.605 Section 101... available from the Defense Logistics Agency. Agencies required to use GSA supply sources should also use... Logistics Agency, the catalog will contain only those items in Federal supply classification classes which...

  5. Extending item response theory to online homework

    Directory of Open Access Journals (Sweden)

    Gerd Kortemeyer

    2014-05-01

    Item response theory (IRT) becomes an increasingly important tool when analyzing “big data” gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over a wide range with respect to model assumptions and introduced noise. Item difficulty is also robust, but over a narrower range.

  6. Editorial Changes and Item Performance: Implications for Calibration and Pretesting

    Directory of Open Access Journals (Sweden)

    Heather Stoffel

    2014-11-01

    Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.

  7. Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

    Science.gov (United States)

    Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

    2016-03-12

    Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.

  8. Detection and validation of unscalable item score patterns using item response theory: an illustration with Harter's Self-Perception Profile for Children.

    Science.gov (United States)

    Meijer, Rob R; Egberink, Iris J L; Emons, Wilco H M; Sijtsma, Klaas

    2008-05-01

    We illustrate the usefulness of person-fit methodology for personality assessment. For this purpose, we use person-fit methods from item response theory. First, we give a nontechnical introduction to existing person-fit statistics. Second, we analyze data from Harter's (1985) Self-Perception Profile for Children (Harter, 1985) in a sample of children ranging from 8 to 12 years of age (N = 611) and argue that for some children, the scale scores should be interpreted with care and caution. Combined information from person-fit indexes and from observation, interviews, and self-concept theory showed that similar score profiles may have a different interpretation. For some children in the sample, item scores did not adequately reflect their trait level. Based on teacher interviews, this was found to be due most likely to a less developed self-concept and/or problems understanding the meaning of the questions. We recommend investigating the scalability of score patterns when using self-report inventories to help the researcher interpret respondents' behavior correctly.

  9. Measurement invariance across educational levels and gender in 12-item Zarit Burden Interview (ZBI) on caregivers of people with dementia.

    Science.gov (United States)

    Lin, Chung-Ying; Ku, Li-Jung Elizabeth; Pakpour, Amir H

    2017-11-01

    The Zarit Burden Interview (ZBI) is a commonly used self-report to assess caregiver burden. A 12-item short form of the ZBI has been developed; however, its measurement invariance has not been examined across some different demographics. It is unclear whether different genders and educational levels of a population interpret the ZBI items similarly. Therefore, this study aimed to examine the measurement invariance of the 12-item ZBI across gender and educational levels in a Taiwanese sample. Caregivers who had a family member with dementia (n = 270) completed the ZBI through telephone interviews. Three confirmatory factor analysis (CFA) models were conducted: Model 1 was the configural model, Model 2 constrained all factor loadings, Model 3 constrained all factor loadings and item intercepts. Multiple group CFAs and the differential item functioning (DIF) contrast under Rasch analyses were used to detect measurement invariance across males (n = 100) and females (n = 170) and across educational levels of junior high schools and below (n = 86) and senior high schools and above (n = 183). The fit index differences between models supported the measurement invariance across gender and across educational levels (∆ comparative fit index (CFI) = -0.010 and 0.003; ∆ root mean square error of approximation (RMSEA) = -0.006 to 0.004). No substantial DIF contrast was found across gender and educational levels (value = -0.36 to 0.29). The ZBI is appropriate for combined use and for comparisons in caregivers across gender and different educational levels in Taiwan.

  10. On multidimensional item response theory -- a coordinate free approach

    OpenAIRE

    Antal, Tamás

    2007-01-01

    A coordinate system free definition of complex structure multidimensional item response theory (MIRT) for dichotomously scored items is presented. The point of view taken emphasizes the possibilities and subtleties of understanding MIRT as a multidimensional extension of the "classical" unidimensional item response theory models. The main theorem of the paper is that every monotonic MIRT model looks the same; they are all trivial extensions of univariate item response theory.

  11. Item-level and subscale-level factoring of Biggs' Learning Process Questionnaire (LPQ) in a mainland Chinese sample.

    Science.gov (United States)

    Sachs, J; Gao, L

    2000-09-01

    The learning process questionnaire (LPQ) has been the source of intensive cross-cultural study. However, an item-level factor analysis of all the LPQ items simultaneously has never been reported. Rather, items within each subscale have been factor analysed to establish subscale unidimensionality and justify the use of composite subscale scores. It was of major interest to see if the six logically constructed item groups of the LPQ would be supported by empirical evidence. Additionally, it was of interest to compare the consistency of the reliability and correlational structure of the LPQ subscales in our study with those of previous cross-cultural studies. Confirmatory factor analysis was used to fit the six-factor item level model and to fit five representative subscale level factor models. A total of 1070 students between the ages of 15 to 18 years was drawn from a representative selection of 29 classes from within 15 secondary schools in Guangzhou, China. Males and females were almost equally represented. The six-factor item level model of the LPQ seemed to fit reasonably well, thus supporting the six dimensional structure of the LPQ and justifying the use of composite subscale scores for each LPQ dimension. However, the reliability of many of these subscales was low. Furthermore, only two subscale-level factor models showed marginally acceptable fit. Substantive considerations supported an oblique three-factor model. Because the LPQ subscales often show low internal consistency reliability, experimental and correlational studies that have used these subscales as dependent measures have been disappointing. It is suggested that some LPQ items should be revised and other items added to improve the inventory's overall psychometric properties.

  12. Asenapine effects on individual Young Mania Rating Scale items in bipolar disorder patients with acute manic or mixed episodes: a pooled analysis

    Directory of Open Access Journals (Sweden)

    Cazorla P

    2013-03-01

    Pilar Cazorla, Jun Zhao, Mary Mackle, Armin Szegedi (Merck, Rahway, NJ, USA). Background: An exploratory post hoc analysis was conducted to evaluate the potential differential effects over time of asenapine and olanzapine compared with placebo on the eleven individual items comprising the Young Mania Rating Scale (YMRS) in patients with manic or mixed episodes in bipolar I disorder. Methods: Data were pooled from two 3-week randomized, controlled trials in which the eleven individual items comprising the YMRS were measured over 21 days. An analysis of covariance model adjusted by baseline value was used to test for differences in changes from baseline in YMRS scores between groups. Results: Each of the eleven individual YMRS item scores was significantly reduced compared with placebo at day 21. After 2 days of treatment, asenapine and olanzapine were superior to placebo for six of the YMRS items: disruptive/aggressive behavior, content, irritability, elevated mood, sleep, and speech. Conclusion: Reduction in manic symptoms over 21 days was associated with a broad-based improvement across all symptom domains with no subset of symptoms predominating. Keywords: asenapine, Young Mania Rating Scale, bipolar disorder, YMRS, antipsychotic, olanzapine

  13. Differential Item Functioning Analysis Using a Mixture 3-Parameter Logistic Model with a Covariate on the TIMSS 2007 Mathematics Test

    Science.gov (United States)

    Choi, Youn-Jeng; Alexeev, Natalia; Cohen, Allan S.

    2015-01-01

    The purpose of this study was to explore what may be contributing to differences in performance in mathematics on the Trends in International Mathematics and Science Study 2007. This was done by using a mixture item response theory modeling approach to first detect latent classes in the data and then to examine differences in performance on items…

  14. Understanding and quantifying cognitive complexity level in mathematical problem solving items

    Directory of Open Access Journals (Sweden)

    SUSAN E. EMBRETSON

    2008-09-01

    The linear logistic test model (LLTM; Fischer, 1973) has been applied to a wide variety of new tests. When the LLTM application involves item complexity variables that are both theoretically interesting and empirically supported, several advantages can result. These advantages include elaborating construct validity at the item level, defining variables for test design, predicting parameters of new items, item banking by sources of complexity and providing a basis for item design and item generation. However, despite the many advantages of applying LLTM to test items, it has been applied less often to understand the sources of complexity for large-scale operational test items. Instead, previously calibrated item parameters are modeled using regression techniques because raw item response data often cannot be made available. In the current study, both LLTM and regression modeling are applied to mathematical problem solving items from a widely used test. The findings from the two methods are compared and contrasted for their implications for continued development of ability and achievement tests based on mathematical problem solving items.

  15. The social and community opportunities profile social inclusion measure: Structural equivalence and differential item functioning in community mental health residents in Hong Kong and the United Kingdom.

    Science.gov (United States)

    Huxley, Peter John; Chan, Kara; Chiu, Marcus; Ma, Yanni; Gaze, Sarah; Evans, Sherrill

    2016-03-01

    China's future major health problem will be the management of chronic diseases - of which mental health is a major one. An instrument is needed to measure mental health inclusion outcomes for mental health services in Hong Kong and mainland China as they strive to promote a more inclusive society for their citizens and particular disadvantaged groups. To report on the analysis of structural equivalence and item differentiation in two mentally unhealthy and one healthy sample in the United Kingdom and Hong Kong. The mental health sample in Hong Kong was made up of non-governmental organisation (NGO) referrals meeting the selection/exclusion criteria (being well enough to be interviewed, having a formal psychiatric diagnosis and living in the community). A similar sample in the United Kingdom meeting the same selection criteria was obtained from a community mental health organisation, equivalent to the NGOs in Hong Kong. Exploratory factor analysis and logistic regression were conducted. The single-variable, self-rated 'overall social inclusion' differs significantly between all of the samples, in the way we would expect from previous research, with the healthy population feeling more included than the serious mental illness (SMI) groups. In the exploratory factor analysis, the first two factors explain between a third and half of the variance, and the single variable which enters into all the analyses in the first factor is having friends to visit the home. All the regression models were significant; however, in the Hong Kong sample, only one-fifth of the total variance is explained. The structural findings imply that the social and community opportunities profile-Chinese version (SCOPE-C) gives similar results when applied to another culture. As only one-fifth of the variance of 'overall inclusion' was explained in the Hong Kong sample, it may be that the instrument needs to be refined using different or additional items within the structural domains of inclusion.

  16. 大型教育調查研究中的差別試題功能:次級分析中的核心概念及建模方法 Differential Item Functioning Analyses in Large-Scale Educational Surveys: Key Concepts and Modeling Approaches for Secondary Analysts

    Directory of Open Access Journals (Sweden)

    朱小姝 Xiao-Shu Zhu

    2011-03-01

    Many educational surveys employ a multi-stage sampling design for students, which makes use of stratification and/or clustering of population units, as well as a complex booklet design for items from an item pool. In these surveys, the reliable detection of item bias or differential item functioning (DIF) across student groups is a key component for ensuring fair representations of different student groups. In this paper, we describe several modeling approaches that can be useful for detecting DIF in educational surveys. We illustrate the key ideas by investigating the performance of six hierarchical generalized linear models (HGLMs) using a small simulation study and by applying them to real data from the Trends in Mathematics and Science Study (TIMSS), where we use them to investigate potential uniform gender DIF.

  17. marker development for two novel rice genes showing differential ...

    Indian Academy of Sciences (India)

    2014-08-19

    Aug 19, 2014 ... School of Crop Improvement, College of PostGraduate Studies, Central Agricultural University, ... from the root transcriptome data for tolerance to low P. .... Values show a representative result of three independent experiments ...

  18. An experimental-differential investigation of cognitive complexity

    Directory of Open Access Journals (Sweden)

    2009-12-01

    Cognitive complexity as defined by differential and experimental traditions was explored to investigate the theoretical advantage and utility of relational complexity (RC) theory as a common framework for studying fluid cognitive functions. RC theory provides a domain general account of processing demand as a function of task complexity. In total, 142 participants completed two tasks in which RC was manipulated, and two tasks entailing manipulations of complexity derived from the differential psychology literature. A series of analyses indicated that, as expected, task manipulations influenced item difficulty. However, comparable changes in a psychometric index of complexity were not consistently observed. Active maintenance of information across multiple steps of the problem solving process, which entails strategic coordination of storage and processing that cannot be modelled under the RC framework, was found to be an important component of cognitive complexity.

  19. Calibration of context-specific survey items to assess youth physical activity behaviour.

    Science.gov (United States)

    Saint-Maurice, Pedro F; Welk, Gregory J; Bartee, R Todd; Heelan, Kate

    2017-05-01

    This study tests calibration models to re-scale context-specific physical activity (PA) items to accelerometer-derived PA. A total of 195 children in grades 4-12 wore an Actigraph monitor and completed the Physical Activity Questionnaire (PAQ) one week later. The relative time spent in moderate-to-vigorous PA (MVPA%) obtained from the Actigraph at recess, PE, lunch, after-school, evening and weekend was matched with a respective item score obtained from the PAQ. Item scores from 145 participants were calibrated against objective MVPA% using multiple linear regression with age and sex as additional predictors. Predicted minutes of MVPA for school, out-of-school and total week were tested in the remaining sample (n = 50) using equivalence testing. The results showed that PAQ β-weights ranged from 0.06 (lunch) to 4.94 (PE) MVPA% (P PAQ and accelerometer MVPA at school and out-of-school ranged from -15.6 to +3.8 min and the PAQ was within 10-15% of accelerometer measured activity. This study demonstrated that context-specific items can be calibrated to predict minutes of MVPA in groups of youth during in- and out-of-school periods.

  20. Negative affect impairs associative memory but not item memory.

    OpenAIRE

    Bisby, J. A.; Burgess, N.

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 ...

  1. Hazardous metals in yellow items used in RCAs

    International Nuclear Information System (INIS)

    Brown, K.F.; Rankin, W.N.

    1992-01-01

    Yellow items used in Radiologically Controlled Areas (RCAs) that could contain hazardous metals were identified. X-ray fluorescence analyses indicated that thirty of the fifty-two items do contain hazardous metals. It is important to minimize the hazardous metals put into the wastes. The authors recommend that the specifications for all yellow items stocked in Stores be changed to specify that they contain no hazardous metals.

  2. Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André

    2016-01-01

    Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

  3. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    Science.gov (United States)

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  4. Method using a density field for locating related items for data mining

    Science.gov (United States)

    Wylie, Brian N.

    2002-01-01

    A method for locating related items in a geometric space transforms relationships among items to geometric locations. The method locates items in the geometric space so that the distance between items corresponds to the degree of relatedness. The method facilitates communication of the structure of the relationships among the items. The method makes use of numeric values as a measure of similarity between each pairing of items. The items are given initial coordinates in the space. An energy is then determined for each item from the item's distance and similarity to other items, and from the density of items assigned coordinates near the item. The distance and similarity component can act to draw items with high similarities close together, while the density component can act to force all items apart. If a terminal condition is not yet reached, then new coordinates can be determined for one or more items, and the energy determination repeated. The iteration can terminate, for example, when the total energy reaches a threshold, when each item's energy is below a threshold, after a certain amount of time or iterations.
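The iteration described in this record can be sketched roughly as follows. The record does not give the actual energy function, so the one used here (similarity-weighted squared distance plus a count of neighbours within a radius) is a hypothetical stand-in, as are all function names, parameters, and the random-proposal update rule. Termination is by a fixed iteration count only, which is one of the options the abstract lists.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def item_energy(
    i: int,
    coords: List[Point],
    sim: List[List[float]],
    radius: float = 1.0,
    density_weight: float = 1.0,
) -> float:
    """Hypothetical energy: an attraction term plus a local-density term."""
    xi, yi = coords[i]
    attract, crowd = 0.0, 0
    for j, (xj, yj) in enumerate(coords):
        if j == i:
            continue
        d2 = (xi - xj) ** 2 + (yi - yj) ** 2
        attract += sim[i][j] * d2  # similar items far apart cost energy
        if d2 < radius ** 2:
            crowd += 1  # every nearby item adds to the density penalty
    return attract + density_weight * crowd


def layout(
    sim: List[List[float]], iterations: int = 200, step: float = 0.5, seed: int = 0
) -> List[Point]:
    """Assign initial coordinates, then keep random moves that lower an item's energy."""
    rng = random.Random(seed)
    n = len(sim)
    coords = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    for _ in range(iterations):
        for i in range(n):
            old = coords[i]
            before = item_energy(i, coords, sim)
            coords[i] = (
                old[0] + rng.uniform(-step, step),
                old[1] + rng.uniform(-step, step),
            )
            if item_energy(i, coords, sim) >= before:
                coords[i] = old  # reject moves that do not reduce the energy
    return coords


if __name__ == "__main__":
    # Hypothetical similarity matrix: items 0 and 1 are related, item 2 is not.
    similarity = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(layout(similarity))
```

The attraction term pulls items with high similarity together while the density term pushes crowded items apart, mirroring the two components named in the abstract.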

  5. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...

  6. Using Item Response Theory to Develop a 60-Item Representation of the NEO PI-R Using the International Personality Item Pool: Development of the IPIP-NEO-60.

    Science.gov (United States)

    Maples-Keller, Jessica L; Williamson, Rachel L; Sleep, Chelsea E; Carter, Nathan T; Campbell, W Keith; Miller, Joshua D

    2017-10-31

    Given advantages of freely available and modifiable measures, an increase in the use of measures developed from the International Personality Item Pool (IPIP), including the 300-item representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a) has occurred. The focus of this study was to use item response theory to develop a 60-item, IPIP-based measure of the Five-Factor Model (FFM) that provides equal representation of the FFM facets and to test the reliability and convergent and criterion validity of this measure compared to the NEO Five Factor Inventory (NEO-FFI). In an undergraduate sample (n = 359), scores from the NEO-FFI and IPIP-NEO-60 demonstrated good reliability and convergent validity with the NEO PI-R and IPIP-NEO-300. Additionally, across criterion variables in the undergraduate sample as well as a community-based sample (n = 757), the NEO-FFI and IPIP-NEO-60 demonstrated similar nomological networks across a wide range of external variables (r_ICC = .96). Finally, as expected, in an MTurk sample the IPIP-NEO-60 demonstrated advantages over the Big Five Inventory-2 (Soto & John, 2017; n = 342) with regard to the Agreeableness domain content. The results suggest strong reliability and validity of the IPIP-NEO-60 scores.

  7. Recommendations to improve the positive and negative syndrome scale (PANSS) based on item response theory.

    Science.gov (United States)

    Levine, Stephen Z; Rabinowitz, Jonathan; Rizopoulos, Dimitris

    2011-08-15

    The adequacy of the Positive and Negative Syndrome Scale (PANSS) items in measuring symptom severity in schizophrenia was examined using Item Response Theory (IRT). Baseline PANSS assessments were analyzed from two multi-center clinical trials of antipsychotic medication in chronic schizophrenia (n=1872). Generally, the results showed that the PANSS (a) item ratings discriminated symptom severity best for the negative symptoms; (b) has an excess of "Severe" and "Extremely severe" rating options; and (c) assessments are more reliable at medium than very low or high levels of symptom severity. Analysis also showed that the detection of statistically and non-statistically significant differences in treatment were highly similar for the original and IRT-modified PANSS. In clinical trials of chronic schizophrenia, the PANSS appears to require the following modifications: fewer rating options, adjustment of 'Lack of judgment and insight', and improved severe symptom assessment. 2011 Elsevier Ltd. All rights reserved.

  8. Transcranial direct current stimulation over the parietal cortex alters bias in item and source memory tasks.

    Science.gov (United States)

    Pergolizzi, Denise; Chua, Elizabeth F

    2016-10-01

    Neuroimaging data have shown that activity in the lateral posterior parietal cortex (PPC) correlates with item recognition and source recollection, but there is considerable debate about its specific contributions. Performance on both item and source memory tasks was compared between participants who were given bilateral transcranial direct current stimulation (tDCS) over the parietal cortex and those given prefrontal or sham tDCS. The parietal tDCS group, but not the prefrontal group, showed decreased false recognition, and less bias in item and source discrimination tasks compared to sham stimulation. These results are consistent with a causal role of the PPC in item and source memory retrieval, likely based on attentional and decision-making biases. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. 16 CFR 304.6 - Marking requirements for imitation numismatic items.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 1 2010-01-01 2010-01-01 false Marking requirements for imitation... for imitation numismatic items. (a) An imitation numismatic item which is manufactured in the United... the item. (3) An imitation numismatic item of incusable material shall be incused with the word “COPY...

  10. Psychometric Evaluation of Chinese-Language 44-Item and 10-Item Big Five Personality Inventories, Including Correlations with Chronotype, Mindfulness and Mind Wandering.

    Science.gov (United States)

    Carciofo, Richard; Yang, Jiaoyan; Song, Nan; Du, Feng; Zhang, Kan

    2016-01-01

    The 44-item and 10-item Big Five Inventory (BFI) personality scales are widely used, but there is a lack of psychometric data for Chinese versions. Eight surveys (total N = 2,496, aged 18-82), assessed a Chinese-language BFI-44 and/or an independently translated Chinese-language BFI-10. Most BFI-44 items loaded strongly or predominantly on the expected dimension, and values of Cronbach's alpha ranged .698-.807. Test-retest coefficients ranged .694-.770 (BFI-44), and .515-.873 (BFI-10). The BFI-44 and BFI-10 showed good convergent and discriminant correlations, and expected associations with gender (females higher for agreeableness and neuroticism), and age (older age associated with more conscientiousness and agreeableness, and also less neuroticism and openness). Additionally, predicted correlations were found with chronotype (morningness positive with conscientiousness), mindfulness (negative with neuroticism, positive with conscientiousness), and mind wandering/daydreaming frequency (negative with conscientiousness, positive with neuroticism). Exploratory analysis found that the Self-discipline facet of conscientiousness positively correlated with morningness and mindfulness, and negatively correlated with mind wandering/daydreaming frequency. Furthermore, Self-discipline was found to be a mediator in the relationships between chronotype and mindfulness, and chronotype and mind wandering/daydreaming frequency. Overall, the results support the utility of the BFI-44 and BFI-10 for Chinese-language big five personality research.
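
    For readers unfamiliar with the reliability statistic reported above, Cronbach's alpha can be computed as in the short sketch below. The function name cronbach_alpha and the toy scores are hypothetical and are not data from the study; the formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: four respondents, three items on a 1-5 scale.
toy_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(toy_scores)
```

    Items that rise and fall together across respondents push alpha toward 1; unrelated items pull it toward 0.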

  11. Equivalence and standard scores of the Hurlbert Index of Sexual Assertiveness across Spanish men and women

    Directory of Open Access Journals (Sweden)

    Pablo Santos-Iglesias

    2014-01-01

    The purpose of the present study was to analyze the measurement invariance and differential item functioning of the Spanish version of the Hurlbert Index of Sexual Assertiveness across gender. The sample was composed of 1,600 women and 1,598 men from Spain, with ages ranging from 18 to 84 years old. The Hurlbert Index of Sexual Assertiveness only showed weak invariance for men and women. The differential item functioning analysis showed that only item 2 ("I feel that I am shy when it comes to sex") showed moderate uniform differential item functioning. More specifically, women tended to respond "Always" to this item more frequently than did men. Results strongly suggested eliminating item 2, resulting in a final version with 18 items clustered into two dimensions with good reliability values for men and women. Standard scores for both Initiation and No Shyness/Refusal reflected traditional sexual scripts for men and women.

  12. Collaborative Filtering Based on Sequential Extraction of User-Item Clusters

    Science.gov (United States)

    Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

    Collaborative filtering is a computational realization of “word-of-mouth” in a network community, in which the items preferred by “neighbors” are recommended. This paper proposes a new item-selection model for extracting user-item clusters from rectangular relation matrices, in which mutual relations between users and items are denoted in an alternative process of “liking or not”. A technique for sequential co-cluster extraction from rectangular relational data is given by combining the structural balancing-based user-item clustering method with a sequential fuzzy cluster extraction approach. Then, the technique is applied to the collaborative filtering problem, in which some items may be shared by several user clusters.

  13. Feline bone marrow-derived mesenchymal stromal cells (MSCs) show similar phenotype and functions with regards to neuronal differentiation as human MSCs.

    Science.gov (United States)

    Munoz, Jessian L; Greco, Steven J; Patel, Shyam A; Sherman, Lauren S; Bhatt, Suresh; Bhatt, Rekha S; Shrensel, Jeffrey A; Guan, Yan-Zhong; Xie, Guiqin; Ye, Jiang-Hong; Rameshwar, Pranela; Siegel, Allan

    2012-09-01

    Mesenchymal stromal cells (MSCs) show promise for treatment of a variety of neurological and other disorders. The cat has a high degree of linkage with the human genome and has been used as a model for analysis of neurological disorders such as stroke, Alzheimer's disease and motor disorders. The present study was designed to characterize bone marrow-derived MSCs from cats and to investigate the capacity to generate functional peptidergic neurons. MSCs were expanded with cells from the femurs of cats and then characterized by phenotype and function. Phenotypically, feline and human MSCs shared surface markers, and lacked hematopoietic markers, with similar morphology. As compared to a subset of human MSCs, feline MSCs showed no evidence of the major histocompatibility class II. Since the literature suggested Stro-1 as an indicator of pluripotency, we compared early- and late-passage feline MSCs and found its expression in >90% of the cells. However, the early passage cells showed two distinct populations of Stro-1-expressing cells. At passage 5, the MSCs were more homogeneous with regards to Stro-1 expression. The passage 5 MSCs differentiated to osteogenic and adipogenic cells, and generated neurons with electrophysiological properties. This correlated with the expression of mature neuronal markers with concomitant decrease in stem cell-associated genes. At day 12 of induction, the cells were positive for MAP2, Neuronal Nuclei, tubulin βIII, Tau and synaptophysin. This correlated with electrophysiological maturity as presented by excitatory postsynaptic potentials (EPSPs). The findings indicate that the cat may constitute a promising biomedical model for evaluation of novel therapies such as stem cell therapy in such neurological disorders as Alzheimer's disease and stroke. Copyright © 2012 International Society of Differentiation. Published by Elsevier B.V. All rights reserved.

  14. Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

    Science.gov (United States)

    Marcoulides, Katerina M.

    2018-01-01

    This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods was examined. Overall results showed that…
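
    As background for this record, the two-parameter logistic (2PL) model gives the probability of a correct response as a logistic function of the gap between a person's ability and the item's difficulty, scaled by the item's discrimination. The sketch below only evaluates that response function; it does not reproduce the study's Bayesian estimation, and the function name p_correct and the example parameter values are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL response function: P(correct) = 1 / (1 + exp(-a * (theta - b))).

    theta: person ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item with discrimination 1.2 and difficulty 0.5.
for theta in (-1.0, 0.5, 2.0):
    print(theta, round(p_correct(theta, a=1.2, b=0.5), 3))
```

    When ability equals difficulty the probability is exactly 0.5, and a larger discrimination makes the curve steeper around that point.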

  15. Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment

    Directory of Open Access Journals (Sweden)

    KLAUS D. KUBINGER

    2007-12-01

    Multiple choice response formats are problematical as an item is often scored as solved simply because the test-taker is a lucky guesser. Instead of applying pertinent IRT models which take guessing effects into account, a pragmatic approach of re-conceptualizing multiple choice response formats to reduce the chance of lucky guessing is considered. This paper compares the free response format with two different multiple choice formats. A common multiple choice format with a single correct response option and five distractors (“1 of 6”) is used, as well as a multiple choice format with five response options, of which any number of the five is correct and the item is only scored as mastered if all the correct response options and none of the wrong ones are marked (“x of 5”). An experiment was designed, using pairs of items with exactly the same content but different response formats. 173 test-takers were randomly assigned to two test booklets of 150 items altogether. Rasch model analyses adduced a fitting item pool, after the deletion of 39 items. The resulting item difficulty parameters were used for the comparison of the different formats. The multiple choice format “1 of 6” differs significantly from “x of 5”, with a relative effect of 1.63, while the multiple choice format “x of 5” does not significantly differ from the free response format. Therefore, the lower degree of difficulty of items with the “1 of 6” multiple choice format is an indicator of relevant guessing effects. In contrast the “x of 5” multiple choice format can be seen as an appropriate substitute for free response format.
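
    A back-of-the-envelope calculation shows why the two formats differ so much in how exposed they are to lucky guessing. The figures below are not reported in the abstract; they assume purely blind guessing, with every answer pattern equally likely, and the function names are hypothetical.

```python
from fractions import Fraction


def guess_one_of_n(n_options):
    """Chance of blindly picking the single correct option out of n."""
    return Fraction(1, n_options)


def guess_x_of_n(n_options):
    """Chance of blindly hitting the exact marking pattern.

    Assumes each option is independently marked or left blank, giving
    2**n equally likely patterns, only one of which is scored as solved.
    """
    return Fraction(1, 2 ** n_options)


p_1_of_6 = guess_one_of_n(6)   # 1/6
p_x_of_5 = guess_x_of_n(5)     # 1/32
print(p_1_of_6, p_x_of_5, p_1_of_6 / p_x_of_5)
```

    Under these assumptions a blind guess succeeds about five times more often in the “1 of 6” format than in the “x of 5” format, which is consistent with the direction of the difficulty difference the abstract reports.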

  16. The measurement of cyberbullying: dimensional structure and relative item severity and discrimination.

    Science.gov (United States)

    Menesini, Ersilia; Nocentini, Annalaura; Calussi, Pamela

    2011-05-01

    In relation to a sample of 1,092 Italian adolescents (50.9% females), the present study aims to: (a) analyze the most parsimonious structure of the cyberbullying and cybervictimization construct in male and female Italian adolescents through confirmatory factor analysis; and (b) analyze the severity and the discrimination parameters of each act using the item response theory. Results showed that the structure of the cyberbullying scale for perpetrated and received behaviors in both genders could best be represented by a monodimensional model where each item lies on a continuum of severity of aggressive acts. For both genders, the less severe acts are silent/prank calls and insults on instant messaging, and the most severe acts are unpleasant pictures/photos on Web sites, phone pictures/photos/videos of intimate scenes, and phone pictures/photos/videos of violent scenes. The items nasty text messages, nasty or rude e-mails, insults on Web sites, insults in chatrooms, and insults on blogs range from moderate to high levels of severity. Regarding the discrimination level of the acts, several items emerged as good indicators at various levels of cyberbullying and cybervictimization severity, with the exception of silent/prank calls. Furthermore, gender specificities underlined that the visual items can be considered good indicators of severe cyberbullies and cybervictims only in males. This information can help in understanding better the nature of the phenomenon, its severity in a given population, and to plan more specific prevention and intervention strategies.

  17. Fostering a student's skill for analyzing test items through an authentic task

    Science.gov (United States)

    Setiawan, Beni; Sabtiawan, Wahyu Budi

    2017-08-01

    Analyzing test items is a skill that prospective teachers must master in order to determine the quality of the test questions they have written. The main aim of this research was to describe the effectiveness of an authentic task in fostering students' skill in analyzing test items, covering validity, reliability, item discrimination index, level of difficulty, and distractor functioning. The participants were students of the science education study program, science and mathematics faculty, Universitas Negeri Surabaya, enrolled in the assessment course. The research used a one-group posttest design. As the treatment, the students were given an authentic task in which they developed test items and then analyzed them like a professional assessor using Microsoft Excel and Anates Software. The data were analyzed descriptively: the students' skill data were presented and then related to theories or previous empirical studies. The research showed that the task helped the students acquire the skills. Thirty-one students got a perfect score for the analysis, five students achieved 97% mastery, two students had 92% mastery, and another two students got 89% and 79% mastery. The finding implies that students who are given authentic tasks that require them to perform like a professional are more likely to achieve professional skills by the end of learning.

  18. Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.

    Science.gov (United States)

    Commons, C., Ed.; Martin, P., Ed.

    Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…

  19. Varying levels of difficulty index of skills-test items randomly selected by examinees on the Korean emergency medical technician licensing examination.

    Science.gov (United States)

    Koh, Bongyeun; Hong, Sunggi; Kim, Soon-Sim; Hyun, Jin-Sook; Baek, Milye; Moon, Jundong; Kwon, Hayran; Kim, Gyoungyong; Min, Seonggi; Kang, Gu-Hyun

    2016-01-01

    The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination.

  20. Varying levels of difficulty index of skills-test items randomly selected by examinees on the Korean emergency medical technician licensing examination

    Directory of Open Access Journals (Sweden)

    Bongyeun Koh

    2016-01-01

    Purpose: The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. Methods: The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. Results: In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). Conclusion: In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination.