WorldWideScience

Sample records for cross-language differential item

  1. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  2. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  3. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.

  4. Verification of Differential Item Functioning (DIF) Status of West ...

    African Journals Online (AJOL)

    This study investigated test item bias and Differential Item Functioning (DIF) of West African ... items in chemistry function differentially with respect to gender and location. In Aba education zone of Abia, 50 secondary schools were purposively ...

  5. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    Science.gov (United States)

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  6. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…
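
    The traditional Mantel-Haenszel analysis mentioned in this record can be sketched as follows. This is a minimal illustration, not the procedure used in any of the listed studies: the function name `mantel_haenszel_dif`, the group labels "ref" and "focal", and the use of a total score as the matching variable are all assumptions made for the example. It pools 2x2 tables across score strata into a common odds ratio and converts it to the ETS delta scale (-2.35 times the natural log of the odds ratio).

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(responses, groups, totals):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    responses: 0/1 scores on the studied item
    groups:    "ref" or "focal" for each examinee (hypothetical labels)
    totals:    matching score (e.g. total test score) used to form strata
    Returns (alpha_mh, delta_mh), with delta on the ETS delta scale.
    """
    # Per stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for y, g, t in zip(responses, groups, totals):
        idx = (0 if g == "ref" else 2) + (0 if y == 1 else 1)
        strata[t][idx] += 1

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        num += a * d / n  # ref correct * focal incorrect
        den += b * c / n  # ref incorrect * focal correct

    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined: a pooled cell product is zero")

    alpha = num / den
    delta = -2.35 * math.log(alpha)  # negative values favor the reference group
    return alpha, delta
```

    An odds ratio near 1 (delta near 0) suggests no uniform DIF after matching; the statistic does not by itself detect nonuniform DIF.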

  7. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  8. Effect of Differential Item Functioning on Test Equating

    Science.gov (United States)

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  9. A scale purification procedure for evaluation of differential item functioning

    NARCIS (Netherlands)

    Khalid, Muhammad Naveed; Glas, Cornelis A.W.

    2014-01-01

    Item bias or differential item functioning (DIF) has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test

  10. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1996-01-01

    In this paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or C. R. Rao's efficient score test. The test is presented in the framework of a number of item response theory (IRT) models such as the Rasch model, the one-parameter logistic model, the

  11. 17 CFR 260.7a-16 - Inclusion of items, differentiation between items and answers, omission of instructions.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 3 2010-04-01 2010-04-01 false Inclusion of items, differentiation between items and answers, omission of instructions. 260.7a-16 Section 260.7a-16 Commodity and... INDENTURE ACT OF 1939 Formal Requirements § 260.7a-16 Inclusion of items, differentiation between items and...

  12. Differential item functioning magnitude and impact measures from item response theory models.

    Science.gov (United States)

    Kleinman, Marjorie; Teresi, Jeanne A

    2016-01-01

    Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages are described that provide magnitude and impact measures, and new software is presented that computes all of the available statistics conveniently in one program with explanations of their relationships to one another.

  13. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (… German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    Four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implies a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. Graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30 % of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic … items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p … Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  15. Detection of Uniform and Nonuniform Differential Item Functioning by Item-Focused Trees

    Science.gov (United States)

    Berger, Moritz; Tutz, Gerhard

    2016-01-01

    Detection of differential item functioning (DIF) by use of the logistic modeling approach has a long tradition. One big advantage of the approach is that it can be used to investigate nonuniform (NUDIF) as well as uniform DIF (UDIF). The classical approach allows one to detect DIF by distinguishing between multiple groups. We propose an…

  16. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1998-01-01

    Abstract: In the present paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or Rao’s efficient score test. The test is presented in the framework of a number of IRT models such as the Rasch model, the OPLM, the 2-parameter logistic model, the

  17. Differential item functioning of the UWES-17 in South Africa

    Directory of Open Access Journals (Sweden)

    Leanne Goliath-Yarde

    2011-11-01

    Research purpose: This study assesses the Differential Item Functioning (DIF) of the Utrecht Work Engagement Scale (UWES-17) for different South African cultural groups in a South African company. Motivation for the study: Organisations are using the UWES-17 more and more in South Africa to assess work engagement. Therefore, research evidence from psychologists or assessment practitioners on its DIF across different cultural groups is necessary. Research design, approach and method: The researchers conducted a Secondary Data Analysis (SDA) on the UWES-17 sample (n = 2429) that they obtained from a cross-sectional survey undertaken in a South African Information and Communication Technology (ICT) sector company (n = 24 134). Quantitative item data on the UWES-17 scale enabled the authors to address the research question. Main findings: The researchers found uniform and/or non-uniform DIF on five of the vigour items, four of the dedication items and two of the absorption items. This also showed possible Differential Test Functioning (DTF) on the vigour and dedication dimensions. Practical/managerial implications: Based on the DIF, the researchers suggested that organisations should not use the UWES-17 comparatively for different cultural groups or employment decisions in South Africa. Contribution/value add: The study provides evidence on DIF and possible DTF for the UWES-17. However, it also raises questions about possible interaction effects that need further investigation.

  18. Assessing Differential Item Functioning on the Test of Relational Reasoning

    Directory of Open Access Journals (Sweden)

    Denis Dumas

    2018-03-01

    The test of relational reasoning (TORR) is designed to assess the ability to identify complex patterns within visuospatial stimuli. The TORR is designed for use in school and university settings, and therefore, its measurement invariance across diverse groups is critical. In this investigation, a large sample, representative of a major university on key demographic variables, was collected, and the resulting data were analyzed using a multi-group, multidimensional item-response theory model-comparison procedure. No significant differential item functioning was found on any of the TORR items across any of the demographic groups of interest. This finding is interpreted as evidence of the cultural fairness of the TORR, and potential test-development choices that may have contributed to that cultural fairness are discussed.

  19. Differential Weighting of Items to Improve University Admission Test Validity

    Directory of Open Access Journals (Sweden)

    Eduardo Backhoff Escudero

    2001-05-01

    This paper gives an evaluation of different ways to increase university admission test criterion-related validity, by differentially weighting test items. We compared four methods of weighting multiple-choice items of the Basic Skills and Knowledge Examination (EXHCOBA): (1) punishing incorrect responses by a constant factor, (2) weighting incorrect responses, considering the levels of error, (3) weighting correct responses, considering the item’s difficulty, based on the Classic Measurement Theory, and (4) weighting correct responses, considering the item’s difficulty, based on the Item Response Theory. Results show that none of these methods increased the instrument’s predictive validity, although they did improve its concurrent validity. It was concluded that it is appropriate to score the test by simply adding up correct responses.

  20. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  1. Racial differences in hypertension knowledge: effects of differential item functioning.

    Science.gov (United States)

    Ayotte, Brian J; Trivedi, Ranak; Bosworth, Hayden B

    2009-01-01

    Health-related knowledge is an important component in the self-management of chronic illnesses. The objective of this study was to more accurately assess racial differences in hypertension knowledge by using a latent variable modeling approach that controlled for sociodemographic factors and accounted for measurement issues in the assessment of hypertension knowledge. Cross-sectional data from 1,177 participants (45% African American; 35% female) were analyzed using a multiple indicator multiple causes (MIMIC) modeling approach. Available sociodemographic data included race, education, sex, financial status, and age. All participants completed six items on a hypertension knowledge questionnaire. Overall, the final model suggested that females, Whites, and patients with at least a high school diploma had higher latent knowledge scores than males, African Americans, and patients with less than a high school diploma, respectively. The model also detected differential item functioning (DIF) based on race for two of the items. Specifically, the error rate for African Americans was lower than would be expected given the lower level of latent knowledge on the items, on the questions related to: (a) the association between high blood pressure and kidney disease, and (b) the increased risk African Americans have for developing hypertension. Not accounting for DIF caused the difference between Whites and African Americans to be underestimated. These results are discussed in the context of the need for careful measurement of health-related constructs, and how measurement-related issues can result in an inaccurate estimation of racial differences in hypertension knowledge.

  2. Mixture Item Response Theory-MIMIC Model: Simultaneous Estimation of Differential Item Functioning for Manifest Groups and Latent Classes

    Science.gov (United States)

    Bilir, Mustafa Kuzey

    2009-01-01

    This study uses a new psychometric model (mixture item response theory-MIMIC model) that simultaneously estimates differential item functioning (DIF) across manifest groups and latent classes. Current DIF detection methods investigate DIF from only one side, either across manifest groups (e.g., gender, ethnicity, etc.), or across latent classes…

  3. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect

    DEFF Research Database (Denmark)

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-01-01

    AIMS: To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). METHODS: We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a ...

  4. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning

    DEFF Research Database (Denmark)

    Watt, Torquil; Grønvold, Mogens; Hegedüs, Laszlo

    2014-01-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis.

  5. Measurement equivalence and differential item functioning in family psychology.

    Science.gov (United States)

    Bingenheimer, Jeffrey B; Raudenbush, Stephen W; Leventhal, Tama; Brooks-Gunn, Jeanne

    2005-09-01

    Several hypotheses in family psychology involve comparisons of sociocultural groups. Yet the potential for cross-cultural inequivalence in widely used psychological measurement instruments threatens the validity of inferences about group differences. Methods for dealing with these issues have been developed via the framework of item response theory. These methods deal with an important type of measurement inequivalence, called differential item functioning (DIF). The authors introduce DIF analytic methods, linking them to a well-established framework for conceptualizing cross-cultural measurement equivalence in psychology (C.H. Hui and H.C. Triandis, 1985). They illustrate the use of DIF methods using data from the Project on Human Development in Chicago Neighborhoods (PHDCN). Focusing on the Caregiver Warmth and Environmental Organization scales from the PHDCN's adaptation of the Home Observation for Measurement of the Environment Inventory, the authors obtain results that exemplify the range of outcomes that may result when these methods are applied to psychological measurement instruments. (c) 2005 APA, all rights reserved

  6. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  7. Detecting Differential Item Functioning and Differential Step Functioning due to Differences that "Should" Matter

    Directory of Open Access Journals (Sweden)

    Tess Miller

    2010-07-01

    This study illustrates the use of differential item functioning (DIF) and differential step functioning (DSF) analyses to detect differences in item difficulty that are related to experiences of examinees, such as their teachers' instructional practices, that are relevant to the knowledge, skill, or ability the test is intended to measure. This analysis is in contrast to the typical use of DIF or DSF to detect differences related to characteristics of examinees, such as gender, language, or cultural knowledge, that should be irrelevant. Using data from two forms of Ontario's Grade 9 Assessment of Mathematics, analyses were performed comparing groups of students defined by their teachers' instructional practices. All constructed-response items were tested for DIF using the Mantel Chi-Square, standardized Liu Agresti cumulative common log-odds ratio, and standardized Cox's noncentrality parameter. Items exhibiting moderate to large DIF were subsequently tested for DSF. In contrast to typical DIF or DSF analyses, which inform item development, these analyses have the potential to inform instructional practice.

  8. Comparing Two Versions of the MEOCS Using Differential Item Functioning

    National Research Council Canada - National Science Library

    Truhon, Stephen

    2003-01-01

    ...) from item response theory (IRT). DIF was found for the majority of the 40 items examined, although in many cases the DIF indicated improvements in the revised items. Implications for these scales and for the use of IRT with the MEOCS are discussed.

  9. Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire

    NARCIS (Netherlands)

    Petersen, Morten Aa; Groenvold, Mogens; Bjorner, Jakob B.; Aaronson, Neil; Conroy, Thierry; Cull, Ann; Fayers, Peter; Hjermstad, Marianne; Sprangers, Mirjam; Sullivan, Marianne

    2003-01-01

    In cross-national comparisons based on questionnaires, accurate translations are necessary to obtain valid results. Differential item functioning (DIF) analysis can be used to test whether translations of items in multi-item scales are equivalent to the original. In data from 10,815 respondents

  10. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Science.gov (United States)

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…

  11. The emotional memory effect: differential processing or item distinctiveness?

    Science.gov (United States)

    Schmidt, Stephen R; Saari, Bonnie

    2007-12-01

    A color-naming task was followed by incidental free recall to investigate how emotional words affect attention and memory. We compared taboo, nonthreatening negative-affect, and neutral words across three experiments. As compared with neutral words, taboo words led to longer color-naming times and better memory in both within- and between-subjects designs. Color naming of negative-emotion nontaboo words was slower than color naming of neutral words only during block presentation and at relatively short interstimulus intervals (ISIs). The nontaboo emotion words were remembered better than neutral words following blocked and random presentation and at both long and short ISIs, but only in mixed-list designs. Our results support multifactor theories of the effects of emotion on attention and memory. As compared with neutral words, threatening stimuli received increased attention, poststimulus elaboration, and benefit from item distinctiveness, whereas nonthreatening emotional stimuli benefited only from increased item distinctiveness.

  12. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  13. Few items in the thyroid-related quality of life instrument ThyPRO exhibited differential item functioning.

    Science.gov (United States)

    Watt, Torquil; Groenvold, Mogens; Hegedüs, Laszlo; Bonnema, Steen Joop; Rasmussen, Åse Krogh; Feldt-Rasmussen, Ulla; Bjorner, Jakob Bue

    2014-02-01

    To evaluate the extent of differential item functioning (DIF) within the thyroid-specific quality of life patient-reported outcome measure, ThyPRO, according to sex, age, education and thyroid diagnosis. A total of 838 patients with benign thyroid diseases completed the ThyPRO questionnaire (84 five-point items, 13 scales). Uniform and nonuniform DIF were investigated using ordinal logistic regression, testing for both statistical significance and magnitude (ΔR² > 0.02). Scale level was estimated by the sum score, after purification. Twenty instances of DIF in 17 of the 84 items were found. Eight according to diagnosis, where the goiter scale was the one most affected, possibly due to differing perceptions in patients with auto-immune thyroid diseases compared to patients with simple goiter. Eight DIFs according to age were found, of which 5 were in positively worded items, which younger patients were more likely to endorse; one according to gender: women were more likely to report crying, and three according to educational level. The vast majority of DIF had only minor influence on the scale scores (0.1-2.3 points on the 0-100 scales), but two DIF corresponded to a difference of 4.6 and 9.8, respectively. Ordinal logistic regression identified DIF in 17 of 84 items. The potential impact of this on the present scales was low, but items displaying DIF could be avoided when developing abbreviated scales, where the potential impact of DIF (due to fewer items) will be larger.

  14. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    Joseph P. Eimicke

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  15. Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

    Science.gov (United States)

    Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

    2015-07-01

    The Geriatric Anxiety Scale (GAS; Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.

  16. Differential Item Functioning Analysis of the Mental, Emotional, and Bodily Toughness Inventory

    Science.gov (United States)

    Gao, Yong; Mack, Mick G.; Ragan, Moira A.; Ragan, Brian

    2012-01-01

    In this study the authors used differential item functioning analysis to examine if there were items in the Mental, Emotional, and Bodily Toughness Inventory functioning differently across gender and athletic membership. A total of 444 male (56.3%) and female (43.7%) participants (30.9% athletes and 69.1% non-athletes) responded to the Mental,…

  17. Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

    Science.gov (United States)

    Suh, Youngsuk

    2016-01-01

    This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

  18. Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

    Science.gov (United States)

    Lee, Yi-Hsuan; Zhang, Jinming

    2017-01-01

    Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

  19. Does Gender-Specific Differential Item Functioning Affect the Structure in Vocational Interest Inventories?

    Science.gov (United States)

    Beinicke, Andrea; Pässler, Katja; Hell, Benedikt

    2014-01-01

    The study investigates consequences of eliminating items showing gender-specific differential item functioning (DIF) on the psychometric structure of a standard RIASEC interest inventory. Holland's hexagonal model was tested for structural invariance using a confirmatory methodological approach (confirmatory factor analysis and randomization…

  20. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    Science.gov (United States)

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  1. Parent Ratings of ADHD Symptoms: Generalized Partial Credit Model Analysis of Differential Item Functioning across Gender

    Science.gov (United States)

    Gomez, Rapson

    2012-01-01

    Objective: Generalized partial credit model, which is based on item response theory (IRT), was used to test differential item functioning (DIF) for the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.), inattention (IA), and hyperactivity/impulsivity (HI) symptoms across boys and girls. Method: To accomplish this, parents completed…

  2. Testing for Nonuniform Differential Item Functioning with Multiple Indicator Multiple Cause Models

    Science.gov (United States)

    Woods, Carol M.; Grimm, Kevin J.

    2011-01-01

    In extant literature, multiple indicator multiple cause (MIMIC) models have been presented for identifying items that display uniform differential item functioning (DIF) only, not nonuniform DIF. This article addresses, for apparently the first time, the use of MIMIC models for testing both uniform and nonuniform DIF with categorical indicators. A…

  3. Stepwise Analysis of Differential Item Functioning Based on Multiple-Group Partial Credit Model.

    Science.gov (United States)

    Muraki, Eiji

    1999-01-01

    Extended an Item Response Theory (IRT) method for detection of differential item functioning to the partial credit model and applied the method to simulated data using a stepwise procedure. Then applied the stepwise DIF analysis based on the multiple-group partial credit model to writing trend data from the National Assessment of Educational…

  4. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Czech Academy of Sciences Publication Activity Database

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    Vol. 54, No. 4 (2017), pp. 498-517 ISSN 0022-0655 R&D Projects: GA ČR GJ15-15856Y Institutional support: RVO:67985807 Keywords: differential item functioning * non-linear regression * logistic regression * item response theory Subject RIV: AM - Education OECD field: Statistics and probability Impact factor: 0.979, year: 2016

  5. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...

  6. On-Demand Associative Cross-Language Information Retrieval

    Science.gov (United States)

    Geraldo, André Pinto; Moreira, Viviane P.; Gonçalves, Marcos A.

    This paper proposes the use of algorithms for mining association rules as an approach for Cross-Language Information Retrieval. These algorithms have been widely used to analyse market basket data. The idea is to map the problem of finding associations between sales items to the problem of finding term translations over a parallel corpus. The proposal was validated by means of experiments using queries in two distinct languages: Portuguese and Finnish to retrieve documents in English. The results show that the performance of our proposed approach is comparable to the performance of the monolingual baseline and to query translation via machine translation, even though these systems employ more complex Natural Language Processing techniques. The combination between machine translation and our approach yielded the best results, even outperforming the monolingual baseline.
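
    The mapping described above, from market-basket associations to term translations, can be illustrated with a minimal sketch. It assumes each aligned sentence pair in a parallel corpus acts as one "transaction" and keeps rules source_term -> target_term by support and confidence. The function name, the thresholds and the toy Portuguese-English corpus are hypothetical and are not taken from the paper, whose actual mining algorithm may differ.

```python
from collections import Counter
from itertools import product


def translation_rules(pairs, min_support=2, min_confidence=0.6):
    """Mine one-to-one rules source_term -> target_term from aligned sentences.

    pairs: iterable of (source_sentence, target_sentence) strings.
    Support is the number of sentence pairs containing both terms;
    confidence is support divided by the count of the source term.
    Thresholds are illustrative assumptions.
    """
    source_count = Counter()
    pair_count = Counter()
    for src, tgt in pairs:
        src_terms, tgt_terms = set(src.split()), set(tgt.split())
        source_count.update(src_terms)
        pair_count.update(product(src_terms, tgt_terms))

    rules = {}
    for (s, t), n in pair_count.items():
        confidence = n / source_count[s]
        if n >= min_support and confidence >= min_confidence:
            rules.setdefault(s, []).append((t, confidence))
    # Best candidate translation first; ties broken alphabetically.
    for s in rules:
        rules[s].sort(key=lambda item: (-item[1], item[0]))
    return rules


# Hypothetical toy corpus (not from the paper).
corpus = [
    ("casa branca", "white house"),
    ("casa grande", "big house"),
    ("carro branco", "white car"),
    ("carro grande", "big car"),
]
rules = translation_rules(corpus)
```

    In this toy corpus only terms seen at least twice with a consistent counterpart produce a rule; rarer terms such as "branca" yield none, which is the usual trade-off of a support threshold.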

  7. Exploring differential item functioning in the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)

    Directory of Open Access Journals (Sweden)

    Pollard Beth

    2012-12-01

    Background: The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another? The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods: The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results: After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions: Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or especially age effects are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the

  8. Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

    Science.gov (United States)

    Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

    2015-12-01

    The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity. © The Author(s) 2014.

  9. Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

    Science.gov (United States)

    LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

    2015-04-01

    Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
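
    For readers unfamiliar with the Mantel-Haenszel procedure mentioned above, the following is a minimal sketch for a single dichotomous item. It assumes respondents have already been matched into strata on a total score, and it reports the common odds ratio, the continuity-corrected chi-square and the ETS delta transformation (delta = -2.35 ln alpha), a widely used convention for grading DIF severity that is not stated in the abstract. The function name and input layout are hypothetical; the study's own items and classification rules may differ.

```python
import math


def mantel_haenszel(strata):
    """Mantel-Haenszel DIF statistics for one dichotomous item.

    strata: list of (ref_yes, ref_no, focal_yes, focal_no) counts,
    one tuple per matched score level (layout is an assumption).
    Returns (common odds ratio, chi-square with continuity correction,
    ETS delta). Raises ZeroDivisionError on degenerate tables.
    """
    num = den = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_focal = a + b, c + d
        m_yes, m_no = a + c, b + d
        observed += a
        expected += n_ref * m_yes / n
        variance += n_ref * n_focal * m_yes * m_no / (n * n * (n - 1))
    alpha = num / den
    chi2 = (abs(observed - expected) - 0.5) ** 2 / variance
    delta = -2.35 * math.log(alpha)
    return alpha, chi2, delta


# Hypothetical counts: the reference group endorses the item more often
# than the focal group at the same matched score level.
alpha, chi2, delta = mantel_haenszel([(20, 10, 10, 20)])
```

    An odds ratio near 1 (delta near 0) indicates little DIF; the sign of delta shows which group the item favours.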

  10. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    Science.gov (United States)

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  11. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application.

  12. Why Consumers Misattribute Sponsorships to Non-Sponsor Brands: Differential Roles of Item and Relational Communications.

    Science.gov (United States)

    Weeks, Clinton S; Humphreys, Michael S; Cornwell, T Bettina

    2018-02-01

    Brands engaged in sponsorship of events commonly have objectives that depend on consumer memory for the sponsor-event relationship (e.g., sponsorship awareness). Consumers however, often misattribute sponsorships to nonsponsor competitor brands, indicating erroneous memory for these relationships. The current research uses an item and relational memory framework to reveal sponsor brands may inadvertently foster this misattribution when they communicate relational linkages to events. Effects can be explained via differential roles of communicating item information (information that supports processing item distinctiveness) versus relational information (information that supports processing relationships among items) in contributing to memory outcomes. Experiment 1 uses event-cued brand recall to show that correct memory retrieval is best supported by communicating relational information when sponsorship relationships are not obvious (low congruence). In contrast, correct retrieval is best supported by communicating item information when relationships are obvious (high congruence). Experiment 2 uses brand-cued event recall to show that, against conventional marketing recommendations, relational information increases misattribution, whereas item information guards against misattribution. Results suggest sponsor brands must distinguish between item and relational communications to enhance correct retrieval and limit misattribution. Methodologically, the work shows that choice of cueing direction is critical in differentially revealing patterns of correct and incorrect retrieval with pair relationships. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  13. Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

    Science.gov (United States)

    Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

    2017-06-01

    About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluate dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluate differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.

  14. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    Science.gov (United States)

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
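
    The two-step decision rule described above can be sketched as follows, assuming the nested ordinal logistic models have already been fitted elsewhere (for example, one model with ability, group and their interaction; one with ability and group; one with ability only). The function name, the 0.05 significance level and the 10% change threshold are illustrative assumptions; the abstract itself stresses that the specific criteria deserve scrutiny.

```python
def classify_dif(beta_ability_without_group, beta_ability_with_group,
                 interaction_p_value, alpha=0.05, change_threshold=0.10):
    """Apply a two-step DIF rule to already-fitted model results.

    Step 1: a significant ability-by-group interaction is taken as
            evidence of nonuniform DIF.
    Step 2: otherwise, a marked relative change in the ability
            coefficient when the group term is added is taken as
            evidence of uniform DIF.
    alpha and change_threshold are hypothetical defaults.
    """
    if interaction_p_value < alpha:
        return "nonuniform DIF"
    relative_change = (abs(beta_ability_with_group - beta_ability_without_group)
                       / abs(beta_ability_without_group))
    if relative_change > change_threshold:
        return "uniform DIF"
    return "no DIF flagged"


# Hypothetical coefficients for three items.
print(classify_dif(1.00, 1.05, interaction_p_value=0.50))  # small change
print(classify_dif(1.00, 1.20, interaction_p_value=0.50))  # marked change
print(classify_dif(1.00, 1.00, interaction_p_value=0.01))  # interaction
```

    Keeping the thresholds as explicit parameters makes it easy to check how sensitive the flagged item set is to the chosen criteria.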

  15. An Examination of Differential Item Functioning on the Vanderbilt Assessment of Leadership in Education

    Science.gov (United States)

    Polikoff, Morgan S.; May, Henry; Porter, Andrew C.; Elliott, Stephen N.; Goldring, Ellen; Murphy, Joseph

    2009-01-01

    The Vanderbilt Assessment of Leadership in Education is a 360-degree assessment of the effectiveness of principals' learning-centered leadership behaviors. In this report, we present results from a differential item functioning (DIF) study of the assessment. Using data from a national field trial, we searched for evidence of DIF on school level,…

  16. A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Bottomley, Andrew

    2009-01-01

    Differential item functioning (DIF) analyses are increasingly used to evaluate health-related quality of life (HRQoL) instruments, which often include relatively short subscales. Computer simulations were used to explore how various factors including scale length affect analysis of DIF by ordinal logistic regression.

  17. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  18. The MIMIC Method with Scale Purification for Detecting Differential Item Functioning

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin; Yang, Chih-Chien

    2009-01-01

    This study implements a scale purification procedure onto the standard MIMIC method for differential item functioning (DIF) detection and assesses its performance through a series of simulations. It is found that the MIMIC method with scale purification (denoted as M-SP) outperforms the standard MIMIC method (denoted as M-ST) in controlling…

  19. Overcoming the effects of differential skewness of test items in scale construction

    Directory of Open Access Journals (Sweden)

    Johann M. Schepers

    2004-10-01

    The principal objective of the study was to develop a procedure for overcoming the effects of differential skewness of test items in scale construction. It was shown that the degree of skewness of test items places an upper limit on the correlations between the items, regardless of the contents of the items. If the items are ordered in terms of skewness, the resulting intercorrelation matrix forms a simplex or a pseudo-simplex. Factoring such a matrix results in a multiplicity of factors, most of which are artifacts. A procedure for overcoming this problem was demonstrated with items from the Locus of Control Inventory (Schepers, 1995). The analysis was based on a sample of 1662 first-year university students.
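
    The ceiling that differing skewness places on inter-item correlations can be illustrated in the simplest case of two dichotomous items, where skewness is determined by the endorsement proportion. This is a sketch under that simplifying assumption only; the inventory analysed in the study may use other response formats, and the function name is hypothetical. For proportions p1 <= p2, the largest attainable phi coefficient is sqrt(p1(1 - p2) / (p2(1 - p1))).

```python
import math


def max_phi(p1, p2):
    """Largest attainable phi coefficient between two dichotomous items.

    p1, p2: endorsement proportions, each strictly between 0 and 1.
    The bound equals 1 only when the two proportions are equal.
    """
    lo, hi = sorted((p1, p2))
    return math.sqrt(lo * (1 - hi) / (hi * (1 - lo)))


# Items with equal skewness can correlate perfectly; oppositely skewed
# items cannot, whatever their content.
print(max_phi(0.5, 0.5))
print(max_phi(0.1, 0.9))
```

    Because the bound shrinks as the proportions diverge, items ordered by skewness correlate most strongly with their neighbours, which is consistent with the simplex pattern described above.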

  20. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    Science.gov (United States)

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.

  1. Differential Item Functioning (DIF) among Spanish-Speaking English Language Learners (ELLs) in State Science Tests

    Science.gov (United States)

    Ilich, Maria O.

    Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets (groups of items referring to a scientific scenario to measure knowledge of different science content or skills). Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the remainder comprised native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.

  2. A more general model for testing measurement invariance and differential item functioning.

    Science.gov (United States)

    Bauer, Daniel J

    2017-09-01

    The evaluation of measurement invariance is an important step in establishing the validity and comparability of measurements across individuals. Most commonly, measurement invariance has been examined using one of two primary latent variable modeling approaches: the multiple groups model or the multiple-indicator multiple-cause (MIMIC) model. Both approaches offer opportunities to detect differential item functioning within multi-item scales, and thereby to test measurement invariance, but both approaches also have significant limitations. The multiple groups model allows one to examine the invariance of all model parameters but only across levels of a single categorical individual difference variable (e.g., ethnicity). In contrast, the MIMIC model permits both categorical and continuous individual difference variables (e.g., sex and age) but permits only a subset of the model parameters to vary as a function of these characteristics. The current article argues that moderated nonlinear factor analysis (MNLFA) constitutes an alternative, more flexible model for evaluating measurement invariance and differential item functioning. We show that the MNLFA subsumes and combines the strengths of the multiple group and MIMIC models, allowing for a full and simultaneous assessment of measurement invariance and differential item functioning across multiple categorical and/or continuous individual difference variables. The relationships between the MNLFA model and the multiple groups and MIMIC models are shown mathematically and via an empirical demonstration. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  3. Determination of a Differential Item Functioning Procedure Using the Hierarchical Generalized Linear Model

    Directory of Open Access Journals (Sweden)

    Tülin Acar

    2012-01-01

    The aim of this research is to compare the results of differential item functioning (DIF) detection with the hierarchical generalized linear model (HGLM) technique against the results of DIF detection with the logistic regression (LR) and item response theory–likelihood ratio (IRT-LR) techniques. To this end, the HGLM, LR, and IRT-LR techniques were first used to determine whether items in the Turkish, Social Sciences, and Science subtests of the Secondary School Institutions Examination showed DIF according to students' socioeconomic status (SES). When the correlations among the techniques in terms of flagging DIF items were inspected, a significant correlation was found between the results of the IRT-LR and LR techniques in all subtests; the correlation between the HGLM and IRT-LR techniques was significant only in the Science subtest. DIF analyses can also be carried out on test items with other techniques that were outside the scope of this research, and results obtained with DIF techniques in different sample sizes can be compared.

  4. Psychometric evaluation of Persian Nomophobia Questionnaire: Differential item functioning and measurement invariance across gender.

    Science.gov (United States)

    Lin, Chung-Ying; Griffiths, Mark D; Pakpour, Amir H

    2018-03-01

    Background and aims: Research examining problematic mobile phone use has increased markedly over the past 5 years and has been related to "no mobile phone phobia" (so-called nomophobia). The 20-item Nomophobia Questionnaire (NMP-Q) is the only instrument that assesses nomophobia with an underlying theoretical structure and robust psychometric testing. This study aimed to confirm the construct validity of the Persian NMP-Q using Rasch and confirmatory factor analysis (CFA) models. Methods: After ensuring the linguistic validity, Rasch models were used to examine the unidimensionality of each Persian NMP-Q factor among 3,216 Iranian adolescents and CFAs were used to confirm its four-factor structure. Differential item functioning (DIF) and multigroup CFA were used to examine whether males and females interpreted the NMP-Q similarly, including item content and NMP-Q structure. Results: Each factor was unidimensional according to the Rasch findings, and the four-factor structure was supported by CFA. Two items did not quite fit the Rasch models (Item 14: "I would be nervous because I could not know if someone had tried to get a hold of me;" Item 9: "If I could not check my smartphone for a while, I would feel a desire to check it"). No DIF items were found across gender and measurement invariance was supported in multigroup CFA across gender. Conclusions: Due to the satisfactory psychometric properties, it is concluded that the Persian NMP-Q can be used to assess nomophobia among adolescents. Moreover, NMP-Q users may compare its scores between genders in the knowledge that there are no score differences contributed by different understandings of NMP-Q items.

  5. Differential Item Functioning of Pathological Gambling Criteria: An Examination of Gender, Race/Ethnicity, and Age

    OpenAIRE

    Sacco, Paul; Torres, Luis R.; Cunningham-Williams, Renee M.; Woods, Carol; Unick, G. Jay

    2011-01-01

    This study tested for the presence of differential item functioning (DIF) in DSM-IV Pathological Gambling Disorder (PGD) criteria based on gender, race/ethnicity and age. Using a nationally representative sample of adults from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), indicating current gambling (n = 10,899), Multiple Indicator-Multiple Cause (MIMIC) models tested for DIF, controlling for income, education, and marital status. Compared to the reference grou...

  6. Checking Equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments

    Czech Academy of Sciences Publication Activity Database

    Martinková, Patrícia; Drabinová, Adéla; Liaw, Y.L.; Sanders, E.A.; McFarland, J.L.; Price, R.M.

    2017-01-01

    Vol. 16, No. 2 (2017), article No. rm2. ISSN 1931-7913 R&D Projects: GA ČR GJ15-15856Y Grant - others: NSF (US) DUE-1043443 Institutional support: RVO:67985807 Keywords: differential item functioning * fairness * conceptual assessments * concept inventory * undergraduate education * bias Subject RIV: AM - Education OECD field: Education, special (to gifted persons, those with learning disabilities) Impact factor: 3.930, year: 2016

  7. Use of multilevel logistic regression to identify the causes of differential item functioning.

    Science.gov (United States)

    Balluerka, Nekane; Gorostiaga, Arantxa; Gómez-Benito, Juana; Hidalgo, María Dolores

    2010-11-01

    Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.

  8. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

    Directory of Open Access Journals (Sweden)

    Knol Dirk L

    2011-09-01

    Abstract Background For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.
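
    The likelihood-ratio test for DIF mentioned in this abstract can be illustrated with a minimal sketch. It assumes two already-fitted models for one item: a constrained model whose item parameters are equal across groups and a free model in which they may differ. The function names and any log-likelihood values passed in are hypothetical and are not taken from the study; the chi-square tail is given only for 1 or 2 degrees of freedom, where a closed form exists.

```python
from math import erfc, exp, sqrt


def likelihood_ratio_statistic(loglik_constrained, loglik_free):
    """G^2 = -2 * (logL_constrained - logL_free), floored at zero.

    The constrained model holds the studied item's parameters equal across
    groups; the free model lets them differ. Both values are assumed inputs.
    """
    g2 = -2.0 * (loglik_constrained - loglik_free)
    return max(g2, 0.0)


def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2 only."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if df == 1:
        return erfc(sqrt(x / 2.0))
    if df == 2:
        return exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")


def lr_dif_p_value(loglik_constrained, loglik_free, df):
    """p-value for the DIF likelihood-ratio test; df = number of freed parameters."""
    return chi2_sf(likelihood_ratio_statistic(loglik_constrained, loglik_free), df)
```

    A small p-value only says that the free model fits better; as the abstract notes, the magnitude of DIF still has to be judged separately, for example from differences in expected scores.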

  9. Identifying Country-Specific Cultures of Physics Education: A differential item functioning approach

    Science.gov (United States)

    Mesic, Vanes

    2012-11-01

    In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide specific diagnostic information which is necessary for systematic comparisons and improvements of educational systems. Useful information could be obtained by exploring the differences in national profiles of student achievement between low-achieving and high-achieving countries. In this study, we aimed to identify the relative weaknesses and strengths of eighth graders' physics achievement in Bosnia and Herzegovina in comparison to the achievement of their peers from Slovenia. For this purpose, we ran a secondary analysis of Trends in International Mathematics and Science Study (TIMSS) 2007 data. The student sample consisted of 4,220 students from Bosnia and Herzegovina and 4,043 students from Slovenia. After analysing the cognitive demands of TIMSS 2007 physics items, the correspondent differential item functioning (DIF)/differential group functioning contrasts were estimated. Approximately 40% of items exhibited large DIF contrasts, indicating significant differences between cultures of physics education in Bosnia and Herzegovina and Slovenia. The relative strength of students from Bosnia and Herzegovina showed to be mainly associated with the topic area 'Electricity and magnetism'. Classes of items which required the knowledge of experimental method, counterintuitive thinking, proportional reasoning and/or the use of complex knowledge structures proved to be differentially easier for students from Slovenia. In the light of the presented results, the common practice of ranking countries with respect to universally established cognitive categories seems to be potentially misleading.

  10. Statistical and extra-statistical considerations in differential item functioning analyses

    Directory of Open Access Journals (Sweden)

    G. K. Huysamen

    2004-10-01

    This article briefly describes the main procedures for performing differential item functioning (DIF) analyses and points out some of the statistical and extra-statistical implications of these methods. Research findings on the sources of DIF, including those associated with translated tests, are reviewed. As DIF analyses are oblivious of correlations between a test and relevant criteria, the elimination of differentially functioning items does not necessarily improve predictive validity or reduce any predictive bias. The implications of the results of past DIF research for test development in the multilingual and multi-cultural South African society are considered.

  11. Examining Multiple Sources of Differential Item Functioning on the Clinician & Group CAHPS® Survey

    Science.gov (United States)

    Rodriguez, Hector P; Crane, Paul K

    2011-01-01

    Objective To evaluate psychometric properties of a widely used patient experience survey. Data Sources English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. Methods We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician–patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. Principal Findings The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. Conclusions The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified. PMID:22092021

  12. Identifying group-sensitive physical activities: a differential item functioning analysis of NHANES data.

    Science.gov (United States)

    Gao, Yong; Zhu, Weimo

    2011-05-01

    The purpose of this study was to identify subgroup-sensitive physical activities (PA) using differential item functioning (DIF) analysis. A sub-unweighted sample of 1857 (men=923 and women=934) from the 2003-2004 National Health and Nutrition Examination Survey PA questionnaire data was used for the analyses. Using the Mantel-Haenszel, the simultaneous item bias test, and the ANOVA DIF methods, 33 specific leisure-time moderate and/or vigorous PA (MVPA) items were analyzed for DIF across race/ethnicity, gender, education, income, and age groups. Many leisure-time MVPA items were identified as large DIF items. When participating in the same amount of leisure-time MVPA, non-Hispanic blacks were more likely to participate in basketball and dance activities than non-Hispanic whites (NHW); NHW were more likely to participate in golf and hiking than non-Hispanic blacks; Hispanics were more likely to participate in dancing, hiking, and soccer than NHW, whereas NHW were more likely to engage in bicycling, golf, swimming, and walking than Hispanics; women were more likely to participate in aerobics, dancing, stretching, and walking than men, whereas men were more likely to engage in basketball, fishing, golf, running, soccer, weightlifting, and hunting than women; educated persons were more likely to participate in jogging and treadmill exercise than less educated persons; persons with higher incomes were more likely to engage in golf than those with lower incomes; and adults (20-59 yr) were more likely to participate in basketball, dancing, jogging, running, and weightlifting than older adults (60+ yr), whereas older adults were more likely to participate in walking and golf than younger adults. DIF methods are able to identify subgroup-sensitive PA and thus provide useful information to help design group-sensitive, targeted interventions for disadvantaged PA subgroups. © 2011 by the American College of Sports Medicine
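
    As an illustration of the Mantel-Haenszel method named in this abstract, the sketch below computes the common odds ratio across strata of respondents matched on total score (here, on overall amount of activity). The counts are invented for illustration and are not NHANES data, and the function names are assumptions. The conversion to the ETS delta scale, -2.35 ln(alpha), is the conventional one.

```python
from math import log


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio over matched strata.

    Each stratum is a tuple (a, b, c, d):
      a = reference group, endorsed the item
      b = reference group, did not endorse
      c = focal group, endorsed the item
      d = focal group, did not endorse
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


def ets_delta(alpha_mh):
    """Express the common odds ratio on the ETS delta scale."""
    return -2.35 * log(alpha_mh)


# Hypothetical counts for three score strata (low, middle, high).
example_strata = [(30, 20, 15, 35), (40, 10, 25, 25), (45, 5, 35, 15)]
alpha = mantel_haenszel_odds_ratio(example_strata)
delta = ets_delta(alpha)
```

    An odds ratio near 1 (delta near 0) suggests that matched members of the two groups endorse the item at similar rates; values far from 1 flag the item for closer review.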

  13. Exploring differential item functioning (DIF) with the Rasch model: A comparison of gender differences on eighth-grade science items in the United States and Spain

    Science.gov (United States)

    Calvert, Tasha

    Despite the attention that has been given to gender and science, boys continue to outperform girls in science achievement, particularly by the end of secondary school. Because it is unclear whether gender differences have narrowed over time (Leder, 1992; Willingham & Cole, 1997), it is important to continue a line of inquiry into the nature of gender differences, specifically at the international level. The purpose of this study was to investigate gender differences in science achievement across two countries: United States and Spain. A secondary purpose was to demonstrate an alternative method for exploring gender differences based on the many-faceted Rasch model (1980). A secondary analysis of the data from the Third International Mathematics and Science Study (TIMSS) was used to examine the relationship between gender DIF (differential item functioning) and item characteristics (item type, content, and performance expectation) across both countries. Nationally representative samples of eighth grade students in the United States and Spain who participated in TIMSS were analyzed to answer the research questions in this study. In both countries, girls showed an advantage over boys on life science items and most extended response items, whereas boys, by and large, had an advantage on earth science, physics, and chemistry items. However, even within areas that favored boys, such as physics, there were items that were differentially easier for girls. In general, patterns in gender differences were similar across both countries although there were a few differences between the countries on individual items. It was concluded that simply looking at mean differences does not provide an adequate understanding of the nature of gender differences in science achievement.
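
    The idea of an item being "differentially easier" for one group can be sketched with the simple dichotomous Rasch model, which is a simplification of the many-faceted model used in this study. The function names and the difficulty values in the example are hypothetical; the DIF contrast here is taken as the difference between an item's difficulty estimated separately in each group, in logits.

```python
from math import exp


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))


def dif_contrast(difficulty_focal, difficulty_reference):
    """Difference in item difficulty between groups, in logits.

    A positive value means the item is harder for the focal group than for
    reference-group members of the same ability.
    """
    return difficulty_focal - difficulty_reference


# Hypothetical item: difficulty 0.0 logits for boys, 0.64 logits for girls.
contrast = dif_contrast(0.64, 0.0)
p_reference = rasch_probability(0.0, 0.0)   # student of average ability, reference group
p_focal = rasch_probability(0.0, 0.64)      # same ability, focal group
```

    Because both probabilities are computed at the same ability, any gap between them reflects the item rather than a mean difference between groups, which is the distinction the abstract draws in its conclusion.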

  14. The practical impact of differential item functioning analyses in a health-related quality of life instrument

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2009-01-01

    Differential item functioning (DIF) analyses are commonly used to evaluate health-related quality of life (HRQoL) instruments. There is, however, a lack of consensus as to how to assess the practical impact of statistically significant DIF results....

  15. Consolidation differentially modulates schema effects on memory for items and associations.

    Science.gov (United States)

    van Kesteren, Marlieke T R; Rijpkema, Mark; Ruiter, Dirk J; Fernández, Guillén

    2013-01-01

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated with more efficient encoding and consolidation mechanisms. However, this effect is not always consistently supported in the literature, with differential schema effects reported for different types of memory, different retrieval cues, and the possibility of time-dependent effects related to consolidation processes. To examine these effects more directly, we tested participants on two different types of memory (item recognition and associative memory) for newly encoded visuo-tactile associations at different study-test intervals, thus probing memory retrieval accuracy for schema-congruent and schema-incongruent items and associations at different time points (t = 0, t = 20, and t = 48 hours) after encoding. Results show that the schema effect on visual item recognition only arises after consolidation, while the schema effect on associative memory is already apparent immediately after encoding, persisting, but getting smaller over time. These findings give further insight into different factors influencing the schema effect on memory, and can inform future schema experiments by illustrating the value of considering effects of memory type and consolidation on schema-modulated retrieval.

  16. Consolidation differentially modulates schema effects on memory for items and associations.

    Directory of Open Access Journals (Sweden)

    Marlieke T R van Kesteren

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated with more efficient encoding and consolidation mechanisms. However, this effect is not always consistently supported in the literature, with differential schema effects reported for different types of memory, different retrieval cues, and the possibility of time-dependent effects related to consolidation processes. To examine these effects more directly, we tested participants on two different types of memory (item recognition and associative memory) for newly encoded visuo-tactile associations at different study-test intervals, thus probing memory retrieval accuracy for schema-congruent and schema-incongruent items and associations at different time points (t = 0, t = 20, and t = 48 hours) after encoding. Results show that the schema effect on visual item recognition only arises after consolidation, while the schema effect on associative memory is already apparent immediately after encoding, persisting, but getting smaller over time. These findings give further insight into different factors influencing the schema effect on memory, and can inform future schema experiments by illustrating the value of considering effects of memory type and consolidation on schema-modulated retrieval.

  17. The comparability of English, French and Dutch scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): an assessment of differential item functioning in patients with systemic sclerosis.

    Directory of Open Access Journals (Sweden)

    Linda Kwakkenbos

    The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics.

  18. The Comparability of English, French and Dutch Scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): An Assessment of Differential Item Functioning in Patients with Systemic Sclerosis

    Science.gov (United States)

    Kwakkenbos, Linda; Willems, Linda M.; Baron, Murray; Hudson, Marie; Cella, David; van den Ende, Cornelia H. M.; Thombs, Brett D.

    2014-01-01

    Objective The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. Methods The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. Results A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. Conclusions There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics. PMID:24638101

  19. Cross-Language Measurement Equivalence of the Center for Epidemiologic Studies Depression (CES-D) Scale in Systemic Sclerosis: A Comparison of Canadian and Dutch Patients

    Science.gov (United States)

    Kwakkenbos, Linda; Arthurs, Erin; van den Hoogen, Frank H. J.; Hudson, Marie; van Lankveld, Wim G. J. M.; Baron, Murray; van den Ende, Cornelia H. M.; Thombs, Brett D.

    2013-01-01

    Objectives Increasingly, medical research involves patients who complete outcomes in different languages. This occurs in countries with more than one common language, such as Canada (French/English) or the United States (Spanish/English), as well as in international multi-centre collaborations, which are utilized frequently in rare diseases such as systemic sclerosis (SSc). In order to pool or compare outcomes, instruments should be measurement equivalent (invariant) across cultural or linguistic groups. This study provides an example of how to assess cross-language measurement equivalence by comparing the Center for Epidemiologic Studies Depression (CES-D) scale between English-speaking Canadian and Dutch SSc patients. Methods The CES-D was completed by 922 English-speaking Canadian and 213 Dutch SSc patients. Confirmatory factor analysis (CFA) was used to assess the factor structure in both samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess the amount of differential item functioning (DIF). Results A two-factor model (positive and negative affect) showed excellent fit in both samples. Statistically significant, but small-magnitude, DIF was found for 3 of 20 items on the CES-D. The English-speaking Canadian sample endorsed more feeling-related symptoms, whereas the Dutch sample endorsed more somatic/retarded activity symptoms. The overall estimate in depression scores between English and Dutch was not influenced substantively by DIF. Conclusions CES-D scores from English-speaking Canadian and Dutch SSc patients can be compared and pooled without concern that measurement differences may substantively influence results. The importance of assessing cross-language measurement equivalence in rheumatology studies prior to pooling outcomes obtained in different languages should be emphasized. PMID:23326538

  20. Differential Item Functioning of the Psychological Domain of the Menopause Rating Scale

    Science.gov (United States)

    Portela-Buelvas, Katherin; Oviedo, Heidi C.; Herazo, Edwin; Campo-Arias, Adalberto

    2016-01-01

    Introduction. Quality of life could be quantified with the Menopause Rating Scale (MRS), which evaluates the severity of somatic, psychological, and urogenital symptoms in menopause. However, differential item functioning (DIF) analysis has not been applied previously. Objective. To establish the DIF of the psychological domain of the MRS in Colombian women. Methods. 4,009 women aged between 40 and 59 years, who participated in the CAVIMEC (Calidad de Vida en la Menopausia y Etnias Colombianas) project, were included. Average age was 49.0 ± 5.9 years. Women were classified as mestizo, Afro-Colombian, or indigenous. The results were presented as averages and standard deviation (X ± SD). A p value <0.001 was considered statistically significant. Results. In mestizo women, the highest X ± SD were obtained in physical and mental exhaustion (PME) (0.86 ± 0.93) and the lowest ones in anxiety (0.44 ± 0.79). In Afro-Colombian women, an average score of 0.99 ± 1.07 for PME and 0.63 ± 0.88 for anxiety was obtained. Indigenous women obtained an increased average score for PME (1.33 ± 0.93). The lowest score was evidenced in depressive mood (0.50 ± 0.81), which is different from other Colombian women (p < 0.001). Conclusions. The psychological items of the MRS show differential functioning according to the ethnic group, which may induce systematic error in the measurement of the construct. PMID:27847825

  1. Differential Item Functioning of the Psychological Domain of the Menopause Rating Scale.

    Science.gov (United States)

    Monterrosa-Castro, Alvaro; Portela-Buelvas, Katherin; Oviedo, Heidi C; Herazo, Edwin; Campo-Arias, Adalberto

    2016-01-01

    Introduction. Quality of life could be quantified with the Menopause Rating Scale (MRS), which evaluates the severity of somatic, psychological, and urogenital symptoms in menopause. However, differential item functioning (DIF) analysis has not been applied previously. Objective. To establish the DIF of the psychological domain of the MRS in Colombian women. Methods. 4,009 women aged between 40 and 59 years, who participated in the CAVIMEC (Calidad de Vida en la Menopausia y Etnias Colombianas) project, were included. Average age was 49.0 ± 5.9 years. Women were classified as mestizo, Afro-Colombian, or indigenous. The results were presented as averages and standard deviation (X ± SD). A p value <0.001 was considered statistically significant. Results. In mestizo women, the highest X ± SD were obtained in physical and mental exhaustion (PME) (0.86 ± 0.93) and the lowest ones in anxiety (0.44 ± 0.79). In Afro-Colombian women, an average score of 0.99 ± 1.07 for PME and 0.63 ± 0.88 for anxiety was obtained. Indigenous women obtained an increased average score for PME (1.33 ± 0.93). The lowest score was evidenced in depressive mood (0.50 ± 0.81), which is different from other Colombian women (p < 0.001). Conclusions. The psychological items of the MRS show differential functioning according to the ethnic group, which may induce systematic error in the measurement of the construct.

  2. Differential item functioning of pathological gambling criteria: an examination of gender, race/ethnicity, and age.

    Science.gov (United States)

    Sacco, Paul; Torres, Luis R; Cunningham-Williams, Renee M; Woods, Carol; Unick, G Jay

    2011-06-01

    This study tested for the presence of differential item functioning (DIF) in DSM-IV Pathological Gambling Disorder (PGD) criteria based on gender, race/ethnicity and age. Using a nationally representative sample of adults from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), indicating current gambling (n = 10,899), Multiple Indicator-Multiple Cause (MIMIC) models tested for DIF, controlling for income, education, and marital status. Compared to the reference groups (i.e., Male, Caucasian, and ages 25-59 years), women (OR = 0.62; P gambling to escape (Criterion 5) (OR = 2.22; P < .001) but young adults (OR = 0.62; P < .05) were less likely to endorse it. African Americans (OR = 2.50; P < .001) and Hispanics were more likely to endorse trying to cut back (Criterion 3) (OR = 2.01; P < .01). African Americans were more likely to endorse the suffering losses (OR = 2.27; P < .01) criterion. Young adults were more likely to endorse chasing losses (Criterion 9) (OR = 1.81; P < .01) while older adults were less likely to endorse this criterion (OR = 0.76; P < .05). Further research is needed to identify factors contributing to DIF, address criteria level bias, and examine differential test functioning.

  3. Cross-Language Support Mechanisms Significantly Aid Software Development

    DEFF Research Database (Denmark)

    Pfeiffer, Rolf-Helge; Wasowski, Andrzej

    2012-01-01

    Contemporary software systems combine many artifacts specified in various modeling and programming languages, both domain-specific and general-purpose. Since multi-language systems are so widespread, working on them calls for tools with cross-language support mechanisms such as (1) visualizatio...

  4. A Differential Item Functional Analysis by Age of Perceived Interpersonal Discrimination in a Multi-racial/ethnic Sample of Adults.

    Science.gov (United States)

    Owens, Sherry; Kristjansson, Alfgeir L; Hunte, Haslyn E R

    2015-11-05

    We investigated whether individual items on the nine item William's Perceived Everyday Discrimination Scale (EDS) functioned differently by age (ethnic group. Overall, Asian and Hispanic respondents reported less discrimination than Whites; on the other hand, African Americans and Black Caribbeans reported more discrimination than Whites. Regardless of race/ethnicity, the younger respondents (aged ethnicity, the results were mixed for 19 out of 45 tests of DIF (40%). No differences in item function were observed among Black Caribbeans. "Being called names or insulted" and others acting as "if they are afraid" of the respondents were the only two items that did not exhibit differential item functioning by age across all racial/ethnic groups. Overall, our findings suggest that the EDS scale should be used with caution in multi-age multi-racial/ethnic samples.

  5. Gender Invariance of the Gambling Behavior Scale for Adolescents (GBS-A): An Analysis of Differential Item Functioning Using Item Response Theory.

    Science.gov (United States)

    Donati, Maria Anna; Chiesi, Francesca; Izzo, Viola A; Primi, Caterina

    2017-01-01

    As there is a lack of evidence attesting to equivalent item functioning across genders for the instruments most often used to measure pathological gambling in adolescence, the present study aimed to test the gender invariance of the Gambling Behavior Scale for Adolescents (GBS-A), a new measurement tool to assess the severity of Gambling Disorder (GD) in adolescents. The equivalence of the items across genders was assessed by analyzing Differential Item Functioning within an Item Response Theory framework. The GBS-A was administered to 1,723 adolescents, and the graded response model was employed. The results attested the measurement equivalence of the GBS-A when administered to male and female adolescent gamblers. Overall, findings provided evidence that the GBS-A is an effective measurement tool of the severity of GD in male and female adolescents and that the scale was unbiased and able to reveal true gender differences. As such, the GBS-A can be profitably used in educational interventions and clinical treatments with young people.

  6. Symptom endorsement in men versus women with a diagnosis of depression: A differential item functioning approach.

    Science.gov (United States)

    Cavanagh, Anna; Wilson, Coralie J; Caputi, Peter; Kavanagh, David J

    2016-09-01

    There is some evidence that, in contrast to depressed women, depressed men tend to report alternative symptoms that are not listed as standard diagnostic criteria. This may possibly lead to an under- or misdiagnosis of depression in men. This study aims to clarify whether depressed men and women report different symptoms. This study used data from the 2007 Australian National Survey of Mental Health and Wellbeing that was collected using the World Health Organization's Composite International Diagnostic Interview. Participants with a diagnosis of a depressive disorder with 12-month symptoms (n = 663) were identified and included in this study. Differential item functioning (DIF) was used to test whether depressed men and women endorse different features associated with their condition. Gender-related DIF was present for three symptoms associated with depression. Depressed women were more likely to report 'appetite/weight disturbance', whereas depressed men were more likely to report 'alcohol misuse' and 'substance misuse'. While the results may reflect a greater risk of co-occurring alcohol and substance misuse in men, inclusion of these features in assessments may improve the detection of depression in men, especially if standard depressive symptoms are under-reported. © The Author(s) 2016.

  7. Differential item functional analysis on pedagogic and content knowledge (PCK) questionnaire for Indonesian teachers using RASCH model

    Science.gov (United States)

    Rahmani, B. D.

    2018-01-01

    The purpose of this paper is to evaluate Indonesian senior high school teachers' pedagogical content knowledge and their perception of curriculum change in West Java, Indonesia. The data used in this study were derived from a questionnaire survey conducted among teachers in Bandung, West Java. A total of 61 usable responses were collected. Differential Item Functioning (DIF) analysis was used to examine whether item responses differed by gender, educational background, or school location. The results showed no significant difference in item responses by gender or school location, but a difference by educational background. In conclusion, teachers' educational background influences their responses to the questionnaire. It is therefore suggested that future questionnaire items be constructed to accommodate differences among participants, particularly in educational background.

  8. Timing of translation in cross-language qualitative research.

    Science.gov (United States)

    Santos, Hudson P O; Black, Amanda M; Sandelowski, Margarete

    2015-01-01

    Although there is increased understanding of language barriers in cross-language studies, the point at which language transformation processes are applied in research is inconsistently reported, or treated as a minor issue. Differences in translation timeframes raise methodological issues related to the material to be translated, as well as for the process of data analysis and interpretation. In this article we address methodological issues related to the timing of translation from Portuguese to English in two international cross-language collaborative research studies involving researchers from Brazil, Canada, and the United States. One study entailed late-phase translation of a research report, whereas the other study involved early-phase translation of interview data. The timing of translation in interaction with the object of translation should be considered, in addition to the language, cultural, subject matter, and methodological competencies of research team members. © The Author(s) 2014.

  9. English-Chinese Cross-Language IR Using Bilingual Dictionaries

    Science.gov (United States)

    2006-01-01

    ... specialized dictionaries together contain about two million entries [6]. 4 Monolingual Experiment: The Chinese documents and the Chinese translations of ... monolingual performance. The main performance-limiting factor is the limited coverage of the dictionary used in query translation. Some of the key con... English-Chinese Cross-Language IR Using Bilingual Dictionaries. Aitao Chen, Hailing Jiang, and Fredric Gey, School of Information Management

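  10. Differential Item Functioning (DIF) Etnis pada Big Five Inventory (BFI) versi Adaptasi Fakultas Psikologi Universitas Sumatera Utara [Ethnic Differential Item Functioning (DIF) in the Big Five Inventory (BFI), Version Adapted by the Faculty of Psychology, Universitas Sumatera Utara]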

    OpenAIRE

    Manik, Hitler

    2014-01-01

    The Big Five Inventory (BFI) is one of the personality tests that has been adapted into Indonesian. Further research has been conducted to adapt the Indonesian Big Five Inventory. The purpose of this research is to check whether the BFI personality test is fair when applied to the Batak Toba and Javanese ethnic groups. An examination of the BFI's items is therefore needed. In psychology, especially in psychometrics, this is called Differential Item Functioning (DIF) analysis. The subjects in this research were 327 people aged around 18 to 40 ...

  11. Use of differential item functioning (DIF) analysis for bias analysis in test construction

    Directory of Open Access Journals (Sweden)

    Marié De Beer

    2004-10-01

    Abstract: Where differential item functioning (DIF) procedures for item analysis based on item response theory (IRT) are used during test construction, it is possible to plot item characteristic curves for the same item for different subgroups. These curves indicate how each item functions for the different subgroups at different ability levels. DIF is indicated by the area between the curves. DIF was used in the construction of the 'Learning Potential Computerised Adaptive Test (LPCAT)' to identify the items that showed bias with respect to gender, culture, language, or level of education. Items that exceeded a predetermined level of DIF were omitted from the final item bank, regardless of which subgroup was advantaged or disadvantaged. The process and results of the DIF analysis are discussed.
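
    The area-between-curves idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the LPCAT procedure: it assumes Rasch-type item characteristic curves, and the difficulty values, the integration range, and the threshold `DIF_THRESHOLD` are all hypothetical.

```python
import math


def rasch_icc(theta: float, difficulty: float) -> float:
    """Rasch item characteristic curve: P(correct | ability theta)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def unsigned_area_between_iccs(b_ref: float, b_focal: float,
                               lo: float = -10.0, hi: float = 10.0,
                               steps: int = 4000) -> float:
    """Trapezoidal estimate of the area between two subgroup ICCs on [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * width
        gap = abs(rasch_icc(theta, b_ref) - rasch_icc(theta, b_focal))
        # End points get half weight under the trapezoidal rule.
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gap
    return total * width


# Hypothetical difficulties: the item is 0.5 logits harder for the focal group.
area = unsigned_area_between_iccs(b_ref=0.0, b_focal=0.5)
DIF_THRESHOLD = 0.4  # illustrative cut-off, not the LPCAT criterion
flagged = area > DIF_THRESHOLD
```

    For two Rasch curves the area over the whole ability axis equals the absolute difference in difficulty, so the numerical estimate here should be close to 0.5. Curves with different slopes would need the same integration but no longer have that shortcut.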

  12. Cross-language and second language speech perception

    DEFF Research Database (Denmark)

    Bohn, Ocke-Schwen

    2017-01-01

    in cross-language and second language speech perception research: The mapping issue (the perceptual relationship of sounds of the native and the nonnative language in the mind of the native listener and the L2 learner), the perceptual and learning difficulty/ease issue (how this relationship may or may...... not cause perceptual and learning difficulty), and the plasticity issue (whether and how experience with the nonnative language affects the perceptual organization of speech sounds in the mind of L2 learners). One important general conclusion from this research is that perceptual learning is possible at all...

  13. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention, but very few studies have evaluated the effect of the hierarchical structure of data on DIF detection for polytomously scored items. Methods. Ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  14. Assessing the Straightforwardly-Worded Brief Fear of Negative Evaluation Scale for Differential Item Functioning Across Gender and Ethnicity.

    Science.gov (United States)

    Harpole, Jared K; Levinson, Cheri A; Woods, Carol M; Rodebaugh, Thomas L; Weeks, Justin W; Brown, Patrick J; Heimberg, Richard G; Menatti, Andrew R; Blanco, Carlos; Schneier, Franklin; Liebowitz, Michael

    2015-06-01

    The Brief Fear of Negative Evaluation Scale (BFNE; Leary, Personality and Social Psychology Bulletin, 9, 371-375, 1983) assesses fear and worry about receiving negative evaluation from others. Rodebaugh et al. (Psychological Assessment, 16, 169-181, 2004) found that the BFNE is composed of a reverse-worded factor (BFNE-R) and straightforwardly-worded factor (BFNE-S). Further, they found the BFNE-S to have better psychometric properties and provide more information than the BFNE-R. Currently there is a lack of research regarding the measurement invariance of the BFNE-S across gender and ethnicity with respect to item thresholds. The present study uses item response theory (IRT) to test the BFNE-S for differential item functioning (DIF) related to gender and ethnicity (White, Asian, and Black). Six data sets consisting of clinical, community, and undergraduate participants were utilized (N = 2,109). The factor structure of the BFNE-S was confirmed using categorical confirmatory factor analysis, IRT model assumptions were tested, and the BFNE-S was evaluated for DIF. Item nine demonstrated significant non-uniform DIF between White and Black participants. No other items showed significant uniform or non-uniform DIF across gender or ethnicity. Results suggest the BFNE-S can be used reliably with men and women and Asian and White participants. More research is needed to understand the implications of using the BFNE-S with Black participants.

  15. Disparities in Sense of Community: True Race Differences or Differential Item Functioning?

    Science.gov (United States)

    Coffman, Donna L.; BeLue, Rhonda

    2009-01-01

    The sense of community index (SCI) has been widely used to measure psychological sense of community (SOC). Furthermore, SOC has been found to differ among racial groups. Because different ethnic groups have different cultural and historical experiences that may lead to different interpretations of measurement items, it is important to know whether…

  16. An Anthropologist among the Psychometricians: Assessment Events, Ethnography, and Differential Item Functioning in the Mongolian Gobi

    Science.gov (United States)

    Maddox, Bryan; Zumbo, Bruno D.; Tay-Lim, Brenda; Qu, Demin

    2015-01-01

    This article explores the potential for ethnographic observations to inform the analysis of test item performance. In 2010, a standardized, large-scale adult literacy assessment took place in Mongolia as part of the United Nations Educational, Scientific and Cultural Organization Literacy Assessment and Monitoring Programme (LAMP). In a novel form…

  17. A symptom profile of depression among Asian Americans: is there evidence for differential item functioning of depressive symptoms?

    Science.gov (United States)

    Kalibatseva, Z; Leong, F T L; Ham, E H

    2014-09-01

    Theoretical and clinical publications suggest the existence of cultural differences in the expression and experience of depression. Measurement non-equivalence remains a potential methodological explanation for the lower prevalence of depression among Asian Americans compared to European Americans. This study compared DSM-IV depressive symptoms among Asian Americans and European Americans using secondary data analysis of the Collaborative Psychiatric Epidemiology Surveys (CPES). The Composite International Diagnostic Interview (CIDI) was used for the assessment of depressive symptoms. Of the entire sample, 310 Asian Americans and 1974 European Americans reported depressive symptoms and were included in the analyses. Measurement variance was examined with an item response theory differential item functioning (IRT DIF) analysis. χ2 analyses indicated that, compared to Asian Americans, European American participants more frequently endorsed affective symptoms such as 'feeling depressed', 'feeling discouraged' and 'cried more often'. The IRT analysis detected DIF for four out of the 15 depression symptom items. At equal levels of depression, Asian Americans endorsed feeling worthless and appetite changes more easily than European Americans, and European Americans endorsed feeling nervous and crying more often than Asian Americans. Asian Americans did not seem to over-report somatic symptoms; however, European Americans seemed to report more affective symptoms than Asian Americans. The results suggest that there was measurement variance in a few of the depression items.

  18. Differential items functioning to assess aggressiveness in college students / Funcionamento diferencial de itens para avaliar a agressividade de universitários

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    In this research, evidence of construct validity was sought by analyzing the differential functioning of items related to aggressiveness. The participants were 445 college students of both genders, attending courses in Engineering, Computing and Psychology. The aggressiveness scale, composed of 81 items, was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were studied by means of the Rasch model. Twenty-eight items showed differential item functioning: 15 were characterized as typical for females and 13 for males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of gender.

  19. Standard Errors for National Trends in International Large-Scale Assessments in the Case of Cross-National Differential Item Functioning

    Science.gov (United States)

    Sachse, Karoline A.; Haag, Nicole

    2017-01-01

    Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…

  20. Detection of Differential Item Functioning on the Kirton Adaption-Innovation Inventory Using Multiple-Group Mean and Covariance Structure Analyses.

    Science.gov (United States)

    Chan, David

    2000-01-01

    Demonstrates how the mean and covariance structure analysis model of D. Sorbom (1974) can be used to detect uniform and nonuniform differential item functioning (DIF) on polytomous ordered response items assumed to approximate a continuous scale. Uses results from 773 civil service employees administered the Kirton Adaption-Innovation Inventory…

  1. Sex Differential Item Functioning in the Inventory of Early Development III Social-Emotional Skills

    Science.gov (United States)

    Beaver, Jessica L.; French, Brian F.; Finch, W. Holmes; Ullrich-French, Sarah C.

    2014-01-01

    Social-emotional (SE) skills in the early developmental years of children influence outcomes in psychological, behavioral, and learning domains. The adult ratings of a child's SE skills can be influenced by sex stereotypes. These rating differences could lead to differential conclusions about developmental progress or risk. To ensure that…

  2. Cross-cultural and sex differences in the Emotional Skills and Competence Questionnaire scales: Challenges of differential item functioning analyses

    Directory of Open Access Journals (Sweden)

    Bo Molander

    2009-11-01

    University students in Croatia, Slovenia, and Sweden (N = 1129) were examined by means of the Emotional Skills and Competence Questionnaire (Takšić, 1998). Results showed a significant effect for the sex factor only on the total-score scale, women scoring higher than men, but significant effects were obtained for country, as well as for sex, on the Express and Label (EL) and Perceive and Understand (PU) subscales. Sweden showed higher scores than Croatia and Slovenia on the EL scale, and Slovenia showed higher scores than Croatia and Sweden on the PU scale. In subsequent analyses of differential item functioning (DIF), comparisons were carried out for pairs of countries. The analyses revealed that a large proportion of the items in the total-score scale were potentially biased, most so for the Croatian-Swedish comparison, less for the Slovenian-Swedish comparison, and least for the Croatian-Slovenian comparison. These findings cast doubt on the validity of mean score differences in comparisons of countries. However, DIF analyses of sex differences within each country show very few DIF items, indicating that the ESCQ instrument works well within each cultural/linguistic setting. Possible explanations of the findings are discussed, and improvements for future studies are suggested.

  3. Cross-language information retrieval using PARAFAC2.

    Energy Technology Data Exchange (ETDEWEB)

    Bader, Brett William; Chew, Peter; Abdelali, Ahmed (New Mexico State University, Las Cruces, NM); Kolda, Tamara Gibson

    2007-05-01

    A standard approach to cross-language information retrieval (CLIR) uses Latent Semantic Analysis (LSA) in conjunction with a multilingual parallel aligned corpus. This approach has been shown to be successful in identifying similar documents across languages - or more precisely, retrieving the most similar document in one language to a query in another language. However, the approach has severe drawbacks when applied to a related task, that of clustering documents 'language-independently', so that documents about similar topics end up closest to one another in the semantic space regardless of their language. The problem is that documents are generally more similar to other documents in the same language than they are to documents in a different language, but on the same topic. As a result, when using multilingual LSA, documents will in practice cluster by language, not by topic. We propose a novel application of PARAFAC2 (which is a variant of PARAFAC, a multi-way generalization of the singular value decomposition [SVD]) to overcome this problem. Instead of forming a single multilingual term-by-document matrix which, under LSA, is subjected to SVD, we form an irregular three-way array, each slice of which is a separate term-by-document matrix for a single language in the parallel corpus. The goal is to compute an SVD for each language such that V (the matrix of right singular vectors) is the same across all languages. Effectively, PARAFAC2 imposes the constraint, not present in standard LSA, that the 'concepts' in all documents in the parallel corpus are the same regardless of language. Intuitively, this constraint makes sense, since the whole purpose of using a parallel corpus is that exactly the same concepts are expressed in the translations. We tested this approach by comparing the performance of PARAFAC2 with standard LSA in solving a particular CLIR problem. From our results, we conclude that PARAFAC2 offers a very promising alternative to
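
    The contrast drawn in this abstract can be written compactly. The notation below is an assumption made for illustration and is not taken from the report: X_k is the term-by-document matrix for language k (k = 1, ..., K), U_k and the diagonal S_k are language-specific factors, and V is the matrix of right singular vectors shared by all languages.

```latex
\begin{align}
\text{Multilingual LSA:}\quad
  & X = \begin{bmatrix} X_1 \\ \vdots \\ X_K \end{bmatrix} \approx U S V^{\top} \\
\text{PARAFAC2:}\quad
  & X_k \approx U_k S_k V^{\top}, \qquad k = 1, \dots, K
\end{align}
```

    In the first line a single decomposition is applied to the stacked matrix; in the second each language slice gets its own U_k and S_k while V is held common, which is the constraint the abstract describes. The usual PARAFAC2 formulation also requires the cross-product U_k^T U_k to be the same for every k, a detail the abstract does not state.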

  4. Exploring differential item functioning (DIF) with the Rasch model: a comparison of gender differences on eighth grade science items in the United States and Spain.

    Science.gov (United States)

    Babiar, Tasha Calvert

    2011-01-01

    Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth item-level analysis across two countries: Spain and the United States. This study investigated eighth-grade gender differences on science items across the two countries. A secondary purpose of the study was to explore the nature of gender differences using the many-faceted Rasch Model as a way to estimate gender DIF. A secondary analysis of data from the Third International Mathematics and Science Study (TIMSS) was used to address three questions: 1) Does gender DIF in science achievement exist? 2) Is there a relationship between gender DIF and characteristics of the science items? 3) Do the relationships between item characteristics and gender DIF in science items replicate across countries? Participants included 7,087 eighth-grade students from the United States and 3,855 students from Spain who participated in TIMSS. The Facets program (Linacre and Wright, 1992) was used to estimate gender DIF. The results of the analysis indicate that the content of the item seemed to be related to gender DIF. The analysis also suggests that there is a relationship between gender DIF and item format. No pattern of gender DIF related to cognitive demand was found. The general pattern of gender DIF was similar across the two countries used in the analysis. The strength of item-level analysis as opposed to group mean difference analysis is that gender differences can be detected at the item level, even when no mean differences can be detected at the group level.

  5. Being Bilingual: Issues for Cross-Language Research

    Directory of Open Access Journals (Sweden)

    Bogusia Temple

    2006-01-01

    The current political debates in England highlight the role of language in citizenship, social exclusion, and discrimination. Similar debates can also be found around the world. Correspondingly, research addressing different language communities is burgeoning. Service providers and academics are increasingly employing bilingual community researchers or interpreters to carry out research. However, there is very little written about the effect of working with bilingual researchers. What it means to be bilingual is often essentialised and rarely problematised. Bilingual researchers are seen as unproblematically acting as bridges between communities just because they are bilingual. Their ties to communities, their use of language, and their perspectives on the research are rarely investigated. Language is tied in an unproblematic way to meaning, values, and beliefs. In this article, I use examples from my own research to question what it means to be bilingual and to do cross-language research. I argue that there is no straightforward way in which meanings can be read off from researchers’ ties to language and that being bilingual is not the same for everyone.

  6. Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

    Science.gov (United States)

    Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

    2013-01-01

    In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…

  7. A Differential Item Functioning (DIF) Analysis of the Communicative Participation Item Bank (CPIB): Comparing Individuals with Parkinson's Disease from the United States and New Zealand

    Science.gov (United States)

    Baylor, Carolyn; McAuliffe, Megan J.; Hughes, Louise E.; Yorkston, Kathryn; Anderson, Tim; Jiseon, Kim; Amtmann, Dagmar

    2014-01-01

    Purpose: To examine the cross-cultural applicability of the Communicative Participation Item Bank (CPIB) through a comparison of respondents with Parkinson's disease (PD) from the United States and New Zealand. Method: A total of 428 respondents--218 from the United States and 210 from New Zealand--completed the self-report CPIB and a series of…

  8. A comparison of discriminant logistic regression and Item Response Theory Likelihood-Ratio Tests for Differential Item Functioning (IRTLRDIF) in polytomous short tests.

    Science.gov (United States)

    Hidalgo, María D; López-Martínez, María D; Gómez-Benito, Juana; Guilera, Georgina

    2016-01-01

    Short scales are typically used in the social, behavioural and health sciences. This is relevant since test length can influence whether items showing DIF are correctly flagged. This paper compares the relative effectiveness of discriminant logistic regression (DLR) and IRTLRDIF for detecting DIF in polytomous short tests. A simulation study was designed. Test length, sample size, DIF amount and number of item response categories were manipulated. Type I error and power were evaluated. IRTLRDIF and DLR yielded Type I error rates close to nominal level in no-DIF conditions. Under DIF conditions, Type I error rates were affected by test length, DIF amount, degree of test contamination, sample size and number of item response categories. DLR showed a higher Type I error rate than did IRTLRDIF. Power rates were affected by DIF amount and sample size, but not by test length. DLR achieved higher power rates than did IRTLRDIF in very short tests, although the high Type I error rate involved means that this result cannot be taken into account. Test length had an important impact on the Type I error rate. IRTLRDIF and DLR showed a low power rate in short tests and with small sample sizes.

  9. Using response-time constraints in item selection to control for differential speededness in computerized adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.; Scrams, David J.; Schnipke, Deborah L.

    2003-01-01

    This paper proposes an item selection algorithm that can be used to neutralize the effect of time limits in computer adaptive testing. The method is based on a statistical model for the response-time distributions of the test takers on the items in the pool that is updated each time a new item has

  10. Funcionamiento diferencial del item en la evaluación internacional PISA. Detección y comprensión. [Differential Item Functioning in the PISA Project: Detection and Understanding]

    Directory of Open Access Journals (Sweden)

    Paula Elosua

    2006-08-01

    This report analyses differential item functioning (DIF) in the Reading Comprehension Test of the Programme for International Student Assessment (PISA) 2000, comparing the United Kingdom sample (reference group) with the Spanish sample (focal group). The released items from that year were studied in order to combine the detection of DIF with an understanding of its causes. The detection procedures compared were Mantel-Haenszel, logistic regression, and the standardized mean difference, in their versions for dichotomous and polytomous items. Two items were flagged as showing DIF, but the post-hoc analysis of their content could not fully explain its causes.
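
    Of the procedures named in this abstract, Mantel-Haenszel is the simplest to sketch for a dichotomous item. The example below is an illustration under stated assumptions, not the analysis from the report: examinees are matched on total score, each score level contributes one 2x2 table, and the counts in `strata` are invented. The function names are hypothetical; the pooled odds ratio and the -2.35 log transform to the ETS delta scale are the standard formulas.

```python
import math
from typing import Iterable, Tuple

# One 2x2 table per matched total-score stratum:
# (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(strata: Iterable[Stratum]) -> float:
    """Common odds ratio pooled across score strata (dichotomous item)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no stratum has b*c > 0")
    return numerator / denominator


def mh_delta(alpha_mh: float) -> float:
    """ETS delta metric: negative values mean the item favours the reference group."""
    return -2.35 * math.log(alpha_mh)


# Hypothetical counts for three score strata (not PISA data).
strata = [(30, 20, 20, 30), (40, 10, 30, 20), (45, 5, 40, 10)]
alpha = mantel_haenszel_odds_ratio(strata)
delta = mh_delta(alpha)
```

    With these invented counts the pooled odds ratio is above 1 and the delta is negative, meaning that at equal total scores the reference group has higher odds of answering correctly. A full analysis would also compute the Mantel-Haenszel chi-square statistic before flagging an item.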

  11. The differential item functioning and structural equivalence of a nonverbal cognitive ability test for five language groups

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2011-10-01

    Research purpose: The aim of the study was to determine the differential item functioning (DIF) and structural equivalence of a nonverbal cognitive ability test (the PiB/SpEEx Observance test [401]) for five South African language groups. Motivation for study: Cultural and language group sensitive tests can lead to unfair discrimination and are a contentious workplace issue in South Africa today. Misconceptions about psychometric testing in industry can cause tests to lose credibility if industries do not use a scientifically sound test-by-test evaluation approach. Research design, approach and method: The researcher used a quasi-experimental design and factor analytic and logistic regression techniques to meet the research aims. The study used a convenience sample drawn from industry and an educational institution. Main findings: The main findings of the study show structural equivalence of the test at a holistic level and nonsignificant DIF effect sizes for most of the comparisons that the researcher made. Practical/managerial implications: This research shows that the PiB/SpEEx Observance test (401) is not completely language insensitive. One should see it rather as a language-reduced test when people from different language groups need testing. Contribution/value-add: The findings provide supporting evidence that nonverbal cognitive tests are plausible alternatives to verbal tests when one compares people from different language groups.

  12. Gender-based Differential Item Functioning in the Application of the Theory of Planned Behavior for the Study of Entrepreneurial Intentions

    Science.gov (United States)

    Zampetakis, Leonidas A.; Bakatsaki, Maria; Litos, Charalambos; Kafetsios, Konstantinos G.; Moustakis, Vassilis

    2017-01-01

    Over the past years the percentage of female entrepreneurs has increased, yet it is still far below that for males. Although various attempts have been made to explain differences in men’s and women’s entrepreneurial attitudes and intentions, the extent to which those differences are due to self-report biases has not yet been considered. The present study utilized Differential Item Functioning (DIF) to compare men’s and women’s reporting on entrepreneurial intentions. DIF occurs in situations where members of different groups show differing probabilities of endorsing an item despite possessing the same level of the ability that the item is intended to measure. Drawing on the theory of planned behavior (TPB), the present study investigated whether constructs such as entrepreneurial attitudes, perceived behavioral control, subjective norms and intention would show gender differences and whether these gender differences could be explained by DIF. DIF methods applied to a dataset of 1800 Greek participants (50.4% female) indicated that differences at the item level are almost non-existent. Moreover, the differential test functioning (DTF) analysis, which allows assessing the overall impact of DIF effects with all items being taken into account simultaneously, suggested that the effect of DIF across all the items for each scale was negligible. Future research should consider that measurement invariance can be assumed when using TPB constructs for the study of entrepreneurial motivation independent of gender. PMID:28386244
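
    The definition of DIF given in this abstract, and the reason a test-level (DTF) check is reported alongside it, can be illustrated with a small sketch. It is a simplified illustration and not the DTF statistic used in the study: it assumes Rasch-type items, and the item locations in `b_ref` and `b_focal` as well as the function names are hypothetical.

```python
import math
from typing import Sequence


def rasch_prob(theta: float, difficulty: float) -> float:
    """Probability of endorsing an item under a Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_dif(theta: float, b_ref: float, b_focal: float) -> float:
    """Focal-minus-reference endorsement probability at the same trait level."""
    return rasch_prob(theta, b_focal) - rasch_prob(theta, b_ref)


def scale_level_difference(theta: float,
                           b_ref: Sequence[float],
                           b_focal: Sequence[float]) -> float:
    """Sum of item-level differences, i.e. the gap in expected scale scores."""
    return sum(item_dif(theta, r, f) for r, f in zip(b_ref, b_focal))


# Hypothetical item locations: item 1 is invariant, items 2 and 3 show DIF
# in opposite directions.
b_ref = [-0.5, 0.0, 0.5]
b_focal = [-0.5, 0.3, 0.2]

per_item = [item_dif(0.0, r, f) for r, f in zip(b_ref, b_focal)]
overall = scale_level_difference(0.0, b_ref, b_focal)
```

    In this invented example two items differ between groups at the same trait level, yet the differences nearly cancel when summed over the scale. That is why item-level and test-level results are reported separately.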

  13. Gender-based Differential Item Functioning in the Application of the Theory of Planned Behavior for the Study of Entrepreneurial Intentions.

    Science.gov (United States)

    Zampetakis, Leonidas A; Bakatsaki, Maria; Litos, Charalambos; Kafetsios, Konstantinos G; Moustakis, Vassilis

    2017-01-01

    Over the past years the percentage of female entrepreneurs has increased, yet it is still far below that for males. Although various attempts have been made to explain differences in men's and women's entrepreneurial attitudes and intentions, the extent to which those differences are due to self-report biases has not yet been considered. The present study utilized Differential Item Functioning (DIF) to compare men's and women's reporting on entrepreneurial intentions. DIF occurs in situations where members of different groups show differing probabilities of endorsing an item despite possessing the same level of the ability that the item is intended to measure. Drawing on the theory of planned behavior (TPB), the present study investigated whether constructs such as entrepreneurial attitudes, perceived behavioral control, subjective norms and intention would show gender differences and whether these gender differences could be explained by DIF. DIF methods applied to a dataset of 1800 Greek participants (50.4% female) indicated that differences at the item level are almost non-existent. Moreover, the differential test functioning (DTF) analysis, which assesses the overall impact of DIF effects with all items taken into account simultaneously, suggested that the effect of DIF across all the items for each scale was negligible. Future research should consider that measurement invariance can be assumed when using TPB constructs for the study of entrepreneurial motivation independent of gender.

  14. Analysis of Nonequivalent Assessments across Different Linguistic Groups Using a Mixed Methods Approach: Understanding the Causes of Differential Item Functioning by Cognitive Interviewing

    Science.gov (United States)

    Benítez, Isabel; Padilla, José-Luis

    2014-01-01

    Differential item functioning (DIF) can undermine the validity of cross-lingual comparisons. While a lot of efficient statistics for detecting DIF are available, few general findings have been found to explain DIF results. The objective of the article was to study DIF sources by using a mixed method design. The design involves a quantitative phase…

  15. Differential item functioning (DIF) in the EORTC QLQ-C30: a comparison of baseline, on-treatment and off-treatment data

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

    2009-01-01

    Differential item functioning (DIF) analyses can be used to explore translation, cultural, gender or other differences in the performance of quality of life (QoL) instruments. These analyses are commonly performed using "baseline" or pretreatment data. We previously reported DIF analyses to examine...

  16. Fitting a Mixture Rasch Model to English as a Foreign Language Listening Tests: The Role of Cognitive and Background Variables in Explaining Latent Differential Item Functioning

    Science.gov (United States)

    Aryadoust, Vahid

    2015-01-01

    The present study uses a mixture Rasch model to examine latent differential item functioning in English as a foreign language listening tests. Participants (n = 250) took a listening and lexico-grammatical test and completed the metacognitive awareness listening questionnaire comprising problem solving (PS), planning and evaluation (PE), mental…

  17. Differential Item Functioning in While-Listening Performance Tests: The Case of the International English Language Testing System (IELTS) Listening Module

    Science.gov (United States)

    Aryadoust, Vahid

    2012-01-01

    This article investigates a version of the International English Language Testing System (IELTS) listening test for evidence of differential item functioning (DIF) based on gender, nationality, age, and degree of previous exposure to the test. Overall, the listening construct was found to be underrepresented, which is probably an important cause…

  18. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Science.gov (United States)

    Lix, Lisa M; Wu, Xiuyun; Hopman, Wilma; Mayo, Nancy; Sajobi, Tolulope T; Liu, Juxin; Prior, Jerilynn C; Papaioannou, Alexandra; Josse, Robert G; Towheed, Tanveer E; Davison, K Shawn; Sawatzky, Richard

    2016-01-01

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  19. Beneficial effects of semantic memory support on older adults' episodic memory: Differential patterns of support of item and associative information.

    Science.gov (United States)

    Mohanty, Praggyan Pam; Naveh-Benjamin, Moshe; Ratneshwar, Srinivasan

    2016-02-01

    The effects of two types of semantic memory support (meaningfulness of an item and relatedness between items) in mitigating age-related deficits in item and associative memory are examined in a marketing context. In Experiment 1, participants studied less (vs. more) meaningful brand logo graphics (pictures) paired with meaningful brand names (words) and later were assessed by item (old/new) and associative (intact/recombined) memory recognition tests. Results showed that meaningfulness of items eliminated age deficits in item memory, while equivalently boosting associative memory for older and younger adults. Experiment 2, in which related and unrelated brand logo graphics and brand name pairs served as stimuli, revealed that relatedness between items eliminated age deficits in associative memory, while improving to the same degree item memory in older and younger adults. Experiment 2 also provided evidence for a probable boundary condition that could reconcile seemingly contradictory extant results. Overall, these experiments provided evidence that although the two types of semantic memory support can improve both item and associative memory in older and younger adults, older adults' memory deficits can be eliminated when the type of support provided is compatible with the type of information required to perform well on the test. (c) 2016 APA, all rights reserved.

  20. Spanish translation and cross-language validation of a sleep habits questionnaire for use in clinical and research settings.

    Science.gov (United States)

    Baldwin, Carol M; Choi, Myunghan; McClain, Darya Bonds; Celaya, Alma; Quan, Stuart F

    2012-04-15

    To translate, back-translate and cross-language validate (English/Spanish) the Sleep Heart Health Study Sleep Habits Questionnaire for use with Spanish-speakers in clinical and research settings. Following rigorous translation and back-translation, this cross-sectional cross-language validation study recruited bilingual participants from academic, clinic, and community-based settings (N = 50; 52% women; mean age 38.8 ± 12 years; 90% of Mexican heritage). Participants completed English and Spanish versions of the Sleep Habits Questionnaire, the Epworth Sleepiness Scale, and the Acculturation Rating Scale for Mexican Americans II one week apart in randomized order. Psychometric properties were assessed, including internal consistency, convergent validity, scale equivalence, language version intercorrelations, and exploratory factor analysis using PASW (Version 18) software. Grade level readability of the sleep measure was evaluated. All sleep categories (duration, snoring, apnea, insomnia symptoms, other sleep symptoms, sleep disruptors, restless legs syndrome) showed Cronbach α, Spearman-Brown coefficients and intercorrelations ≥ 0.700, suggesting robust internal consistency, correlation, and agreement between language versions. The Epworth correlated significantly with snoring, apnea, sleep symptoms, restless legs, and sleep disruptors on both versions, supporting convergent validity. Items loaded on 4 factors accounted for 68% and 67% of the variance on the English and Spanish versions, respectively. The Spanish-language Sleep Habits Questionnaire demonstrates conceptual and content equivalency. It has appropriate measurement properties and should be useful for assessing sleep health in community-based clinics and intervention studies among Spanish-speaking Mexican Americans. Both language versions showed readability at the fifth grade level. Further testing is needed with larger samples.
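The internal-consistency figure reported in this abstract, Cronbach's α, follows a standard formula: α = k/(k-1) × (1 - Σ item variances / variance of total scores), with k the number of items. The sketch below computes it with the Python standard library. The function name and the data layout (one list of item scores per respondent) are assumptions for illustration; the study itself used PASW, and nothing here reproduces its data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of per-respondent lists, one numeric value per item
            (hypothetical layout; at least 2 items and 2 respondents).
    """
    k = len(scores[0])
    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

When every item ranks respondents identically, the total-score variance is as large as it can be relative to the item variances and α reaches 1; uncorrelated items push α toward 0.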

  1. Gender Differences in Scientific Literacy of HKPISA 2006: A Multidimensional Differential Item Functioning and Multilevel Mediation Study

    Science.gov (United States)

    Wong, Kwan Yin

    The aim of this study is to investigate the effect of gender differences of 15-year-old students on scientific literacy and their impacts on students’ motivation to pursue science education and careers (Future-oriented Science Motivation) in Hong Kong. The data for this study were collected from the Program for International Student Assessment in Hong Kong (HKPISA), carried out in 2006. A total of 4,645 students were randomly selected from 146 secondary schools, including government, aided and private schools, by a two-stage stratified sampling method. HKPISA 2006, like most other large-scale international assessments, presents its assessment frameworks in multidimensional subscales. To fulfill the requirements of this multidimensional assessment framework, this study deployed new approaches to model and investigate gender differences in cognitive and affective latent traits of scientific literacy by using multidimensional differential item functioning (MDIF) and multilevel mediation (MLM). Compared with a t-test of mean score differences, MDIF improves the precision of each subscale’s measure at the item level, and the gender differences in science performance can be accurately estimated. In light of the Expectancy-value Model of Achievement-related Choices of Eccles et al. (1983) (Eccles’ Model), MLM examines the pattern of gender effects on Future-oriented Science Motivation mediated through cognitive and affective factors. For the MLM investigation, Single-Group Confirmatory Factor Analysis (Single-Group CFA) was used to confirm the applicability and validity of six affective factors which were originally prepared by the OECD. These six factors are Science Self-concept, Personal Value of Science, Interest in Science Learning, Enjoyment of Science Learning, Instrumental Motivation to Learn Science and Future-oriented Science Motivation. Then, Multiple Group CFA was used to verify measurement invariance of these factors across gender groups. The results of

  2. A Psychometric Evaluation of the DSM-IV Criteria for Antisocial Personality Disorder: Dimensionality, Local Reliability, and Differential Item Functioning Across Gender.

    Science.gov (United States)

    Paap, Muirne C S; Braeken, Johan; Pedersen, Geir; Urnes, Øyvind; Karterud, Sigmund; Wilberg, Theresa; Hummelen, Benjamin

    2017-12-01

    This study aims at evaluating the psychometric properties of the antisocial personality disorder (ASPD) criteria in a large sample of patients, most of whom had one or more personality disorders (PD). PD diagnoses were assessed by experienced clinicians using the Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, 4th edition, Axis II PDs. Analyses were performed within an item response theory framework. Results of the analyses indicated that ASPD is a unidimensional construct that can be measured reliably at the upper range of the latent trait scale. Differential item functioning across gender was restricted to two criteria and had little impact on the latent ASPD trait level. Patients fulfilling both the adult ASPD criteria and the conduct disorder criteria had similar latent trait distributions as patients fulfilling only the adult ASPD criteria. Overall, the ASPD items fit the purpose of a diagnostic instrument well, that is, distinguishing patients with moderate from those with high antisocial personality scores.

  3. Quality of life in infants and children with atopic dermatitis: Addressing issues of differential item functioning across countries in multinational clinical trials

    Directory of Open Access Journals (Sweden)

    Tennant Alan

    2007-07-01

    Background: A previous study had identified 45 items assessing the impact of atopic dermatitis (AD) on the whole family. From these it was intended to develop two separate scales, one assessing impact on carers and the other determining the effect on the child. Methods: The 45 items were included in three clinical trials designed to test the efficacy of a new topical treatment (pimecrolimus, Elidel cream 1%) in the treatment of AD in infants and children and in validation studies in the UK, US, Germany, France and the Netherlands. Rasch analyses were undertaken to determine whether an internationally valid, unidimensional scale could be developed that would inform on the direct impact of AD on the child. Results: Rasch analyses applied to the data from the trials indicated that the draft measure consisted of two scales, one assessing the QoL of the carer and the other (consisting of 12 items) measuring the impact of AD on the child. Three of the 12 potential items failed to fit the measurement model in Europe and five in the US. In addition, four items exhibiting differential item functioning (DIF) by country were identified. After removing the misfitting items and controlling for DIF it was possible to derive a scale, the Childhood Impact of Atopic Dermatitis (CIAD), with good item fit for each trial analysis. Analysis of the validation data from each of the different countries confirmed that the CIAD had adequate internal consistency, reproducibility and construct validity. The CIAD demonstrated the benefits of treatment with Elidel over placebo in the European trial. A similar (non-significant) trend was found for the US trials. Conclusion: The study represents a novel method of dealing with the problem of DIF associated with different cultures. Such problems are likely to arise in any multinational study involving patient-reported outcome measures, as items in the scales are likely to be valued differently in different cultures. However, where

  4. Funcionamento diferencial de itens para avaliar a agressividade de universitários Differential items functioning to assess aggressiveness in college students

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    This study sought evidence of construct validity related to how items function in differentiating the sexes on an aggressiveness instrument. The participants were 445 college students of both sexes, enrolled in Engineering, Computing and Psychology courses. The aggressiveness scale, composed of 81 items, was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were analyzed by means of the Rasch model. Twenty-eight items showed differential functioning: 15 described behaviors more characteristic of females and 13 behaviors more characteristic of males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of sex.
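For readers unfamiliar with the Rasch model mentioned in this abstract, the dichotomous form gives the probability of endorsing an item as a logistic function of the difference between the person's trait level and the item's difficulty. The sketch below shows that formula and how uniform DIF can be pictured as a group-specific difficulty. The function name and the difficulty values are invented for illustration and are not taken from the study.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta:      person's latent trait level (logits)
    difficulty: item difficulty (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical illustration of uniform DIF: the same item is "harder"
# to endorse for one group than for another at an equal trait level.
difficulty_by_group = {"group_a": 0.0, "group_b": 0.5}  # made-up values
same_theta = 0.0
p_a = rasch_probability(same_theta, difficulty_by_group["group_a"])
p_b = rasch_probability(same_theta, difficulty_by_group["group_b"])
```

With these made-up values, two people at the same trait level have different endorsement probabilities purely because of group membership, which is what a DIF analysis is designed to detect.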

  5. DISC Predictive Scales (DPS): Factor Structure and Uniform Differential Item Functioning Across Gender and Three Racial/Ethnic Groups for ADHD, Conduct Disorder, and Oppositional Defiant Disorder Symptoms

    OpenAIRE

    Wiesner, Margit; Kanouse, David E.; Elliott, Marc N.; Windle, Michael; Schuster, Mark A.

    2015-01-01

    The factor structure and potential uniform differential item functioning (DIF) among gender and three racial/ethnic groups of adolescents (African American, Latino, White) were evaluated for attention deficit/hyperactivity disorder (ADHD), conduct disorder (CD), and oppositional defiant disorder (ODD) symptom scores of the DISC Predictive Scales (DPS; Leung et al., 2005; Lucas et al., 2001). Primary caregivers reported on DSM–IV ADHD, CD, and ODD symptoms for a probability sample of 4,491 chi...

  6. Differential Item Functioning in the SF-36 Physical Functioning and Mental Health Sub-Scales: A Population-Based Investigation in the Canadian Multicentre Osteoporosis Study.

    Directory of Open Access Journals (Sweden)

    Lisa M Lix

    Self-reported health status measures, like the Short Form 36-item Health Survey (SF-36), can provide rich information about the overall health of a population and its components, such as physical, mental, and social health. However, differential item functioning (DIF), which arises when population sub-groups with the same underlying (i.e., latent) level of health have different measured item response probabilities, may compromise the comparability of these measures. The purpose of this study was to test for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scale items in a Canadian population-based sample. Study data were from the prospective Canadian Multicentre Osteoporosis Study (CaMos), which collected baseline data in 1996-1997. DIF was tested using a multiple indicators multiple causes (MIMIC) method. Confirmatory factor analysis defined the latent variable measurement model for the item responses and latent variable regression with demographic and health status covariates (i.e., sex, age group, body weight, self-perceived general health) produced estimates of the magnitude of DIF effects. The CaMos cohort consisted of 9423 respondents; 69.4% were female and 51.7% were less than 65 years. Eight of 10 items on the PF sub-scale and four of five items on the MH sub-scale exhibited DIF. Large DIF effects were observed on PF sub-scale items about vigorous and moderate activities, lifting and carrying groceries, walking one block, and bathing or dressing. On the MH sub-scale items, all DIF effects were small or moderate in size. SF-36 PF and MH sub-scale scores were not comparable across population sub-groups defined by demographic and health status variables due to the effects of DIF, although the magnitude of this bias was not large for most items. We recommend testing and adjusting for DIF to ensure comparability of the SF-36 in population-based investigations.

  7. Methods and models for quantitative assessment of speech intelligibility in cross-language communication

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Steeneken, H.J.M.; Houtgast, T.

    2001-01-01

    To deal with the effects of nonnative speech communication on speech intelligibility, one must know the magnitude of these effects. To measure this magnitude, suitable test methods must be available. Many of the methods used in cross-language speech communication research are not very suitable for

  8. A Domain Specific Lexicon Acquisition Tool for Cross-Language Information Retrieval

    NARCIS (Netherlands)

    Hiemstra, Djoerd; de Jong, Franciska M.G.; Kraaij, Wessel

    1997-01-01

    With the recent enormous increase of information dissemination via the web as incentive there is a growing interest in supporting tools for cross-language retrieval. In this paper we describe a disclosure and retrieval approach that fulfils the needs of both information providers and users by

  9. Cross-language activation in children's speech production: Evidence from second language learners, bilinguals, and trilinguals

    NARCIS (Netherlands)

    Poarch, G.J.; Hell, J.G. van

    2012-01-01

    In five experiments, we examined cross-language activation during speech production in various groups of bilinguals and trilinguals who differed in nonnative language proficiency, language learning background, and age. In Experiments 1, 2, 3, and 5, German 5- to 8-year-old second language learners

  10. Cross-language activation in same-script and different-script trilinguals

    NARCIS (Netherlands)

    Poarch, G.J.; Hell, J.G. van

    2014-01-01

    In a picture naming study, we examined cross-language activation during speech production in three groups of trilinguals: L3-immersed German-English-Dutch, non-L3-immersed Dutch-English-German, and L3-immersed Russian-English-German trilinguals. All trilinguals named pictures with cognate and

  11. Acquisition of compound words in Chinese-English bilingual children: Decomposition and cross-language activation

    NARCIS (Netherlands)

    Cheng, C.; Wang, M.; Perfetti, C.A.

    2011-01-01

    This study investigated compound processing and cross-language activation in a group of Chinese–English bilingual children, and they were divided into four groups based on the language proficiency levels in their two languages. A lexical decision task was designed using compound words in both

  12. Twenty-One at TREC-7: ad-hoc and cross-language track

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Kraaij, Wessel; Voorhees, E.M; Harman, D.K.

    1999-01-01

    This paper describes the official runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments: We developed a new weighting algorithm, which outperforms the popular Cornell version of BM25 on the ad-hoc

  13. Cross-Language Activation in Children's Speech Production: Evidence from Second Language Learners, Bilinguals, and Trilinguals

    Science.gov (United States)

    Poarch, Gregory J.; van Hell, Janet G.

    2012-01-01

    In five experiments, we examined cross-language activation during speech production in various groups of bilinguals and trilinguals who differed in nonnative language proficiency, language learning background, and age. In Experiments 1, 2, 3, and 5, German 5- to 8-year-old second language learners of English, German-English bilinguals,…

  14. Differential Item Functioning Analysis Using a Mixture 3-Parameter Logistic Model with a Covariate on the TIMSS 2007 Mathematics Test

    Science.gov (United States)

    Choi, Youn-Jeng; Alexeev, Natalia; Cohen, Allan S.

    2015-01-01

    The purpose of this study was to explore what may be contributing to differences in performance in mathematics on the Trends in International Mathematics and Science Study 2007. This was done by using a mixture item response theory modeling approach to first detect latent classes in the data and then to examine differences in performance on items…

  15. Cross-cultural differences in knee functional status outcomes in a polyglot society represented true disparities not biased by differential item functioning.

    Science.gov (United States)

    Deutscher, Daniel; Hart, Dennis L; Crane, Paul K; Dickstein, Ruth

    2010-12-01

    Comparative effectiveness research across cultures requires unbiased measures that accurately detect clinical differences between patient groups. The purpose of this study was to assess the presence and impact of differential item functioning (DIF) in knee functional status (FS) items administered using computerized adaptive testing (CAT) as a possible cause for observed differences in outcomes between 2 cultural patient groups in a polyglot society. This study was a secondary analysis of prospectively collected data. We evaluated data from 9,134 patients with knee impairments from outpatient physical therapy clinics in Israel. Items were analyzed for DIF related to sex, age, symptom acuity, surgical history, exercise history, and language used to complete the functional survey (Hebrew versus Russian). Several items exhibited DIF, but unadjusted FS estimates and FS estimates that accounted for DIF were essentially equal (intraclass correlation coefficient [2,1]>.999). No individual patient had a difference between unadjusted and adjusted FS estimates as large as the median standard error of the unadjusted estimates. Differences between groups defined by any of the covariates considered were essentially unchanged when using adjusted instead of unadjusted FS estimates. The greatest group-level impact was <0.3% of 1 standard deviation of the unadjusted FS estimates. Complete data where patients answered all items in the scale would have been preferred for DIF analysis, but only CAT data were available. Differences in FS outcomes between groups of patients with knee impairments who answered the knee CAT in Hebrew or Russian in Israel most likely reflected true differences that may reflect societal disparities in this health outcome.

  16. Compounds in dictionary-based Cross-language information retrieval

    Directory of Open Access Journals (Sweden)

    2002-01-01

    Compound words form an important part of natural language. From the cross-lingual information retrieval (CLIR) point of view it is important that many natural languages are highly productive with compounds, and translation resources cannot include entries for all compounds. Also, compounds are often content-bearing words in a sentence. In Swedish, German and Finnish roughly one tenth of the words in a text prepared for information retrieval purposes are compounds. Important research questions concerning compound handling in dictionary-based cross-language information retrieval are (1) compound splitting into components, (2) normalisation of components, (3) translation of components and (4) query structuring for compounds and their components in the target language. The impact of compound processing on the performance of the cross-language information retrieval process is evaluated in this study and the results indicate that the effect is clearly positive.
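The first and third of the research questions listed in this abstract, splitting a compound into components and translating the components, can be illustrated with a toy sketch. The abstract does not describe the splitting algorithm used in the study; the recursive lexicon lookup below is a generic stand-in, and the function names, the `min_len` parameter, and the two-word German lexicon are all hypothetical. Normalisation of components (for example, removing linking elements) and query structuring are not covered.

```python
def split_compound(word, lexicon, min_len=3):
    """Return one split of `word` into lexicon entries, or None if none exists."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try every cut point that leaves at least `min_len` characters on each side
    for cut in range(min_len, len(word) - min_len + 1):
        head, tail = word[:cut], word[cut:]
        if head in lexicon:
            rest = split_compound(tail, lexicon, min_len)
            if rest is not None:
                return [head] + rest
    return None


def translate_components(components, dictionary):
    """Look up each component; unknown components map to an empty list."""
    return [dictionary.get(component, []) for component in components]


# Toy German-to-English resources, invented for this example
toy_lexicon = {"apfel", "baum"}
toy_dictionary = {"apfel": ["apple"], "baum": ["tree"]}

parts = split_compound("Apfelbaum", toy_lexicon)
translations = translate_components(parts, toy_dictionary)
```

The sketch returns the first split it finds rather than ranking alternatives, so a realistic system would need a way to choose among competing splits.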

  17. Knowledge Graphs as Context Models: Improving the Detection of Cross-Language Plagiarism with Paraphrasing

    OpenAIRE

    Franco-Salvador, Marc; Gupta, Parth; Rosso, Paolo

    2013-01-01

    Cross-language plagiarism detection attempts to identify and extract automatically plagiarism among documents in different languages. Plagiarized fragments can be translated verbatim copies or may alter their structure to hide the copying, which is known as paraphrasing and is more difficult to detect. In order to improve the paraphrasing detection, we use a knowledge graph-based approach to obtain and compare context models of document fragments in different languages. Experimental results i...

  18. Cross-Language Plagiarism Detection System Using Latent Semantic Analysis and Learning Vector Quantization

    Directory of Open Access Journals (Sweden)

    Anak Agung Putri Ratna

    2017-06-01

    Computerized cross-language plagiarism detection has recently become essential. With the scarcity of scientific publications in Bahasa Indonesia, many Indonesian authors frequently consult publications in English in order to boost the quantity of scientific publications in Bahasa Indonesia (which is currently rising). Due to the syntax disparity between Bahasa Indonesia and English, most of the existing methods for automated cross-language plagiarism detection do not provide satisfactory results. This paper analyses the possibility of developing Latent Semantic Analysis (LSA) for a computerized cross-language plagiarism detector for two languages with different syntax. To improve performance, various alterations in LSA are suggested. By using a learning vector quantization (LVQ) classifier in the LSA and taking into account the Frobenius norm, output has reached up to 65.98% in accuracy. The results of the experiments showed that the best accuracy achieved is 87% with a document size of 6 words, and the document definition size must be kept below 10 words in order to maintain high accuracy. Additionally, based on experimental results, this paper suggests utilizing the frequency occurrence method as opposed to the binary method for the term-document matrix construction.
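The closing recommendation of this abstract contrasts two ways of filling the term-document matrix that LSA starts from: frequency of occurrence versus a binary present/absent flag. The sketch below builds both variants for a toy corpus. The function name, whitespace tokenisation, and sample documents are assumptions for illustration; the decomposition and LVQ classification steps described in the abstract are not shown.

```python
from collections import Counter


def term_document_matrix(docs, binary=False):
    """Build a term-document matrix: rows are sorted terms, columns are documents.

    binary=False stores occurrence counts; binary=True stores 1/0 presence.
    """
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [
        [(1 if c[term] else 0) if binary else c[term] for c in counts]
        for term in vocab
    ]
    return vocab, matrix


# Toy documents, invented for this example
toy_docs = ["plagiarism detection detection", "detection"]
vocab, frequency_matrix = term_document_matrix(toy_docs)
_, binary_matrix = term_document_matrix(toy_docs, binary=True)
```

In the frequency variant a repeated term carries more weight in its document column, while the binary variant treats one occurrence and many occurrences alike.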

  19. DISC Predictive Scales (DPS): Factor structure and uniform differential item functioning across gender and three racial/ethnic groups for ADHD, conduct disorder, and oppositional defiant disorder symptoms.

    Science.gov (United States)

    Wiesner, Margit; Windle, Michael; Kanouse, David E; Elliott, Marc N; Schuster, Mark A

    2015-12-01

    The factor structure and potential uniform differential item functioning (DIF) among gender and three racial/ethnic groups of adolescents (African American, Latino, White) were evaluated for attention deficit/hyperactivity disorder (ADHD), conduct disorder (CD), and oppositional defiant disorder (ODD) symptom scores of the DISC Predictive Scales (DPS; Leung et al., 2005; Lucas et al., 2001). Primary caregivers reported on DSM-IV ADHD, CD, and ODD symptoms for a probability sample of 4,491 children from three geographical regions who took part in the Healthy Passages study (mean age = 12.60 years, SD = 0.66). Confirmatory factor analysis indicated that the expected 3-factor structure was tenable for the data. Multiple indicators multiple causes (MIMIC) modeling revealed uniform DIF for three ADHD and 9 ODD item scores, but not for any of the CD item scores. Uniform DIF was observed predominantly as a function of child race/ethnicity, but minimally as a function of child gender. On the positive side, uniform DIF had little impact on latent mean differences of ADHD, CD, and ODD symptomatology among gender and racial/ethnic groups. Implications of the findings for researchers and practitioners are discussed. (c) 2015 APA, all rights reserved.

  20. Citation-based plagiarism detection detecting disguised and cross-language plagiarism using citation pattern analysis

    CERN Document Server

    Gipp, Bela

    2014-01-01

    Plagiarism is a problem with far-reaching consequences for the sciences. However, even today's best software-based systems can only reliably identify copy & paste plagiarism. Disguised plagiarism forms, including paraphrased text, cross-language plagiarism, as well as structural and idea plagiarism often remain undetected. This weakness of current systems results in a large percentage of scientific plagiarism going undetected. Bela Gipp provides an overview of the state-of-the art in plagiarism detection and an analysis of why these approaches fail to detect disguised plagiarism forms. The aut

  1. Designing and Implementing a Cross-Language Information Retrieval System Using Linguistic Corpora

    Directory of Open Access Journals (Sweden)

    Amin Nezarat

    2012-03-01

    Full Text Available Information retrieval (IR) is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval (CLIR) refers to a kind of information retrieval in which the language of the query and that of the searched document are different. In fact, it is a retrieval process where the user presents queries in one language to retrieve documents in another language. This paper tried to construct a bilingual lexicon of parallel chunks of English and Persian from two very large monolingual corpora and an English-Persian parallel corpus, which could be directly applied to cross-language information retrieval tasks. For this purpose, a statistical measure known as Association Score (AS) was used to compute the association value between every two corresponding chunks in the corpus using a couple of complicated algorithms. Once the CLIR system was developed using this bilingual lexicon, an experiment was performed on a set of one hundred English and Persian phrases and collocations to see to what extent this system was effective in assisting the users to find the most relevant and suitable equivalents of their queries in either language.

  2. The social and community opportunities profile social inclusion measure: Structural equivalence and differential item functioning in community mental health residents in Hong Kong and the United Kingdom.

    Science.gov (United States)

    Huxley, Peter John; Chan, Kara; Chiu, Marcus; Ma, Yanni; Gaze, Sarah; Evans, Sherrill

    2016-03-01

    China's future major health problem will be the management of chronic diseases - of which mental health is a major one. An instrument is needed to measure mental health inclusion outcomes for mental health services in Hong Kong and mainland China as they strive to promote a more inclusive society for their citizens and particular disadvantaged groups. To report on the analysis of structural equivalence and item differentiation in two mentally unhealthy and one healthy sample in the United Kingdom and Hong Kong. The mental health sample in Hong Kong was made up of non-governmental organisation (NGO) referrals meeting the selection/exclusion criteria (being well enough to be interviewed, having a formal psychiatric diagnosis and living in the community). A similar sample in the United Kingdom meeting the same selection criteria was obtained from a community mental health organisation, equivalent to the NGOs in Hong Kong. Exploratory factor analysis and logistic regression were conducted. The single-variable, self-rated 'overall social inclusion' differs significantly between all of the samples, in the way we would expect from previous research, with the healthy population feeling more included than the serious mental illness (SMI) groups. In the exploratory factor analysis, the first two factors explain between a third and half of the variance, and the single variable which enters into all the analyses in the first factor is having friends to visit the home. All the regression models were significant; however, in the Hong Kong sample, only one-fifth of the total variance is explained. The structural findings imply that the social and community opportunities profile-Chinese version (SCOPE-C) gives similar results when applied to another culture. As only one-fifth of the variance of 'overall inclusion' was explained in the Hong Kong sample, it may be that the instrument needs to be refined using different or additional items within the structural domains of inclusion.

  3. ADAPTING HYBRID MACHINE TRANSLATION TECHNIQUES FOR CROSS-LANGUAGE TEXT RETRIEVAL SYSTEM

    Directory of Open Access Journals (Sweden)

    P. ISWARYA

    2017-03-01

    Full Text Available This research work aims to develop a Tamil-to-English cross-language text retrieval system using a hybrid machine translation approach. The hybrid machine translation system is a combination of rule-based and statistical approaches. Existing word-by-word translation systems have many issues, including ambiguity, out-of-vocabulary words, word inflections, and improper sentence structure. To handle these issues, the proposed architecture contains an improved part-of-speech tagger, a machine-learning-based morphological analyser, a collocation-based word sense disambiguation procedure, a semantic dictionary, tense markers with gerund ending rules, and a two-pass transliteration algorithm. The experimental results show that the proposed Tamil query-based translation system achieves significantly better translation quality than the existing system and reaches 95.88% of monolingual performance.

  4. Experiments with Cross-Language Information Retrieval on a Health Portal for Psychology and Psychotherapy.

    Science.gov (United States)

    Andrenucci, Andrea

    2016-01-01

    Few studies have been performed within cross-language information retrieval (CLIR) in the field of psychology and psychotherapy. The aim of this paper is to analyze and assess the quality of available query translation methods for CLIR on a health portal for psychology. A test base of 100 user queries, 50 Multi Word Units (WUs) and 50 Single WUs, was used. Swedish was the source language and English the target language. Query translation methods based on machine translation (MT) and dictionary look-up were utilized in order to submit query translations to two search engines: Google Site Search and Quick Ask. Standard IR evaluation measures and a qualitative analysis were utilized to assess the results. The lexicon extracted with word alignment of the portal's parallel corpus provided better statistical results among dictionary look-ups. Google Translate provided more linguistically correct translations overall and also delivered better retrieval results in MT.

  5. Lexical activation in bilinguals’ speech production is dynamic: How language ambiguous words can affect cross-language activation

    NARCIS (Netherlands)

    Hermans, D.; Ormel, E.A.; Besselaar, R. van; Hell, J.G. van

    2011-01-01

    Is the bilingual language production system a dynamic system that can operate in different language activation states? Three experiments investigated to what extent cross-language phonological co-activation effects in language production are sensitive to the composition of the stimulus list. L1

  6. The influence of cross-language similarity on within- and between-language Stroop effects in trilinguals

    NARCIS (Netherlands)

    Heuven, W.J.B. van; Conklin, K.; Coderre, E.L.; Guo, T.; Dijkstra, A.F.J.

    2011-01-01

    This study investigated effects of cross-language similarity on within- and between-language Stroop interference and facilitation in three groups of trilinguals. Trilinguals were either proficient in three languages that use the same-script (alphabetic in German–English–Dutch trilinguals), two

  7. Learning to Read Setswana and English: Cross-Language Transference of Letter Knowledge, Phonological Awareness and Word Reading Skills

    Science.gov (United States)

    Lekgoko, Olemme; Winskel, Heather

    2008-01-01

    The current study investigates how beginner readers learn to read Setswana and English, and whether there is cross-language transference of skills between these two languages. Letter knowledge, phoneme awareness and reading of words and pseudowords in both Setswana and English were assessed in 36 Grade 2 children. A complex pattern emerged.…

  8. THE CONSTRUCTION OF INDONESIAN-ENGLISH CROSS LANGUAGE PLAGIARISM DETECTION SYSTEM USING FINGERPRINTING TECHNIQUE

    Directory of Open Access Journals (Sweden)

    Zakiy Firdaus Alfikri

    2012-07-01

    Full Text Available Cross-language plagiarism detection is an important task since it can protect a person's intellectual property rights. Since English is the most popular international language, we propose an Indonesian-English cross-language plagiarism detection system for the case where the suspected plagiarized document is written in Indonesian and the source document is written in English. To minimize translation error, we build the system by translating the Indonesian document into English and then comparing the translated document with the English document collection. The detection system consists of a preprocessing component, a heuristic retrieval component, and a detailed analysis component. The main technique used in the retrieval process is fingerprinting, which extracts lexical features from text and is suitable for detecting plagiarism done by literal translation. In this paper, we also propose additional methods to be implemented in the heuristic retrieval component to increase the performance of the system: phrase chunking, stop word removal, stemming, and synonym selection. We evaluated the system's performance and the effects of the additional methods on it, using several test data sets, each representing a plagiarism type. From the experiments, we concluded that the system works on 83.33% of test cases. We also concluded that all additional methods except phrase chunking have a positive effect on system accuracy.
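    A fingerprinting comparison of the kind described above can be sketched as follows. This is a generic word k-gram fingerprint with Jaccard overlap, not the authors' implementation: the choice of k = 3, SHA-256 hashing, the similarity measure, and the two example sentences are all assumptions, and the preprocessing steps named in the abstract (stop word removal, stemming, synonym selection) are omitted.

```python
import hashlib


def fingerprints(text, k=3):
    """Return the set of hashed word k-grams of a text (hypothetical helper)."""
    words = text.lower().split()
    grams = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # A stable hash is used so that results are reproducible across runs.
    return {hashlib.sha256(g.encode("utf-8")).hexdigest() for g in grams}


def jaccard(a, b):
    """Jaccard similarity of two fingerprint sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Illustrative sentences: a machine-translated suspect passage and an
# English source passage (not taken from the paper's test sets).
translated = "the system compares the translated document with the source collection"
source = "the system compares the translated document with an english collection"

score = jaccard(fingerprints(translated), fingerprints(source))
print(round(score, 3))
```

    In this toy case the two passages share 5 of 11 distinct trigrams, so a literal translation that preserves word order leaves a large fingerprint overlap, while heavy paraphrasing would not.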

  9. Cross-Language Associations in the Development of Preschoolers' Receptive and Expressive Vocabulary.

    Science.gov (United States)

    Maier, Michelle F; Bohlmann, Natalie L; Palacios, Natalia A

    The increasing population of dual language learners (DLLs) entering preschool classrooms highlights a continued need for research on the development of dual language acquisition, and specifically vocabulary skills, in this age group. This study describes young DLL children's (N = 177) vocabulary development in both English and Spanish simultaneously, and how vocabulary skills in each language relate to one another, during a contextual shift that places greater emphasis on the acquisition of academic English language skills. Findings demonstrated that DLL preschoolers made gains in vocabulary in both languages with more change evidenced in receptive, in comparison to expressive, vocabulary as well as in English in comparison to Spanish. When examining whether children's vocabulary scores in one language at the beginning of preschool interact with their vocabulary scores in the other language to predict vocabulary growth, no significant associations were found for receptive vocabulary. In contrast, the interaction between initial English and Spanish expressive vocabulary scores was negatively related to growth in English expressive vocabulary. This cross-language association suggests that children who have low expressive vocabulary skills in both languages tend to grow faster in their English expressive vocabulary. The study extends previous work on dual language development by examining growth in expressive and receptive vocabulary in both English and Spanish. It also provides suggestions for future work to inform a more comprehensive understanding of DLL children's development in both languages.

  10. Cross-Language Associations in the Development of Preschoolers’ Receptive and Expressive Vocabulary

    Science.gov (United States)

    Maier, Michelle F.; Bohlmann, Natalie L.; Palacios, Natalia A.

    2016-01-01

    The increasing population of dual language learners (DLLs) entering preschool classrooms highlights a continued need for research on the development of dual language acquisition, and specifically vocabulary skills, in this age group. This study describes young DLL children's (N = 177) vocabulary development in both English and Spanish simultaneously, and how vocabulary skills in each language relate to one another, during a contextual shift that places greater emphasis on the acquisition of academic English language skills. Findings demonstrated that DLL preschoolers made gains in vocabulary in both languages with more change evidenced in receptive, in comparison to expressive, vocabulary as well as in English in comparison to Spanish. When examining whether children's vocabulary scores in one language at the beginning of preschool interact with their vocabulary scores in the other language to predict vocabulary growth, no significant associations were found for receptive vocabulary. In contrast, the interaction between initial English and Spanish expressive vocabulary scores was negatively related to growth in English expressive vocabulary. This cross-language association suggests that children who have low expressive vocabulary skills in both languages tend to grow faster in their English expressive vocabulary. The study extends previous work on dual language development by examining growth in expressive and receptive vocabulary in both English and Spanish. It also provides suggestions for future work to inform a more comprehensive understanding of DLL children's development in both languages. PMID:26807002

  11. Cross-language categorization of French and German vowels by naïve American listeners

    Science.gov (United States)

    Strange, Winifred; Levy, Erika S.; Law, Franzo F.

    2009-01-01

    American English (AE) speakers’ perceptual assimilation of 14 North German (NG) and 9 Parisian French (PF) vowels was examined in two studies using citation-form disyllables (study 1) and sentences with vowels surrounded by labial and alveolar consonants in multisyllabic nonsense words (study 2). Listeners categorized multiple tokens of each NG and PF vowel as most similar to selected AE vowels and rated their category “goodness” on a nine-point Likert scale. Front, rounded vowels were assimilated primarily to back AE vowels, despite their acoustic similarity to front AE vowels. In study 1, they were considered poorer exemplars of AE vowels than were NG and PF back, rounded vowels; in study 2, front and back, rounded vowels were perceived as similar to each other. Assimilation of some front, unrounded and back, rounded NG and PF vowels varied with language, speaking style, and consonantal context. Differences in perceived similarity often could not be predicted from context-specific cross-language spectral similarities. Results suggest that listeners can access context-specific, phonetic details when listening to citation-form materials, but assimilate non-native vowels on the basis of context-independent phonological equivalence categories when processing continuous speech. Results are interpreted within the Automatic Selective Perception model of speech perception. PMID:19739759

  12. Cross-language categorization of French and German vowels by naive American listeners.

    Science.gov (United States)

    Strange, Winifred; Levy, Erika S; Law, Franzo F

    2009-09-01

    American English (AE) speakers' perceptual assimilation of 14 North German (NG) and 9 Parisian French (PF) vowels was examined in two studies using citation-form disyllables (study 1) and sentences with vowels surrounded by labial and alveolar consonants in multisyllabic nonsense words (study 2). Listeners categorized multiple tokens of each NG and PF vowel as most similar to selected AE vowels and rated their category "goodness" on a nine-point Likert scale. Front, rounded vowels were assimilated primarily to back AE vowels, despite their acoustic similarity to front AE vowels. In study 1, they were considered poorer exemplars of AE vowels than were NG and PF back, rounded vowels; in study 2, front and back, rounded vowels were perceived as similar to each other. Assimilation of some front, unrounded and back, rounded NG and PF vowels varied with language, speaking style, and consonantal context. Differences in perceived similarity often could not be predicted from context-specific cross-language spectral similarities. Results suggest that listeners can access context-specific, phonetic details when listening to citation-form materials, but assimilate non-native vowels on the basis of context-independent phonological equivalence categories when processing continuous speech. Results are interpreted within the Automatic Selective Perception model of speech perception.

  13. Methodological challenges of cross-language qualitative research with South Asian communities living in the UK

    Directory of Open Access Journals (Sweden)

    Manbinder S. Sidhu

    2016-05-01

    Full Text Available Objective: We investigate (1) the influence of ethnic, gender, and age concordance with interviewers and (2) how expression of qualitative data varies between interviews delivered in English and community languages (Punjabi/Urdu) with monolingual and bilingual participants across three generations of the Indian Sikh and Pakistani Muslim communities living in the UK. Methods: We analyzed and interpreted semi-structured interview transcripts that were designed to collect data about lifestyles, disease management, community practices/beliefs, and social networks. First, qualitative content analysis was applied to transcripts. Second, a framework was applied as a guide to identify cross-language illustrations where responses varied in length, expression and depth. Results: Participant responses differed by language and topic. First-generation migrants when discussing religion, culture, or family practice were far likelier to use group or community narratives and give a longer response, indicating familiarity with or importance of such issues. Ethnic and gender concordance generated greater rapport between researchers and participants centered on community values and practices. Further, open-ended questions that were less direct were better suited for first-generation migrants. Conclusion: Community-based researchers need more time to complete interviews in second languages, need to acknowledge that narratives can be contextualized in both personal and community views, and reframe questions that may lead to greater expression. Furthermore, we detail a number of recommendations with regard to validating the translation of interviews from community languages to English as well as measures for testing language proficiency.

  14. A multi-level differential item functioning analysis of trends in international mathematics and science study: Potential sources of gender and minority difference among U.S. eighth graders' science achievement

    Science.gov (United States)

    Qian, Xiaoyu

    Science is an area where a large achievement gap has been observed between White and minority, and between male and female students. The science minority gap has continued as indicated by the National Assessment of Educational Progress and the Trends in International Mathematics and Science Studies (TIMSS). TIMSS also shows a gender gap favoring males emerging at the eighth grade. Both gaps continue to be wider in the number of doctoral degrees and full professorships awarded (NSF, 2008). The current study investigated both minority and gender achievement gaps in science utilizing a multi-level differential item functioning (DIF) methodology (Kamata, 2001) within a fully Bayesian framework. All dichotomously coded items from the TIMSS 2007 science assessment at eighth grade were analyzed. Both gender DIF and minority DIF were studied. Multi-level models were employed to identify DIF items and sources of DIF at both student and teacher levels. The study found that several student variables were potential sources of achievement gaps. It was also found that gender DIF favoring male students was more noticeable in the content areas of physics and earth science than biology and chemistry. In terms of item type, the majority of these gender DIF items were multiple choice rather than constructed response items. Female students also performed less well on items requiring visual-spatial ability. Minority students performed significantly worse on physics and earth science items as well. A higher percentage of minority DIF items in earth science and biology were constructed response than multiple choice items, indicating that literacy may be the cause of minority DIF. Three-level model results suggested that some teacher variables may be the cause of DIF variations from teacher to teacher. It is essential for both middle school science teachers and science educators to find instructional methods that work more effectively to improve science achievement of both female and minority students.

  15. Further differentiating item and order information in semantic memory: students' recall of words from the "CU Fight Song", Harry Potter book titles, and Scooby Doo theme song.

    Science.gov (United States)

    Overstreet, Michael F; Healy, Alice F; Neath, Ian

    2017-01-01

    University of Colorado (CU) students were tested for both order and item information in their semantic memory for the "CU Fight Song". Following an earlier study by Overstreet and Healy [(2011). Item and order information in semantic memory: Students' retention of the "CU fight song" lyrics. Memory & Cognition, 39, 251-259. doi: 10.3758/s13421-010-0018-3 ], a symmetrical bow-shaped serial position function (with both primacy and recency advantages) was found for reconstructing the order of the nine lines in the song, whereas a function with no primacy advantage was found for recalling a missing word from each line. This difference between order and item information was found even though students filled in missing words without any alternatives provided and missing words came from the beginning, middle, or end of each line. Similar results were found for CU students' recall of the sequence of Harry Potter book titles and the lyrics of the Scooby Doo theme song. These findings strengthen the claim that the pronounced serial position function in semantic memory occurs largely because of the retention of order, rather than item, information.

  16. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    Science.gov (United States)

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
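    The idea of comparing groups "after being matched by the intended scale attribute" can be made concrete with a simplified contingency-table sketch. The code below computes the Mantel-Haenszel common odds ratio for a dichotomous item, stratified by a matching score; it is a simplification of the ordinal Mantel procedure named in the abstract, and the function names, group labels, and counts are invented for illustration.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    records: iterable of (group, matching_score, endorsed), where group is
    "ref" or "focal" and endorsed is 0 or 1. One 2x2 table is built per
    matching score, so groups are compared only at equal scale levels.
    """
    tables = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, score, endorsed in records:
        tables[score][group][0 if endorsed else 1] += 1

    numerator = denominator = 0.0
    for table in tables.values():
        a, b = table["ref"]    # reference group: endorsed, not endorsed
        c, d = table["focal"]  # focal group: endorsed, not endorsed
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no discordant cells")
    return numerator / denominator


def expand(group, score, yes, no):
    """Turn cell counts into individual records (hypothetical helper)."""
    return [(group, score, 1)] * yes + [(group, score, 0)] * no


# Invented counts for two score strata (not HADS data).
records = (
    expand("ref", 1, 6, 4) + expand("focal", 1, 3, 7)
    + expand("ref", 2, 8, 2) + expand("focal", 2, 5, 5)
)
odds_ratio = mantel_haenszel_odds_ratio(records)
print(round(odds_ratio, 3))
```

    A value near 1 would indicate no uniform DIF for the item; values well above or below 1 indicate that one group endorses the item more often than the other at the same matched score. As the abstract notes, such a magnitude should be read together with statistical significance and investigator judgement.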

  17. Why wait if you can switch? A short term testing effect in cross-language recognition.

    NARCIS (Netherlands)

    Verkoeijen, Peter; Bouwmeester, Samantha; Camp, Gino

    2018-01-01

    Taking a memory test after an initial study phase produces better long-term retention than restudying the items, a phenomenon known as the testing effect. We propose that this effect emerges because testing strengthens semantic features of items’ memory traces, whereas restudying strengthens surface

  18. Reading comprehension: differential item functioning analysis of a Cloze Test

    Directory of Open Access Journals (Sweden)

    Katya Luciane Oliveira

    2012-01-01

    Full Text Available The objectives of the present study were to investigate the fit of a Cloze test to the Rasch model and to evaluate differential item functioning (DIF) in relation to gender. The sample comprised 573 students from the 5th to 8th grades of state public elementary schools in the states of São Paulo and Minas Gerais. The Cloze test was administered collectively. The analysis of the instrument revealed a good fit to the Rasch model, and the items were answered according to the expected pattern, also showing good fit. Regarding DIF, only three items differentiated by gender. Based on the data, the results indicated a balance in the answers given by boys and girls.

  19. Developing Culturally Competent Health Knowledge: Issues of Data Analysis of Cross-Cultural, Cross-Language Qualitative Research

    Directory of Open Access Journals (Sweden)

    Jenny Hsin-Chun Tsai

    2004-12-01

    Full Text Available There is a growing awareness and interest in the development of culturally competent health knowledge. Drawing on experience using a qualitative approach to elicit information from Mandarin- or Cantonese-speaking participants for a colorectal cancer prevention study, the authors describe lessons learned through the analysis process. These lessons include benefits and drawbacks of the use of coders from the studied culture group, challenges posed by using translated data for analysis, and suitable analytic approaches and research methods for cross-cultural, cross-language qualitative research. The authors also discuss the implications of these lessons for the development of culturally competent health knowledge.

  20. The Measurement of Relevance Amount of Documents That By Using of Google cross-language retrieval About Agriculture Subject Area are Retrieved

    Directory of Open Access Journals (Sweden)

    Fatemeh Jamshidi Ghahfarokhi

    2014-02-01

    Full Text Available This study investigated the relevance of documents about the agriculture subject area retrieved with Google's cross-language retrieval tools. For this purpose, Persian phrases and subject terms with their English equivalents were extracted from Persian journal articles that had English abstracts. Thirty phrases and subject terms from the agriculture area were extracted in three classes: first, subject phrases that are used only in agriculture; second, agriculture subject terms that are also used in other fields; third, agriculture subject terms that are considered general terms outside this field. Documents were then searched using these phrases and terms, and the relevance of the search results was assessed. The results showed that Google's cross-language retrieval tools did not succeed in retrieving relevant documents about the agriculture subject area for two classes of phrases and terms: agriculture subject terms that are also used in other fields, and agriculture subject terms that are considered general terms outside agriculture. For subject phrases and terms used only in agriculture, the tools performed more satisfactorily than for the other two classes.

  1. Differential item functioning of the national student exam for psychology (ENADE) 2006

    Directory of Open Access Journals (Sweden)

    Ricardo Primi

    2010-12-01

    Full Text Available Part of the National Assessment of Institutions of Higher Education considers student performance through ENADE. In this article we performed a differential item functioning analysis of the ENADE psychology exam administered in 2006, trying to detect items with problems in measurement equivalence between freshman and senior students and between students from public and private institutions. We analyzed a sample of 26,613 freshmen and seniors representative of all the courses in the country. We used Rasch analysis and logistic regression to detect DIF. Eleven of the 30 items composing the test showed DIF. Two types of DIF were observed, one occurring in less discriminating items and the other in more discriminating items. The most relevant subgroup of items tends to favor students from public institutions. We also discuss the issue of a high discrimination parameter being an indicator of DIF.

  2. Classification of health webpages as expert and non expert with a reduced set of cross-language features.

    Science.gov (United States)

    Grabar, Natalia; Krivine, Sonia; Jaulent, Marie-Christine

    2007-10-11

    Making the distinction between expert and non expert health documents can help users to select the information which is more suitable for them, according to whether they are familiar or not with medical terminology. This issue is particularly important for the information retrieval area. In our work we address this purpose through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that this distinction can be performed on the basis of a small number of features and that such features can be language and domain independent. The used features were acquired in source corpus (Russian language, diabetes topic) and then tested on target (French language, pneumology topic) and source corpora. These cross-language features show 90% precision and 93% recall with non expert documents in source language; and 85% precision and 74% recall with expert documents in target language.

  3. How to "Save Your Skin" When Processing L2 Idioms: An Eye Movement Analysis of Idiom Transparency and Cross-Language Similarity among Bilinguals

    Science.gov (United States)

    Cieslicka, Anna B.; Heredia, Roberto R.

    2017-01-01

    The current study looks at whether bilinguals varying in language dominance show a processing advantage for idiomatic over non-idiomatic phrases and to what extent this effect is modulated by idiom transparency (i.e., the degree to which the idiom's figurative meaning can be inferred from its literal analysis) and cross-language similarity (i.e.,…

  4. Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot

    Science.gov (United States)

    Magis, David; Facon, Bruno

    2013-01-01

    Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…
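
    As a rough illustration of the procedure this record discusses, the following Python sketch computes Angoff delta scores, fits the major axis, and optionally purifies it by refitting without the currently flagged items. The function names, the 1.5 distance threshold, and the iteration cap are assumptions chosen for illustration (the threshold is the conventional one for the delta plot); the sketch is not the authors' implementation and assumes the two delta vectors have nonzero covariance.

```python
import math
from statistics import NormalDist, mean

def delta_scores(props):
    """Angoff delta: 13 - 4 * z(p), so harder items get larger deltas."""
    return [13.0 - 4.0 * NormalDist().inv_cdf(p) for p in props]

def major_axis(x, y):
    """Slope and intercept of the principal (major) axis through the points."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # assumed nonzero
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx

def delta_plot_dif(p_ref, p_foc, threshold=1.5, purify=True, max_iter=20):
    """Return indices of items whose distance to the major axis exceeds threshold."""
    x, y = delta_scores(p_ref), delta_scores(p_foc)
    flagged = set()
    for _ in range(max_iter):
        # Fit the axis on items not currently flagged (all items on the first pass).
        keep = [i for i in range(len(x)) if i not in flagged]
        slope, intercept = major_axis([x[i] for i in keep], [y[i] for i in keep])
        norm = math.sqrt(slope ** 2 + 1.0)
        new_flagged = {i for i in range(len(x))
                       if abs(slope * x[i] + intercept - y[i]) / norm > threshold}
        if not purify or new_flagged == flagged:
            return new_flagged
        flagged = new_flagged
    return flagged  # iteration cap reached without a stable set

# Hypothetical proportions correct; the last item is much harder for the focal group.
p_reference = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.5]
p_focal = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.1]
flagged_items = delta_plot_dif(p_reference, p_focal)
```

    Calling the function with purify=False gives the single-pass result, which makes it easy to compare the two variants on the same data.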

  5. Assessing normative cut points through differential item functioning analysis: An example from the adaptation of the Middlesex Elderly Assessment of Mental State (MEAMS) for use as a cognitive screening test in Turkey

    Directory of Open Access Journals (Sweden)

    Kutlay Sehim

    2006-03-01

    Background: The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for the Turkish adult population. Methods: After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Results: Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. Conclusion: The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.

  6. Assessing normative cut points through differential item functioning analysis: an example from the adaptation of the Middlesex Elderly Assessment of Mental State (MEAMS) for use as a cognitive screening test in Turkey.

    Science.gov (United States)

    Tennant, Alan; Küçükdeveci, Ayse A; Kutlay, Sehim; Elhan, Atilla H

    2006-03-23

    The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for Turkish adult population. After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.

  7. Cross-Language Modulation of Visual Attention Span: An Arabic-French-Spanish Comparison in Skilled Adult Readers.

    Science.gov (United States)

    Awadh, Faris H R; Phénix, Thierry; Antzaka, Alexia; Lallier, Marie; Carreiras, Manuel; Valdois, Sylviane

    2016-01-01

    In delineating the amount of orthographic information that can be processed in parallel during a single fixation, the visual attention (VA) span acts as a key component of the reading system. Previous studies focused on the contribution of VA span to normal and pathological reading in monolingual and bilingual children from different European languages, without direct cross-language comparison. In the current paper, we explored modulations of VA span abilities in three languages -French, Spanish, and Arabic- that differ in transparency, reading direction and writing systems. The participants were skilled adult readers who were native speakers of French, Spanish or Arabic. They were administered tasks of global and partial letter report, single letter identification and text reading. Their VA span abilities were assessed using tasks that require the processing of briefly presented five consonant strings (e.g., R S H F T). All five consonants had to be reported in global report but a single cued letter in partial report. Results showed that VA span was reduced in Arabic readers as compared to French or Spanish readers who otherwise show a similar high performance in the two report tasks. The analysis of VA span response patterns in global report showed a left-right asymmetry in all three languages. A leftward letter advantage was found in French and Spanish but a rightward advantage in Arabic. The response patterns were symmetric in partial report, regardless of the language. Last, a significant relationship was found between VA span abilities and reading speed but only for French. The overall findings suggest that the size of VA span, the shape of VA span response patterns and the VA Span-reading relationship are modulated by language-specific features.

  8. Cross-language modulation of visual attention span: An Arabic-French-Spanish comparison in adult skilled readers.

    Directory of Open Access Journals (Sweden)

    Faris Haroon Rasheed Awadh

    2016-03-01

    In delineating the amount of orthographic information that can be processed in parallel during a single fixation, the visual attention (VA) span acts as a key component of the reading system. Previous studies focused on the contribution of VA span to normal and pathological reading in monolingual and bilingual children from different European languages, without direct cross-language comparison. In the current paper, we explored modulations of VA span abilities in three languages, French, Spanish and Arabic, that differ in transparency, reading direction and writing systems. The participants were adult skilled readers who were native speakers of French, Spanish or Arabic. They were administered tasks of global and partial letter report, single letter identification and text reading. Their VA span abilities were assessed using tasks that require the processing of briefly presented five-consonant strings (e.g., R S H F T). All five consonants had to be reported in global report but a single cued letter in partial report. Results showed that the VA span was reduced in Arabic readers as compared to French or Spanish readers who otherwise show a similar high performance in the two report tasks. The analysis of VA span response patterns in global report showed a left-right asymmetry in all three languages. A leftward letter advantage was found in French and Spanish but a rightward advantage in Arabic. The response patterns were symmetric in partial report, regardless of the language. Last, a significant relationship was found between visual attention span abilities and reading speed but only for French. The overall findings suggest that the size of VA span, the shape of VA span response patterns and the VA Span-reading relationship are modulated by language-specific features.

  9. Cross-language psycholinguistics

    NARCIS (Netherlands)

    Cutler, A.

    1985-01-01

    Cross-linguistic research can be of value to psycholinguistics by allowing tests of hypotheses the testing of which would be severely confounded in a single language, and by providing simple and readily available control conditions. For a long time the resources of this kind of research were

  10. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  11. The Theory of Adaptive Dispersion and Acoustic-phonetic Properties of Cross-language Lexical-tone Systems

    Science.gov (United States)

    Alexander, Jennifer Alexandra

    Lexical-tone languages use fundamental frequency (F0/pitch) to convey word meaning. About 41.8% of the world's languages use lexical tone (Maddieson, 2008), yet those systems are under-studied. I aim to increase our understanding of speech-sound inventory organization by extending to tone-systems a model of vowel-system organization, the Theory of Adaptive Dispersion (TAD) (Liljencrants and Lindblom, 1972). This is a cross-language investigation of whether and how the size of a tonal inventory affects (A) acoustic tone-space size and (B) dispersion of tone categories within the tone-space. I compared five languages with very different tone inventories: Cantonese (3 contour, 3 level tones); Mandarin (3 contour, 1 level tone); Thai (2 contour, 3 level tones); Yoruba (3 level tones only); and Igbo (2 level tones only). Six native speakers (3 female) of each language produced 18 CV syllables in isolation, with each of his/her language's tones, six times. I measured tonal F0 across the vowel at onset, midpoint, and offglide. Tone-space size was the F0 difference in semitones (ST) between each language's highest and lowest tones. Tone dispersion was the F0 distance (ST) between two tones shared by multiple languages. Following the TAD, I predicted that languages with larger tone inventories would have larger tone-spaces. Against expectations, tone-space size was fixed across level-tone languages at midpoint and offglide, and across contour-tone languages (except Thai) at offglide. However, within each language type (level-tone vs. contour-tone), languages with smaller tone inventories had larger tone spaces at onset. Tone-dispersion results were also unexpected. The Cantonese mid-level tone was further dispersed from a tonal baseline than the Yoruba mid-level tone; Cantonese mid-level tone dispersion was therefore greater than theoretically necessary. 
    The Cantonese high-level tone was also further dispersed from baseline than the Mandarin high-level tone -- at midpoint
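
    The semitone measure used in this record for tone-space size and tone dispersion follows the standard conversion of a frequency ratio, 12 times the base-2 logarithm of the ratio. A minimal sketch is given below; the function name and the F0 values are hypothetical and are not taken from the study.

```python
import math

def semitones(f_high_hz, f_low_hz):
    """Distance between two F0 values in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)

# Hypothetical speaker whose highest tone averages 220 Hz and lowest 110 Hz:
# the ratio is exactly one octave, so the tone space spans 12 semitones.
tone_space = semitones(220.0, 110.0)
```

    Working in semitones rather than hertz makes distances comparable across speakers with different pitch ranges, since equal frequency ratios map to equal distances.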

  12. A review of the effects on IRT item parameter estimates with a focus on misbehaving common items in test equating

    Directory of Open Access Journals (Sweden)

    Michalis P Michaelides

    2010-10-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of Item Response Theory. Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  13. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

    Science.gov (United States)

    Michaelides, Michalis P

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  14. How to Save Your Skin When Processing L2 Idioms: An Eye Movement Analysis of Idiom Transparency and Cross-language Similarity among Bilinguals

    Directory of Open Access Journals (Sweden)

    Anna Cieślicka

    2017-10-01

    The current study looks at whether bilinguals varying in language dominance show a processing advantage for idiomatic over non-idiomatic phrases and to what extent this effect is modulated by idiom transparency (i.e., the degree to which the idiom's figurative meaning can be inferred from its literal analysis) and cross-language similarity (i.e., the extent to which an idiom has an identical translation equivalent in another language). An eye tracking experiment was conducted in which Spanish-English bilinguals were presented with literally plausible (i.e., idioms that can be interpreted both figuratively and literally) transparent (e.g., break the ice, where the figurative meaning can be deduced from analyzing the idiom literally) and opaque idioms (e.g., hit the sack, where the meaning cannot be inferred from idiom constituents). Idioms varied along the dimension of cross-language similarity, with half the idioms having word for word translation equivalents in English and Spanish and another half being different, that is, having no similar counterpart in another language. Each idiom was used either in its literal (e.g., get cold feet: become cold) or figurative meaning (e.g., get cold feet: become afraid). In control phrases the last word of the idiom was replaced by a carefully matched control (e.g., get cold hands). Reading measures (fixation count, first pass/gaze reading time and total reading time) revealed that cross-language similarity interacts in an important way with idiom transparency, such that opaque idioms were more difficult to process than transparent ones, and different transparent idioms were processed faster than similar transparent idioms. Results are discussed with regard to the holistic vs. compositional views of idiom storage and the role of activated L1 (first language) knowledge in the course of L2 (second language) figurative processing.

  15. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  16. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...... that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  17. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  18. Spare Items validation

    International Nuclear Information System (INIS)

    Fernandez Carratala, L.

    1998-01-01

    It is increasingly difficult to purchase safety-related spare items with manufacturer certifications that maintain the original qualification of the destination equipment. The main reasons are, on top of the natural evolution of technology applied to newly manufactured components, the closure of nuclear-specific production lines and the shift of manufacturers' quality systems from nuclear codes and standards to conventional industry standards. To face this problem, different dedication processes have been implemented for many years to verify whether a commercial-grade element is acceptable for use in safety-related applications. In the same way, owing to its particular position regarding spare part supplies, which come mainly from markets other than the American one, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are performed on the spare item itself, its design control, its fabrication and its supply, to allow its use in destinations with specific requirements. The scope of application covers not only safety-related items but also complex-design, high-cost or plant-reliability-related components. The implementation at C.N. Trillo has been mainly carried out by merging, modifying and making the most of processes and activities which were already being performed in the company. (Author)

  19. Selecting Lower Priced Items.

    Science.gov (United States)

    Kleinert, Harold L.; And Others

    1988-01-01

    A program used to teach moderately to severely mentally handicapped students to select the lower priced items in actual shopping activities is described. Through a five-phase process, students are taught to compare prices themselves as well as take into consideration variations in the sizes of containers and varying product weights. (VW)

  20. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  1. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
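
    The unimodality condition quoted in this record can be checked numerically. The sketch below assumes the standard partial credit model with unit discrimination, for which item information equals the variance of the item score, and counts local maxima of the information function on a grid. The function names, grid width, and step size are illustrative choices, not taken from the paper.

```python
import math

def pcm_probs(theta, d1, d2):
    """Category probabilities (scores 0, 1, 2) of a trinary partial credit item."""
    logits = [0.0, theta - d1, 2.0 * theta - d1 - d2]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def item_information(theta, d1, d2):
    """Information of a unit-slope PCM item: the variance of the item score."""
    p = pcm_probs(theta, d1, d2)
    expected = p[1] + 2.0 * p[2]
    return p[1] + 4.0 * p[2] - expected ** 2

def count_modes(d1, d2, half_width=8.0, step=0.01):
    """Count strict local maxima of the information function on a grid."""
    center = 0.5 * (d1 + d2)
    n = int(round(half_width / step))
    info = [item_information(center + k * step, d1, d2) for k in range(-n, n + 1)]
    return sum(1 for i in range(1, len(info) - 1)
               if info[i - 1] < info[i] > info[i + 1])

threshold = 4.0 * math.log(2.0)      # about 2.77
modes_below = count_modes(0.0, 2.0)  # step difference 2.0, below the threshold
modes_above = count_modes(0.0, 4.0)  # step difference 4.0, above the threshold
```

    A grid search is only a numerical check; step differences very close to 4 ln 2 would need a finer grid or an analytic argument to classify reliably.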

  2. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  3. Examination of the PROMIS upper extremity item bank.

    Science.gov (United States)

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  4. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  5. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (kernel smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to know whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
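
    The kernel-smoothing idea behind nonparametric item characteristic curves can be sketched with a Nadaraya-Watson estimator: the endorsement probability at a trait level is a kernel-weighted average of the observed responses. This is a simplified illustration, not the TestGraf algorithm; the Gaussian kernel, the bandwidth, the function name, and the toy data are all assumptions.

```python
import math

def kernel_icc(trait, responses, grid, bandwidth=0.3):
    """Nadaraya-Watson estimate of P(endorse | trait) at each grid point."""
    curve = []
    for point in grid:
        # Gaussian kernel weights: respondents near `point` count the most.
        weights = [math.exp(-0.5 * ((point - t) / bandwidth) ** 2) for t in trait]
        total = sum(weights)  # assumes the grid stays within the data range
        curve.append(sum(w * y for w, y in zip(weights, responses)) / total)
    return curve

# Toy data: standardized trait estimates and 0/1 responses to one item.
trait_levels = [-2.0, -1.0, 0.0, 1.0, 2.0]
item_responses = [0, 0, 1, 1, 1]
icc = kernel_icc(trait_levels, item_responses, grid=[-2.0, 0.0, 2.0])
```

    Running the same estimator separately for two groups and comparing the resulting curves is one informal way to look for differential item functioning.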

  6. Cross-language activation of morphological relatives in cognates: The role of orthographic overlap and task-related processing

    Directory of Open Access Journals (Sweden)

    Kimberley Mulder

    2015-02-01

    We considered the role of orthography and task-related processing mechanisms in the activation of morphologically related complex words during bilingual word processing. So far, it has only been shown that such morphologically related words (i.e., morphological family members) are activated through the semantic and morphological overlap they share with the target word. In this study, we investigated family size effects in Dutch-English identical cognates (e.g., tent in both languages), non-identical cognates (e.g., pil and pill, in English and Dutch, respectively), and non-cognates (e.g., chicken in English). Because of their cross-linguistic overlap in orthography, reading a cognate can result in activation of family members in both languages. Cognates are therefore well-suited for studying mechanisms underlying bilingual activation of morphologically complex words. We investigated family size effects in an English lexical decision task and a Dutch-English language decision task, both performed by Dutch-English bilinguals. English lexical decision showed a facilitatory effect of English and Dutch family size on the processing of English-Dutch cognates relative to English non-cognates. These family size effects were not dependent on cognate type. In contrast, for language decision, in which a bilingual context is created, Dutch and English family size effects were inhibitory. Here, the combined family size of both languages turned out to better predict reaction time than the separate family size in Dutch or English. Moreover, the combined family size interacted with cognate type: the response to identical cognates was slowed by morphological family members in both languages. We conclude that (1) family size effects are sensitive to the task performed on the lexical items, and (2) depend on both semantic and formal aspects of bilingual word processing. We discuss various mechanisms that can explain the observed family size effects in a spreading

  7. Cross-language activation of morphological relatives in cognates: the role of orthographic overlap and task-related processing

    Science.gov (United States)

    Mulder, Kimberley; Dijkstra, Ton; Baayen, R. Harald

    2015-01-01

    We considered the role of orthography and task-related processing mechanisms in the activation of morphologically related complex words during bilingual word processing. So far, it has only been shown that such morphologically related words (i.e., morphological family members) are activated through the semantic and morphological overlap they share with the target word. In this study, we investigated family size effects in Dutch-English identical cognates (e.g., tent in both languages), non-identical cognates (e.g., pil and pill, in English and Dutch, respectively), and non-cognates (e.g., chicken in English). Because of their cross-linguistic overlap in orthography, reading a cognate can result in activation of family members in both languages. Cognates are therefore well-suited for studying mechanisms underlying bilingual activation of morphologically complex words. We investigated family size effects in an English lexical decision task and a Dutch-English language decision task, both performed by Dutch-English bilinguals. English lexical decision showed a facilitatory effect of English and Dutch family size on the processing of English-Dutch cognates relative to English non-cognates. These family size effects were not dependent on cognate type. In contrast, for language decision, in which a bilingual context is created, Dutch and English family size effects were inhibitory. Here, the combined family size of both languages turned out to better predict reaction time than the separate family size in Dutch or English. Moreover, the combined family size interacted with cognate type: the response to identical cognates was slowed by morphological family members in both languages. We conclude that (1) family size effects are sensitive to the task performed on the lexical items, and (2) depend on both semantic and formal aspects of bilingual word processing. We discuss various mechanisms that can explain the observed family size effects in a spreading activation framework

  8. Item bias detection in the Hospital Anxiety and Depression Scale using structural equation modeling: comparison with other item bias detection methods

    NARCIS (Netherlands)

    Verdam, M.G.E.; Oort, F.J.; Sprangers, M.A.G.

    Purpose Comparison of patient-reported outcomes may be invalidated by the occurrence of item bias, also known as differential item functioning. We show two ways of using structural equation modeling (SEM) to detect item bias: (1) multigroup SEM, which enables the detection of both uniform and

  9. Cross-Language Transfer of Word Reading Accuracy and Word Reading Fluency in Spanish-English and Chinese-English Bilinguals: Script-Universal and Script-Specific Processes

    Science.gov (United States)

    Pasquarella, Adrian; Chen, Xi; Gottardo, Alexandra; Geva, Esther

    2015-01-01

    This study examined cross-language transfer of word reading accuracy and word reading fluency in Spanish-English and Chinese-English bilinguals. Participants included 51 Spanish-English and 64 Chinese-English bilinguals. Both groups of children completed parallel measures of phonological awareness, rapid automatized naming, word reading accuracy,…

  10. Differential Item Functioning Analyses in Large-Scale Educational Surveys: Key Concepts and Modeling Approaches for Secondary Analysts

    Directory of Open Access Journals (Sweden)

    朱小姝 Xiao-Shu Zhu

    2011-03-01

    Many educational surveys employ a multi-stage sampling design for students, which makes use of stratification and/or clustering of population units, as well as a complex booklet design for items from an item pool. In these surveys, the reliable detection of item bias or differential item functioning (DIF) across student groups is a key component for ensuring fair representations of different student groups. In this paper, we describe several modeling approaches that can be useful for detecting DIF in educational surveys. We illustrate the key ideas by investigating the performance of six hierarchical generalized linear models (HGLMs) using a small simulation study and by applying them to real data from the Trends in International Mathematics and Science Study (TIMSS), where we use them to investigate potential uniform gender DIF.

  11. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find item validity and the item discrimination index (D) calculated with a different formula for each. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between examinees' scores on a particular item and their scores on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. The two concepts may overlap, since both reflect how well the test measures examinees' ability. In this paper, example results of data processing on item validity and the item discrimination index are compared. The paper discusses whether one of the two indices can stand in for both or whether it is better to present both calculations in simple test analyses, especially in undergraduate theses that include test analyses.
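
    The two quantities contrasted in this abstract can be sketched side by side. The snippet below is a minimal illustration, not the authors' procedure: it assumes the discrimination index D is the upper-lower group difference with 27% tails (a common convention, not stated in the abstract), takes item validity to be the uncorrected item-total Pearson correlation, and uses a small made-up response matrix. The function names are hypothetical.

```python
from math import sqrt

def discrimination_index(item, total, fraction=0.27):
    """Upper-lower index D: proportion correct in the top-scoring group
    minus proportion correct in the bottom-scoring group."""
    order = sorted(range(len(total)), key=lambda i: total[i])
    k = max(1, round(fraction * len(total)))
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item[i] for i in upper) / k
    p_lower = sum(item[i] for i in lower) / k
    return p_upper - p_lower

def item_total_correlation(item, total):
    """Pearson correlation between a 0/1 item score and the total score
    (point-biserial). The total here still includes the item itself."""
    n = len(item)
    mx, my = sum(item) / n, sum(total) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(item, total))
    sxx = sum((x - mx) ** 2 for x in item)
    syy = sum((y - my) ** 2 for y in total)
    return sxy / sqrt(sxx * syy)

# Made-up 0/1 responses: rows are examinees, columns are items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]

for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    d = discrimination_index(item, totals)
    r = item_total_correlation(item, totals)
    print(f"item {j}: D = {d:.2f}, item-total r = {r:.2f}")
```

    Both statistics grow when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract points to. D uses only the tails of the score distribution, whereas the correlation uses every examinee.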

  12. Item response analysis on an examination in anesthesiology for medical students in Taiwan: A comparison of one- and two-parameter logistic models

    Directory of Open Access Journals (Sweden)

    Yu-Feng Huang

    2013-06-01

    Conclusion: Item response models are useful for medical test analyses and provide valuable information about model comparisons and identification of differential items other than test reliability, item difficulty, and examinee's ability.

  13. Dissociating the neural correlates of intra-item and inter-item working-memory binding.

    Directory of Open Access Journals (Sweden)

    Carinne Piekema

    BACKGROUND: Integration of information streams into a unitary representation is an important task of our cognitive system. Within working memory, the medial temporal lobe (MTL) has been conceptually linked to the maintenance of bound representations. In a previous fMRI study, we have shown that the MTL is indeed more active during working-memory maintenance of spatial associations as compared to non-spatial associations or single items. There are two explanations for this result: either the mere presence of the spatial component activates the MTL, or the MTL is recruited to bind associations between neurally non-overlapping representations. METHODOLOGY/PRINCIPAL FINDINGS: The current fMRI study investigates this issue further by directly comparing intrinsic intra-item binding (object/colour), extrinsic intra-item binding (object/location), and inter-item binding (object/object). The three binding conditions resulted in differential activation of brain regions. Specifically, we show that the MTL is important for establishing extrinsic intra-item associations and inter-item associations, in line with the notion that binding of information processed in different brain regions depends on the MTL. CONCLUSIONS/SIGNIFICANCE: Our findings indicate that different forms of working-memory binding rely on specific neural structures. In addition, these results extend previous reports indicating that the MTL is implicated in working-memory maintenance, challenging the classic distinction between short-term and long-term memory systems.

  14. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  15. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made.

  16. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  17. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  18. Examining Differential Math Performance by Gender and Opportunity to Learn

    Science.gov (United States)

    Albano, Anthony D.; Rodriguez, Michael C.

    2013-01-01

    Although a substantial amount of research has been conducted on differential item functioning in testing, studies have focused on detecting differential item functioning rather than on explaining how or why it may occur. Some recent work has explored sources of differential functioning using explanatory and multilevel item response models. This…

  19. Evaluating the quality of medical multiple-choice items created with automated processes.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis

    2013-07-01

    Computerised assessment raises formidable challenges because it requires large numbers of test items. Automatic item generation (AIG) can help address this test development problem because it yields large numbers of new items both quickly and efficiently. To date, however, the quality of the items produced using a generative approach has not been evaluated. The purpose of this study was to determine whether automatic processes yield items that meet standards of quality that are appropriate for medical testing. Quality was evaluated firstly by subjecting items created using both AIG and traditional processes to rating by a four-member expert medical panel using indicators of multiple-choice item quality, and secondly by asking the panellists to identify which items were developed using AIG in a blind review. Fifteen items from the domain of therapeutics were created in three different experimental test development conditions. The first 15 items were created by content specialists using traditional test development methods (Group 1 Traditional). The second 15 items were created by the same content specialists using AIG methods (Group 1 AIG). The third 15 items were created by a new group of content specialists using traditional methods (Group 2 Traditional). These 45 items were then evaluated for quality by a four-member panel of medical experts and were subsequently categorised as either Traditional or AIG items. Three outcomes were reported: (i) the items produced using traditional and AIG processes were comparable on seven of eight indicators of multiple-choice item quality; (ii) AIG items can be differentiated from Traditional items by the quality of their distractors, and (iii) the overall predictive accuracy of the four expert medical panellists was 42%. Items generated by AIG methods are, for the most part, equivalent to traditionally developed items from the perspective of expert medical reviewers. While the AIG method produced comparatively fewer plausible

  20. The DIF Identification in Constructed Response Items Using Partial Credit Model

    OpenAIRE

    Heri Retnawati

    2017-01-01

    The study aimed to identify the load, the type, and the significance of differential item functioning (DIF) in constructed-response items using the partial credit model (PCM). The data were the test instruments and the responses to PISA-like test items completed by 386 ninth-grade students and 460 tenth-grade students, all about 15 years old, in the Province of Yogyakarta Special Region in Indonesia. The analysis toward the item characteris...

  1. The processing of inter-item relations as a moderating factor of retrieval-induced forgetting

    OpenAIRE

    Tempel, Tobias; Wippich, Werner

    2012-01-01

    We investigated influences of item generation and emotional valence on retrieval-induced forgetting. Drawing on postulates of the three-factor theory of generation effects, generation tasks differentially affecting the processing of inter-item relations were applied. Whereas retrieval-induced forgetting of freely generated items was moderated by the emotional valence as well as retrieval-induced forgetting of read items, even though in the reverse direction (Experiment 1), fragment completion...

  2. Improving measurement of injection drug risk behavior using item response theory.

    Science.gov (United States)

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. To explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine for potential gender-based differential item functioning. Data is used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.
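
    The logistic regression approach to differential item functioning mentioned in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's analysis: the data are simulated, the matching variable is a rest score over eight made-up items, only uniform DIF is tested (a group main effect, no interaction term), and the small Newton-Raphson fitter and all names are hypothetical helpers written for this example.

```python
import math
import random

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.
    Returns (coefficients, log-likelihood)."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for xi, yi in zip(X, y):
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(d):
                grad[j] += (yi - p) * xi[j]
                for k in range(d):
                    hess[j][k] += p * (1 - p) * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        ll += yi * z - math.log(1 + math.exp(z))
    return beta, ll

# Simulated data (assumption): the studied item is 1 logit easier for group 1.
random.seed(0)
difficulties = [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5]
rows, y = [], []
for _ in range(2000):
    group = random.randint(0, 1)
    theta = random.gauss(0, 1)
    # Rest score: number correct on the other (DIF-free) items.
    rest = sum(random.random() < 1 / (1 + math.exp(-(theta - b))) for b in difficulties)
    p_item = 1 / (1 + math.exp(-(theta + 1.0 * group)))
    rows.append((rest, group))
    y.append(int(random.random() < p_item))

# Compact model: item ~ rest score. Augmented model: item ~ rest score + group.
b0, ll0 = fit_logistic([[1.0, r] for r, g in rows], y)
b1, ll1 = fit_logistic([[1.0, r, g] for r, g in rows], y)
lr = 2 * (ll1 - ll0)  # compare with the chi-square(1) critical value 3.84
print(f"group coefficient = {b1[2]:.2f}, likelihood-ratio statistic = {lr:.1f}")
```

    A likelihood-ratio statistic above 3.84 suggests that, among respondents with the same rest score, the two groups differ in their odds of endorsing the item. Adding a score-by-group interaction term to the augmented model would extend the sketch to non-uniform DIF.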

  3. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  4. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  5. Modeling differential item functioning with group-specific item parameters: A computerized adaptive testing application

    NARCIS (Netherlands)

    Makransky, Guido; Glas, Cornelis A.W.

    2013-01-01

    Many important decisions are made based on the results of tests administered under different conditions in the fields of educational and psychological testing. Inaccurate inferences are often made if the property of measurement invariance (MI) is not assessed across these conditions. The importance

  6. Evaluating the Mathematics Interest Inventory Using Item Response Theory: Differential Item Functioning across Gender and Ethnicities

    Science.gov (United States)

    Wei, Tianlan; Chesnut, Steven R.; Barnard-Brak, Lucy; Stevens, Tara; Olivárez, Arturo, Jr.

    2014-01-01

    As the United States has begun to lag behind other developed countries in performance on mathematics and science, researchers have sought to explain this with theories of teaching, knowledge, and motivation. We expand this examination by further analyzing a measure of interest that has been linked to student performance in mathematics and…

  7. Stereotype threat and differential item functioning : A critical assessment

    NARCIS (Netherlands)

    Flore, Paulette

    2018-01-01

    Does the performance of girls or women on mathematics tests deteriorate when they are confronted with gender stereotypes? Over the past two decades, psychologists in the Netherlands and abroad have tried to answer this question by means of experiments. In these experiments, a group of pupils

  8. 48 CFR 852.214-72 - Alternate item(s).

    Science.gov (United States)

    2010-10-01

    ... AND FORMS SOLICITATION PROVISIONS AND CONTRACT CLAUSES Texts of Provisions and Clauses 852.214-72... 2008) Bids on []* will be given equal consideration along with bids on []** and any such bids received... [].** * Contracting officer will insert an alternate item that is considered acceptable. ** Contracting officer will...

  9. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    Science.gov (United States)

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  10. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    Science.gov (United States)

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  11. Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

    Science.gov (United States)

    Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

    2018-02-02

    In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.

  12. Modelling sequentially scored item responses

    NARCIS (Netherlands)

    Akkermans, W.

    2000-01-01

    The sequential model can be used to describe the variable resulting from a sequential scoring process. In this paper two more item response models are investigated with respect to their suitability for sequential scoring: the partial credit model and the graded response model. The investigation is

  13. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  14. Brief Report: Checklist for Autism Spectrum Disorder--Most Discriminating Items for Diagnosing Autism

    Science.gov (United States)

    Mayes, Susan D.

    2018-01-01

    The smallest subset of items from the 30-item Checklist for Autism Spectrum Disorder (CASD) that differentiated 607 referred children (3-17 years) with and without autism with 100% accuracy was identified. This 6-item subset (CASD-Short Form) was cross-validated on an independent sample of 397 referred children (1-18 years) with and without autism…

  15. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

    NARCIS (Netherlands)

    Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

    2017-01-01

    Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study

  16. Factor Structure and Reliability of Test Items for Saudi Teacher Licence Assessment

    Science.gov (United States)

    Alsadaawi, Abdullah Saleh

    2017-01-01

    The Saudi National Assessment Centre administers the Computer Science Teacher Test for teacher certification. The aim of this study is to explore gender differences in candidates' scores, and investigate dimensionality, reliability, and differential item functioning using confirmatory factor analysis and item response theory. The confirmatory…

  17. Assessment of Preference for Edible and Leisure Items in Individuals with Dementia

    Science.gov (United States)

    Ortega, Javier Virues; Iwata, Brian A.; Nogales-Gonzalez, Celia; Frades, Belen

    2012-01-01

    We conducted 2 studies on reinforcer preference in patients with dementia. Results of preference assessments yielded differential selections by 14 participants. Unlike prior studies with individuals with intellectual disabilities, all participants showed a noticeable preference for leisure items over edible items. Results of a subsequent analysis…

  18. Gender Differences in Figural Matrices: The Moderating Role of Item Design Features

    Science.gov (United States)

    Arendasy, Martin E.; Sommer, Markus

    2012-01-01

    There is a heated debate on whether observed gender differences in some figural matrices in adults can be attributed to gender differences in inductive reasoning/G[subscript f] or differential item functioning and/or test bias. Based on previous studies we hypothesized that three specific item design features moderate the effect size of the gender…

  19. Psychometric Consequences of Subpopulation Item Parameter Drift

    Science.gov (United States)

    Huggins-Manley, Anne Corinne

    2017-01-01

    This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

  20. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a

  1. Application of Item Response Theory to Tests of Substance-related Associative Memory

    Science.gov (United States)

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
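
    For orientation, the 1PL and 2PL models named in this abstract are usually written in the following standard textbook form. The notation is an assumption made here, not taken from the abstract: θ_i is the latent trait of respondent i, b_j the difficulty (location) of item j, and a_j its discrimination.

```latex
% 2PL: each item j has its own discrimination a_j and difficulty b_j
P(X_{ij} = 1 \mid \theta_i)
  = \frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}}

% 1PL: the same form with a common discrimination, a_j = a for all items j
P(X_{ij} = 1 \mid \theta_i)
  = \frac{\exp\{a(\theta_i - b_j)\}}{1 + \exp\{a(\theta_i - b_j)\}}
```

    Because the 1PL model is the 2PL model with the discriminations constrained to be equal, the two are nested and can be compared directly when both are fitted to the same items.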

  2. Generalizability theory and item response theory

    OpenAIRE

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...

  3. Validation of a mobility item bank for older patients in primary care.

    Science.gov (United States)

    Cabrero-García, Julio; Ramos-Pichardo, Juan Diego; Muñoz-Mendoza, Carmen Luz; Cabañero-Martínez, María José; González-Llopis, Lorena; Reig-Ferrer, Abilio

    2012-12-05

    To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

  4. 跨語言資訊檢索:理論、技術與應用 | Cross-Language Information Retrieval: Theories and Technologies

    Directory of Open Access Journals (Sweden)

    陳信希 Hsin-Hsi Chen

    2002-04-01

    of the major characteristics of the network era. The trend toward information globalization has brought new challenges for information management. On the one hand, it is often necessary to share valuable resources on the web with users of different languages. On the other hand, it is also necessary for a user to utilize knowledge presented in a foreign language. This paper introduces the theories and technologies of cross-language information retrieval, which is at the core of multilingual information management. The basic concepts are presented in sequence, following the classification into query translation, document translation, and no translation. Some advanced concepts, such as translation ambiguity and target polysemy, as well as proper name transliteration, are also discussed. Performance evaluation is indispensable for improvement, so the paper also describes three worldwide IR evaluation campaigns: TREC, CLEF, and NTCIR.

  5. Teoria da Resposta ao Item | Teoría de la respuesta al ítem | Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is an old one, and many studies and proposed methods have been developed toward this goal. Prominent among them is Item Response Theory (IRT), which initially arose to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it considers each item individually rather than emphasizing total scores; the conclusions therefore depend not only on the test or questionnaire but on each item that composes it. This article aims to present this theory, which revolutionized measurement theory.

  6. A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents.

    Science.gov (United States)

    Erhart, M; Hagquist, C; Auquier, P; Rajmil, L; Power, M; Ravens-Sieberer, U

    2010-07-01

    This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.
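
    The two classical test theory statistics named in approach (A) can be sketched as follows. This is a minimal illustration under stated assumptions, not the KIDSCREEN analysis itself: the function names and the toy score matrix are hypothetical, and the formulas are the standard textbook definitions of Cronbach's alpha and the corrected item-total correlation.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(scores, j):
    """Correlation of item j with the sum of all *other* items."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson(item, rest)


# Hypothetical toy data: 5 respondents answering 4 items on a 1-5 scale.
toy = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(toy), 3))
print([round(corrected_item_total(toy, j), 3) for j in range(4)])
```

    In an alpha-maximizing item reduction, items with a low corrected item-total correlation would be the usual candidates for removal, since dropping them tends to raise alpha.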

  7. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.

  8. Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

    Science.gov (United States)

    Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

    2016-09-01

    The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analysis, CFA, and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA found a dominant first factor (eigenvalue = 24.34), and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.

  9. Concreteness effects in short-term memory: a test of the item-order hypothesis.

    Science.gov (United States)

    Roche, Jaclynn; Tolan, G Anne; Tehan, Gerald

    2011-12-01

    The following experiments explore word length and concreteness effects in short-term memory within an item-order processing framework. This framework asserts that order memory is better for those items that are relatively easy to process at the item level. However, words that are difficult to process benefit at the item level from the increased attention/resources applied to them. The prediction of the model is that differential item and order processing can be detected in episodic tasks that differ in the degree to which item or order memory are required by the task. The item-order account has been applied to the word length effect such that there is a short word advantage in serial recall but a long word advantage in item recognition. The current study considered the possibility that concreteness effects might be explained within the same framework. In two experiments, word length (Experiment 1) and concreteness (Experiment 2) are examined using forward serial recall, backward serial recall, and item recognition. The results for word length replicate previous studies showing the dissociation in item and order tasks. The same was not true for the concreteness effect. In all three tasks concrete words were better remembered than abstract words. The concreteness effect cannot be explained in terms of an item-order trade-off.

  10. The Effects of Item Format and Cognitive Domain on Students' Science Performance in TIMSS 2011

    Science.gov (United States)

    Liou, Pey-Yan; Bulut, Okan

    2017-12-01

    The purpose of this study was to examine eighth-grade students' science performance in terms of two test design components, item format, and cognitive domain. The portion of Taiwanese data came from the 2011 administration of the Trends in International Mathematics and Science Study (TIMSS), one of the major international large-scale assessments in science. The item difficulty analysis was initially applied to show the proportion of correct items. A regression-based cumulative link mixed modeling (CLMM) approach was further utilized to estimate the impact of item format, cognitive domain, and their interaction on the students' science scores. The results of the proportion-correct statistics showed that constructed-response items were more difficult than multiple-choice items, and that the reasoning cognitive domain items were more difficult compared to the items in the applying and knowing domains. In terms of the CLMM results, students tended to obtain higher scores when answering constructed-response items as well as items in the applying cognitive domain. When the two predictors and the interaction term were included together, the directions and magnitudes of the predictors on student science performance changed substantially. Plausible explanations for the complex nature of the effects of the two test-design predictors on student science performance are discussed. The results provide practical, empirical-based evidence for test developers, teachers, and stakeholders to be aware of the differential function of item format, cognitive domain, and their interaction in students' science performance.

  11. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    Science.gov (United States)

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N = 1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, fewer than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

  12. Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

    Science.gov (United States)

    Shou, Yiyun; Sellbom, Martin; Xu, Jing

    2018-05-01

    There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in the TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and U.S. samples. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants.

  13. Emergency Power For Critical Items

    Science.gov (United States)

    Young, William R.

    2009-07-01

    Natural disasters, such as hurricanes, floods, tornadoes, and tsunamis, are becoming a greater problem as climate change impacts our environment. Disasters, whether natural or man-made, destroy lives, homes, businesses, and the natural environment. Such disasters can happen with little or no warning, leaving hundreds or even thousands of people without medical services, potable water, sanitation, communications, and electrical services for up to several weeks. In our modern world, electricity has become a necessity. Modern building codes and new disaster-resistant building practices are reducing the damage to homes and businesses. Emergency gasoline and diesel generators are becoming commonplace for power outages. Generators need fuel, which may not be available after a disaster, but photovoltaic (solar-electric) systems supply electricity without petroleum fuel, as they are powered by the sun. Photovoltaic (PV) systems can provide electrical power for a home or business and can operate as utility-interactive or stand-alone systems with battery backup. Determining your critical load items and sizing the photovoltaic system for those critical items guarantees their operation in a disaster.

  14. An Investigation of Item Bias.

    Science.gov (United States)

    Cleary, T. Anne; Hilton, Thomas L.

    The purpose of this investigation was to determine whether the Preliminary Scholastic Aptitude Test presented a differential difficulty for racial and socioeconomic groups. The subjects were two groups totaling 1,410 Negro and White high school seniors in an integrated high school who had taken the test. They were divided into three socioeconomic…

  15. 5 CFR 591.212 - How does OPM select survey items?

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 1 2010-01-01 2010-01-01 false How does OPM select survey items? 591.212 Section 591.212 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS ALLOWANCES AND DIFFERENTIALS Cost-of-Living Allowance and Post Differential-Nonforeign Areas Cost-Of-Living...

  16. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
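
    The third stage described above, in which software combines content from an item model to produce items, can be sketched roughly as follows. This is an illustrative sketch only: the stem, the variable names, and the element values are hypothetical placeholders rather than content from the authors' surgery item model, and a real generator would also apply constraints from the cognitive model and produce answer options.

```python
from itertools import product

# Hypothetical item model: one stem with three variable slots.
STEM = ("A {age}-year-old patient reports {symptom} {timing} surgery. "
        "Which action should be taken first?")

# Hypothetical content elements that content specialists would supply.
ELEMENTS = {
    "age": [25, 45, 70],
    "symptom": ["fever", "shortness of breath"],
    "timing": ["one day after", "one week after"],
}


def generate_items(stem, elements):
    """Yield one item stem per combination of element values."""
    names = list(elements)
    value_lists = [elements[name] for name in names]
    for combo in product(*value_lists):
        yield stem.format(**dict(zip(names, combo)))


items = list(generate_items(STEM, ELEMENTS))
print(len(items))   # 3 * 2 * 2 = 12 generated stems
print(items[0])
```

    The number of generated items grows as the product of the element counts, which is how a single item model can yield hundreds or thousands of items.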

  17. A Balance Sheet for Educational Item Banking.

    Science.gov (United States)

    Hiscox, Michael D.

    Educational item banking presents observers with a considerable paradox. The development of test items from scratch is viewed as wasteful, a luxury in times of declining resources. On the other hand, item banking has failed to become a mature technology despite large amounts of money and the efforts of talented professionals. The question of which…

  18. 76 FR 60474 - Commercial Item Handbook

    Science.gov (United States)

    2011-09-29

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System Commercial Item Handbook AGENCY.... SUMMARY: DoD has updated its Commercial Item Handbook. The purpose of the Handbook is to help acquisition personnel develop sound business strategies for procuring commercial items. DoD is seeking industry input on...

  19. Towards an authoring system for item construction

    NARCIS (Netherlands)

    Rikers, Jos H.A.N.

    1988-01-01

    The process of writing test items is analyzed, and a blueprint is presented for an authoring system for test item writing to reduce invalidity and to structure the process of item writing. The developmental methodology is introduced, and the first steps in the process are reported. A historical

  20. Obtaining a Proportional Allocation by Deleting Items

    NARCIS (Netherlands)

    Dorn, B.; de Haan, R.; Schlotter, I.; Röthe, J.

    2017-01-01

    We consider the following control problem on fair allocation of indivisible goods. Given a set I of items and a set of agents, each having strict linear preference over the items, we ask for a minimum subset of the items whose deletion guarantees the existence of a proportional allocation in the

  1. Item Analysis in Introductory Economics Testing.

    Science.gov (United States)

    Tinari, Frank D.

    1979-01-01

    Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)

  2. New technologies for item monitoring

    International Nuclear Information System (INIS)

    Abbott, J.A.; Waddoups, I.G.

    1993-12-01

    This report responds to the Department of Energy's request that Sandia National Laboratories compare existing technologies against several advanced technologies as they apply to DOE needs to monitor the movement of material, weapons, or personnel for safety and security programs. The authors describe several material control systems, discuss their technologies, suggest possible applications, discuss assets and limitations, and project costs for each system. The following systems are described: WATCH system (Wireless Alarm Transmission of Container Handling); Tag system (an electrostatic proximity sensor); PANTRAK system (Personnel And Material Tracking); VRIS (Vault Remote Inventory System); VSIS (Vault Safety and Inventory System); AIMS (Authenticated Item Monitoring System); EIVS (Experimental Inventory Verification System); Metrox system (canister monitoring system); TCATS (Target Cueing And Tracking System); LGVSS (Light Grid Vault Surveillance System); CSS (Container Safeguards System); SAMMS (Security Alarm and Material Monitoring System); FOIDS (Fiber Optic Intelligence & Detection System); GRADS (Graded Radiation Detection System); and PINPAL (Physical Inventory Pallet).

  3. New technologies for item monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, J.A. [EG & G Energy Measurements, Albuquerque, NM (United States); Waddoups, I.G. [Sandia National Labs., Albuquerque, NM (United States)

    1993-12-01

    This report responds to the Department of Energy's request that Sandia National Laboratories compare existing technologies against several advanced technologies as they apply to DOE needs to monitor the movement of material, weapons, or personnel for safety and security programs. The authors describe several material control systems, discuss their technologies, suggest possible applications, discuss assets and limitations, and project costs for each system. The following systems are described: WATCH system (Wireless Alarm Transmission of Container Handling); Tag system (an electrostatic proximity sensor); PANTRAK system (Personnel And Material Tracking); VRIS (Vault Remote Inventory System); VSIS (Vault Safety and Inventory System); AIMS (Authenticated Item Monitoring System); EIVS (Experimental Inventory Verification System); Metrox system (canister monitoring system); TCATS (Target Cueing And Tracking System); LGVSS (Light Grid Vault Surveillance System); CSS (Container Safeguards System); SAMMS (Security Alarm and Material Monitoring System); FOIDS (Fiber Optic Intelligence & Detection System); GRADS (Graded Radiation Detection System); and PINPAL (Physical Inventory Pallet).

  4. Approximation Preserving Reductions among Item Pricing Problems

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

    When a store sells items to customers, it wishes to set the prices of the items to maximize its profit. Intuitively, if the store sells the items at low (resp. high) prices, customers buy more (resp. fewer) items, and either way the store earns less profit, so it is hard for the store to decide the prices. Assume that the store has a set $V$ of $n$ items and there is a set $E$ of $m$ customers who wish to buy those items, and also assume that each item $i \in V$ has production cost $d_i$ and each customer $e_j \in E$ has valuation $v_j$ for the bundle $e_j \subseteq V$ of items. When the store sells an item $i \in V$ at price $r_i$, the profit for item $i$ is $p_i = r_i - d_i$. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. Most previous work considered the item pricing problem under the assumption that $p_i \geq 0$ for each $i \in V$; however, Balcan et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of a "loss-leader" and showed that the seller can obtain more total profit when $p_i < 0$ is allowed than when it is not. In this paper, we derive approximation-preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratios.
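
    The profit model in this abstract can be made concrete with a small sketch. The buying rule is an assumption not stated in the abstract: each customer is taken to buy the whole bundle exactly when its total price does not exceed the customer's valuation. The function name and the two-item instance are hypothetical and only illustrate how a loss-leader price (negative profit on one item) can raise total profit.

```python
def total_profit(prices, costs, customers):
    """Total profit sum of (r_i - d_i) over all items in purchased bundles.

    prices, costs: dicts mapping item -> r_i and item -> d_i.
    customers: list of (bundle, valuation) pairs.
    Assumed rule: a customer buys the bundle iff its total price <= valuation.
    """
    profit = 0
    for bundle, valuation in customers:
        if sum(prices[i] for i in bundle) <= valuation:
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit


# Hypothetical instance: two items, two customers.
costs = {"a": 3, "b": 0}
customers = [({"a", "b"}, 5), ({"b"}, 4)]

# Every item sold at or above cost (p_i >= 0 for all i).
print(total_profit({"a": 3, "b": 2}, costs, customers))  # 4

# Item "a" priced below cost (p_a = -2), acting as a loss-leader.
print(total_profit({"a": 1, "b": 4}, costs, customers))  # 6
```

    In this toy instance, pricing item "a" below cost lets the store charge more for item "b" while still selling the bundle, so total profit rises from 4 to 6.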

  5. The Dif Identification in Constructed Response Items Using Partial Credit Model

    Directory of Open Access Journals (Sweden)

    Heri Retnawati

    2017-10-01

    The study aimed to identify the presence, type, and significance of differential item functioning (DIF) in constructed-response items using the partial credit model (PCM). The data were the test instruments and the students' responses to PISA-like test items completed by 386 ninth-grade and 460 tenth-grade students, all about 15 years old, in the Special Region of Yogyakarta, Indonesia. Item characteristics were estimated under the PCM with the CONQUEST software, with students grouped by grade. Using these item characteristics, the researcher drew category response function (CRF) graphs to determine whether the DIF was uniform or non-uniform. The significance of DIF was assessed by comparing the discrepancy between the difficulty parameters with the error reported in the CONQUEST output. Of the 18 items analyzed, 4 showed no DIF, 5 showed DIF that was not statistically significant, and 9 showed statistically significant DIF. The causes of DIF in these items are discussed.
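
    One common way to compare a difficulty discrepancy with its error is a simple z-type ratio, sketched below. This is an assumption about the general technique, not the procedure CONQUEST or the author actually used: the function names, the pooled standard error formula, the 1.96 cut-off, and the numbers are all hypothetical.

```python
from math import sqrt


def dif_statistic(b_ref, se_ref, b_focal, se_focal):
    """Return (difference, z) for one item's difficulty in two groups.

    Assumes the two difficulty estimates are independent, so the standard
    error of their difference is the root of the summed squared errors.
    """
    diff = b_focal - b_ref
    se_diff = sqrt(se_ref ** 2 + se_focal ** 2)
    return diff, diff / se_diff


def is_significant(z, critical=1.96):
    """Flag DIF when |z| exceeds the (assumed) two-sided 5% critical value."""
    return abs(z) > critical


# Hypothetical estimates for two items (ninth grade = reference group).
for name, args in {"item_1": (0.0, 0.3, 0.5, 0.4),
                   "item_2": (0.0, 0.3, 1.5, 0.4)}.items():
    diff, z = dif_statistic(*args)
    print(name, round(diff, 2), round(z, 2), is_significant(z))
```

    Under this sketch, an item can show a nonzero difficulty difference without being flagged, which matches the abstract's distinction between items containing DIF and items containing statistically significant DIF.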

  6. Item Modeling Concept Based on Multimedia Authoring

    Directory of Open Access Journals (Sweden)

    Janez Stergar

    2008-09-01

    This paper introduces a modern item design framework for computer-based assessment built on the Flash authoring environment. Question design is discussed, with emphasis on the multimedia authoring environment used for item modeling. Item type templates are a structured means of collecting and storing item information that can improve the efficiency and security of the innovative item design process. Templates can modernize item design and enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The item design template introduced here is based on a taxonomy of innovative items, which have great potential for expanding the content areas and construct coverage of an assessment. The approach is based on two GUIs: one for question design, built on the implemented item design templates, and one for tracking and retrieving user interaction. The paper discusses the concept of user interfaces based on Flash technology and the implementation of the item design forms with multimedia authoring. It also introduces a method for storing and retrieving user interaction based on PHP, which extends Flash capabilities in the proposed framework.

  7. Losing Items in the Psychogeriatric Nursing Home

    Directory of Open Access Journals (Sweden)

    J. van Hoof PhD

    2016-09-01

    Introduction: Losing items is a time-consuming occurrence in nursing homes that is ill described. An explorative study was conducted to investigate which items got lost by nursing home residents, and how this affects the residents and family caregivers. Method: Semi-structured interviews and card sorting tasks were conducted with 12 residents with early-stage dementia and 12 family caregivers. Thematic analysis was applied to the outcomes of the sessions. Results: The participants stated that numerous personal items and assistive devices get lost in the nursing home environment, which had various emotional, practical, and financial implications. Significant amounts of time are spent on trying to find items, varying from 1 hr up to a couple of weeks. Numerous potential solutions were identified by the interviewees. Discussion: Losing items often goes together with limitations to the participation of residents. Many family caregivers are reluctant to replace lost items, as these items may get lost again.

  8. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  9. Using item response theory to address vulnerabilities in FFQ.

    Science.gov (United States)

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  10. Converging evidence for control of color-word Stroop interference at the item level.

    Science.gov (United States)

    Bugg, Julie M; Hutchison, Keith A

    2013-04-01

    Prior studies have shown that cognitive control is implemented at the list and context levels in the color-word Stroop task. At first blush, the finding that Stroop interference is reduced for mostly incongruent items as compared with mostly congruent items (i.e., the item-specific proportion congruence [ISPC] effect) appears to provide evidence for yet a third level of control, which modulates word reading at the item level. However, evidence to date favors the view that ISPC effects reflect the rapid prediction of high-contingency responses and not item-specific control. In Experiment 1, we first show that an ISPC effect is obtained when the relevant dimension (i.e., color) signals proportion congruency, a problematic pattern for theories based on differential response contingencies. In Experiment 2, we replicate and extend this pattern by showing that item-specific control settings transfer to new stimuli, ruling out alternative frequency-based accounts. In Experiment 3, we revert to the traditional design in which the irrelevant dimension (i.e., word) signals proportion congruency. Evidence for item-specific control, including transfer of the ISPC effect to new stimuli, is apparent when 4-item sets are employed but not when 2-item sets are employed. We attribute this pattern to the absence of high-contingency responses on incongruent trials in the 4-item set. These novel findings provide converging evidence for reactive control of color-word Stroop interference at the item level, reveal theoretically important factors that modulate reliance on item-specific control versus contingency learning, and suggest an update to the item-specific control account (Bugg, Jacoby, & Chanani, 2011).

  11. Gender differences in National Assessment of Educational Progress science items: What does "I don't know" really mean?

    Science.gov (United States)

    Linn, Marcia C.; de Benedictis, Tina; Delucchi, Kevin; Harris, Abigail; Stage, Elizabeth

    The National Assessment of Educational Progress Science Assessment has consistently revealed small gender differences on science content items but not on science inquiry items. This assessment differs from others in that respondents can choose I don't know rather than guessing. This paper examines explanations for the gender differences including (a) differential prior instruction, (b) differential response to uncertainty and use of the I don't know response, (c) differential response to figurally presented items, and (d) different attitudes towards science. Of these possible explanations, the first two received support. Females are more likely to use the I don't know response, especially for items with physical science content or masculine themes such as football. To ameliorate this situation we need more effective science instruction and more gender-neutral assessment items.

  12. CERN Running Club – Sale of Items

    CERN Multimedia

    CERN Running club

    2018-01-01

    The CERN Running Club is organising a sale of items on 26 June from 11:30 – 13:00 in the entry area of Restaurant 2 (504 R-202). The items for sale are souvenir prizes of past Relay Races and comprise: Backpacks, thermos, towels, gloves & caps, lamps, long sleeve winter shirts and windproof vest. All items will be sold at 5 CHF.

  13. deltaPlotR: An R Package for Differential Item Functioning Analysis with Angoff's Delta Plot

    OpenAIRE

    David Magis; Bruno Facon

    2014-01-01

    Angoff's delta plot is a straightforward and not computationally intensive method to identify differential item functioning (DIF) among dichotomously scored items. This approach was recently improved by proposing an optimal threshold selection and by considering several item purification processes. Moreover, to support practical DIF analyses with the delta plot and these improvements, the R package deltaPlotR was also developed. The purpose of this paper is twofold: to outline the delta plot ...
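
    As a rough illustration of the delta plot idea named in this record, the sketch below converts proportions correct into delta scores with the usual transformation Δ = 4·Φ⁻¹(1 − p) + 13 and measures each item's perpendicular distance from the major axis of the scatter. It is not the deltaPlotR code; the function names, the proportions, and the fixed 1.5 cutoff (the classical rule of thumb, not the optimal threshold the abstract mentions) are assumptions made for the example.

```python
from statistics import NormalDist, mean
import math


def delta_score(p):
    """Angoff delta score for a proportion correct p (0 < p < 1)."""
    return 4.0 * NormalDist().inv_cdf(1.0 - p) + 13.0


def perpendicular_distances(x, y):
    """Signed distances of points (x, y) from the major axis of the scatter.

    Assumes the two delta vectors have a nonzero sample covariance.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = my - slope * mx
    norm = math.sqrt(slope ** 2 + 1.0)
    return [(slope * xi + intercept - yi) / norm for xi, yi in zip(x, y)]


# Hypothetical proportions correct for five items in two groups.
reference_p = [0.85, 0.70, 0.55, 0.40, 0.60]
focal_p = [0.80, 0.66, 0.50, 0.36, 0.30]

reference_delta = [delta_score(p) for p in reference_p]
focal_delta = [delta_score(p) for p in focal_p]
distances = perpendicular_distances(reference_delta, focal_delta)

# Items whose absolute distance exceeds the assumed cutoff are flagged.
flagged = [i for i, d in enumerate(distances) if abs(d) > 1.5]
```

    Harder items receive larger delta scores, so an item that is disproportionately hard for one group sits far from the axis along which the other items cluster.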

  14. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    Science.gov (United States)

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.

  15. Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. Common person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.

  16. The effects of value on context-item associative memory in younger and older adults.

    Science.gov (United States)

    Hennessee, Joseph P; Knowlton, Barbara J; Castel, Alan D

    2018-02-01

    Valuable items are often remembered better than items that are less valuable by both older and younger adults, but older adults typically show deficits in binding. Here, we examine whether value affects the quality of recognition memory and the binding of incidental details to valuable items. In Experiment 1, participants learned English words each associated with a point-value they earned for correct recognition with the goal of maximizing their score. In Experiment 2, value was manipulated by presenting items that were either congruent or incongruent with an imagined state of physiological need (e.g., hunger). In Experiment 1, point-value was associated with enhanced recollection in both age groups. Memory for the color associated with the word was in fact reduced for high-value recollected items compared with low-value recollected items, suggesting value selectively enhances binding of task-relevant details. In Experiment 2, memory for learned images was enhanced by value in both age groups. However, value differentially enhanced binding of an imagined context to the item in younger and older adults, with a strong trend for increased binding in younger adults only. These findings suggest that value enhances episodic encoding in both older and younger adults but that binding of associated details may be reduced for valuable items compared to less valuable items, particularly in older adults. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  17. The MIMIC Model as a Tool for Differential Bundle Functioning Detection

    Science.gov (United States)

    Finch, W. Holmes

    2012-01-01

    Increasingly, researchers interested in identifying potentially biased test items are encouraged to use a confirmatory, rather than exploratory, approach. One such method for confirmatory testing is rooted in differential bundle functioning (DBF), where hypotheses regarding potential differential item functioning (DIF) for sets of items (bundles)…

  18. Using Cochran's Z Statistic to Test the Kernel-Smoothed Item Response Function Differences between Focal and Reference Groups

    Science.gov (United States)

    Zheng, Yinggan; Gierl, Mark J.; Cui, Ying

    2010-01-01

    This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…

  19. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Martijn A H Oude Voshaar

    OBJECTIVE: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. METHODS: Data were collected from RA patients enrolled in the Dutch DREAM registry. An incomplete longitudinal anchored design was used where patients completed all 121 items of the item bank over the course of three waves of data collection. Item responses were fit to a generalized partial credit model adapted for longitudinal data and the item parameters were examined for differential item functioning (DIF) across country, age, and sex. RESULTS: In total, 690 patients participated in the study at time point 1 (T2, N = 489; T3, N = 311). The item bank could be successfully fitted to a generalized partial credit model, with the number of misfitting items falling within acceptable limits. Seven items demonstrated DIF for sex, while 5 items showed DIF for age in the Dutch RA sample. Twenty-five (20%) items were flagged for cross-cultural DIF compared to the US general population. However, the impact of observed DIF on total physical function estimates was negligible. DISCUSSION: The results of this study showed that the PROMIS PF item bank adequately fit a unidimensional IRT model which provides support for applications that require invariant estimates of physical function, such as computer adaptive testing and targeted short forms. More studies are needed to further investigate the cross-cultural applicability of the US-based PROMIS calibration and standardized metric.

  20. Calibration of the Dutch-Flemish PROMIS Pain Behavior item bank in patients with chronic pain.

    Science.gov (United States)

    Crins, M H P; Roorda, L D; Smits, N; de Vet, H C W; Westhovens, R; Cella, D; Cook, K F; Revicki, D; van Leeuwen, J; Boers, M; Dekker, J; Terwee, C B

    2016-02-01

    The aims of the current study were to calibrate the item parameters of the Dutch-Flemish PROMIS Pain Behavior item bank using a sample of Dutch patients with chronic pain and to evaluate cross-cultural validity between the Dutch-Flemish and the US PROMIS Pain Behavior item banks. Furthermore, reliability and construct validity of the Dutch-Flemish PROMIS Pain Behavior item bank were evaluated. The 39 items in the bank were completed by 1042 Dutch patients with chronic pain. To evaluate unidimensionality, a one-factor confirmatory factor analysis (CFA) was performed. A graded response model (GRM) was used to calibrate the items. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was evaluated. Reliability of the item bank was also examined and construct validity was studied using several legacy instruments, e.g. the Roland Morris Disability Questionnaire. CFA supported the unidimensionality of the Dutch-Flemish PROMIS Pain Behavior item bank (CFI = 0.960, TLI = 0.958), the data also fit the GRM, and demonstrated good coverage across the pain behavior construct (threshold parameters range: -3.42 to 3.54). Analysis showed good cross-cultural validity (only six DIF items), reliability (Cronbach's α = 0.95) and construct validity (all correlations ≥0.53). The Dutch-Flemish PROMIS Pain Behavior item bank was found to have good cross-cultural validity, reliability and construct validity. The development of the Dutch-Flemish PROMIS Pain Behavior item bank will serve as the basis for Dutch-Flemish PROMIS short forms and computer adaptive testing (CAT). © 2015 European Pain Federation - EFIC®

  1. Differential geometry and mathematical physics

    CERN Document Server

    Rudolph, Gerd

    Starting from an undergraduate level, this book systematically develops the basics of • Calculus on manifolds, vector bundles, vector fields and differential forms, • Lie groups and Lie group actions, • Linear symplectic algebra and symplectic geometry, • Hamiltonian systems, symmetries and reduction, integrable systems and Hamilton-Jacobi theory. The topics listed under the first item are relevant for virtually all areas of mathematical physics. The second and third items constitute the link between abstract calculus and the theory of Hamiltonian systems. The last item provides an introduction to various aspects of this theory, including Morse families, the Maslov class and caustics. The book guides the reader from elementary differential geometry to advanced topics in the theory of Hamiltonian systems with the aim of making current research literature accessible. The style is that of a mathematical textbook, with full proofs given in the text or as exercises. The material is illustrated by numerous d...

  2. 38 CFR 3.1606 - Transportation items.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Transportation items. 3... Burial Benefits § 3.1606 Transportation items. The transportation costs of those persons who come within... shipment. (6) Cost of transportation by common carrier including amounts paid as Federal taxes. (7) Cost of...

  3. Grouping of Items in Mobile Web Questionnaires

    Science.gov (United States)

    Mavletova, Aigul; Couper, Mick P.

    2016-01-01

    There is some evidence that a scrolling design may reduce breakoffs in mobile web surveys compared to a paging design, but there is little empirical evidence to guide the choice of the optimal number of items per page. We investigate the effect of the number of items presented on a page on data quality in two types of questionnaires: with or…

  4. Binomial test models and item difficulty

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1979-01-01

    In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally

  5. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...

  6. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.

  7. Item Information in the Rasch Model

    NARCIS (Netherlands)

    Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

    1988-01-01

    Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling

  8. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
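
    The logistic models with one, two and three parameters mentioned in this abstract can be sketched as a single function, since the one- and two-parameter forms are special cases of the three-parameter form. This is a minimal illustration in the standard textbook parameterization, not code from the paper; the function name and the parameter values are assumptions, and some texts multiply the exponent by a scaling constant D = 1.7, which is omitted here.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the logistic IRT models.

    theta: person ability; b: item difficulty;
    a: item discrimination (a = 1 gives the one-parameter form);
    c: lower asymptote or guessing parameter (c = 0 gives the two-parameter form).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Illustrative (hypothetical) values.
p_1pl = irt_probability(theta=0.0, b=0.0)                # ability equals difficulty
p_2pl = irt_probability(theta=1.0, b=0.0, a=2.0)         # steeper item
p_3pl = irt_probability(theta=0.0, b=0.0, a=1.0, c=0.2)  # guessing floor of 0.2
```

    With ability equal to difficulty, the one-parameter model gives a probability of 0.5, and adding a guessing parameter of 0.2 raises it to 0.6 because the curve is compressed into the interval from c to 1.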

  9. Software Note: Using BILOG for Fixed-Anchor Item Calibration

    Science.gov (United States)

    DeMars, Christine E.; Jurich, Daniel P.

    2012-01-01

    The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…

  10. The randomly renewed general item and the randomly inspected item with exponential life distribution

    International Nuclear Information System (INIS)

    Schneeweiss, W.G.

    1979-01-01

    For a randomly renewed item the probability distributions of the time to failure and of the duration of down time and the expectations of these random variables are determined. Moreover, it is shown that the same theory applies to randomly checked items with exponential probability distribution of life such as electronic items. The case of periodic renewals is treated as an example. (orig.)

  11. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  12. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  13. Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

    Science.gov (United States)

    Cher Wong, Cheow

    2015-01-01

    Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

  14. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    Science.gov (United States)

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  15. Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

    Science.gov (United States)

    Arce-Ferrer, Alvaro J.; Bulut, Okan

    2017-01-01

    This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…

  16. The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

    Science.gov (United States)

    Siskind, Theresa G.; Anderson, Lorin W.

    The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…

  17. Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.

    Science.gov (United States)

    Eichenbaum, Alexander E; Marcus, David K; French, Brian F

    2017-06-01

    This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.

  18. Local context effects during emotional item directed forgetting in younger and older adults.

    Science.gov (United States)

    Gallant, Sara N; Dyson, Benjamin J; Yang, Lixia

    2017-09-01

    This paper explored the differential sensitivity young and older adults exhibit to the local context of items entering memory. We examined trial-to-trial performance during an item directed forgetting task for positive, negative, and neutral (or baseline) words each cued as either to-be-remembered (TBR) or to-be-forgotten (TBF). This allowed us to focus on how variations in emotional valence (independent of arousal) and instruction (TBR vs. TBF) of the previous item (trial n-1) impacted memory for the current item (trial n) during encoding. Different from research showing impairing effects of emotional arousal, both age groups showed a memorial boost for stimuli when preceded by items high in positive or negative valence relative to those preceded by neutral items. This advantage was particularly prominent for neutral trial n items that followed emotional items suggesting that, regardless of age, neutral memories may be strengthened by a local context that is high in valence. A trending age difference also emerged with older adults showing greater sensitivity when encoding instructions changed between trial n-1 and n. Results are discussed in light of age-related theories of cognitive and emotional processing, highlighting the need to consider the dynamic, moment-to-moment fluctuations of these systems.

  19. Development of a subjective cognitive decline questionnaire using item response theory: a pilot study.

    Science.gov (United States)

    Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L

    2015-12-01

    Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item-reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items were removed with extreme response distribution or trait-fit. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of a SCD screener.

  20. The cultural fairness of the 12-item General Health Questionnaire among diverse adolescents.

    Science.gov (United States)

    Bowe, Anica

    2017-01-01

    The 12-item general health questionnaire (GHQ-12) was used in the Longitudinal Study of Young People in England (LSYPE; N = 15,770) to collect measures on adolescent mental health. Given the debate in current literature regarding the dimensionality of the GHQ-12, this study examined the cultural sensitivity of the instrument at the item level for each of the 7 major ethnic groups within the database. This study used a hybrid approach of ordinal logistic regression and item response theory (IRT) to examine the presence of differential item functioning (DIF) on the questionnaire. Results demonstrated that uniform, nonuniform, and overall DIF were present on items between White and Asian adolescents (7 items), White and Black Caribbean adolescents (1 item), and White and Black African adolescents (7 items); however, all McFadden's pseudo R² effect size estimates indicated that the DIF was negligible. Overall, there were cumulative small scale level effects for the Mixed/Biracial, Asian, and Black African groups, but in each case the bias was only marginal. Findings demonstrate that the GHQ-12 can be considered culturally sensitive for adolescents from diverse ethnic groups in England, but follow-up studies are necessary. Implications for future education and health policies as well as the use of IRT-based approaches for psychological instruments are discussed. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  1. Item analysis of ADAS-Cog: effect of baseline cognitive impairment in a clinical AD trial.

    Science.gov (United States)

    Sevigny, Jeffrey J; Peng, Yahong; Liu, Lian; Lines, Christopher R

    2010-03-01

    We explored the association of Alzheimer's disease (AD) Assessment Scale (ADAS-Cog) item scores with AD severity using cross-sectional and longitudinal data from the same study. Post hoc analyses were performed using placebo data from a 12-month trial of patients with mild-to-moderate AD (N = 281 randomized, N = 209 completed). Baseline distributions of ADAS-Cog item scores by Mini-Mental State Examination (MMSE) score and Clinical Dementia Rating (CDR) sum of boxes score (measures of dementia severity) were estimated using local and nonparametric regressions. Mixed-effect models were used to characterize ADAS-Cog item score changes over time by dementia severity (MMSE: mild = 21-26, moderate = 14-20; global CDR: mild = 0.5-1, moderate = 2). In the cross-sectional analysis of baseline ADAS-Cog item scores, orientation was the most sensitive item to differentiate patients across levels of cognitive impairment. Several items showed a ceiling effect, particularly in milder AD. In the longitudinal analysis of change scores over 12 months, orientation was the only item with noticeable decline (8%-10%) in mild AD. Most items showed modest declines (5%-20%) in moderate AD.

  2. Bayes Factor Covariance Testing in Item Response Models.

    Science.gov (United States)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-12-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted-inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Based on that, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.

  3. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  4. Detecting Differential Person Functioning in Emotional Intelligence

    Science.gov (United States)

    Alsmadi, Yahia M.; Alsmadi, Abdalla A.

    2009-01-01

    Differential Item Functioning (DIF) is a widely used term in the test development literature. It is very important to analyze test data for DIF because it is a serious threat to validity. If the same data matrix is transposed, a similar analysis can be carried out for Differential Person Functioning (DPF). The purpose of this paper is to introduce and…

  5. Automated Item Generation with Recurrent Neural Networks.

    Science.gov (United States)

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language-free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.

  6. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

    Science.gov (United States)

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

    2017-11-01

    Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.
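
    The partial credit model named in this record can be sketched in a few lines. The sketch assumes the standard formulation in which the probability of scoring in category x is proportional to the exponential of the summed differences between the person parameter and the step difficulties up to x. The function name and the parameter values are hypothetical and are not taken from the REAL item bank.

```python
import math


def pcm_category_probs(theta: float, step_difficulties: list[float]) -> list[float]:
    """Category probabilities for one item under the partial credit model.

    theta:             person location on the latent continuum
    step_difficulties: one value per step, so an item with m steps
                       has m + 1 score categories (0..m)
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))

    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(cumulative)
    weights = [math.exp(c - largest) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical three-category item with step difficulties -0.5 and 0.8.
    for theta in (-2.0, 0.0, 2.0):
        probs = pcm_category_probs(theta, [-0.5, 0.8])
        print(theta, [round(p, 3) for p in probs])
```

    With a single step the expression reduces to the dichotomous Rasch model, which is a quick way to check an implementation.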

  7. NHRIC (National Health Related Items Code)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The National Health Related Items Code (NHRIC) is a system for identification and numbering of marketed device packages that is compatible with other numbering...

  8. Basic Stand Alone Carrier Line Items PUF

    Data.gov (United States)

    U.S. Department of Health & Human Services — This release contains the Basic Stand Alone (BSA) Carrier Line Items Public Use Files (PUF) with information from Medicare Carrier claims. The CMS BSA Carrier Line...

  9. Extending item response theory to online homework

    Directory of Open Access Journals (Sweden)

    Gerd Kortemeyer

    2014-05-01

    Item response theory (IRT) is becoming an increasingly important tool for analyzing “big data” gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over a wide range with respect to model assumptions and introduced noise. Item difficulty is also robust, but over a narrower range.

  10. Item Response Theory Analyses of the Cambridge Face Memory Test (CFMT)

    Science.gov (United States)

    Cho, Sun-Joo; Wilmer, Jeremy; Herzmann, Grit; McGugin, Rankin; Fiset, Daniel; Van Gulick, Ana E.; Ryan, Katie; Gauthier, Isabel

    2014-01-01

    We evaluated the psychometric properties of the Cambridge face memory test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bi-factor exploratory factor analysis (EFA). This EFA analysis revealed a general factor and three specific factors clustered by targets of CFMT. However, the three specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and two age groups (Age ≤ 20 versus Age > 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT. PMID:25642930

  11. Inventions on presenting textual items in Graphical User Interface

    OpenAIRE

    Mishra, Umakant

    2014-01-01

    Although a GUI largely replaces textual descriptions by graphical icons, the textual items are not completely removed. The textual items are inevitably used in window titles, message boxes, help items, menu items and popup items. Textual items are necessary for communicating messages that are beyond the limitation of graphical messages. However, it is necessary to harness the textual items on the graphical interface in such a way that they complement each other to produce the best effect. One...

  12. Item selection via Bayesian IRT models.

    Science.gov (United States)

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
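
    The graded response model mentioned in this record can be illustrated with a minimal sketch. It assumes the common logistic parameterization in which each item has one discrimination parameter and an increasing set of threshold (difficulty) parameters, and category probabilities are differences of adjacent cumulative probabilities. The function name and parameter values are hypothetical; the mixture prior on item parameters that the abstract describes is not reproduced here.

```python
import math


def grm_category_probs(theta: float, discrimination: float,
                       thresholds: list[float]) -> list[float]:
    """Category probabilities for one item under a graded response model.

    theta:          individual latent trait
    discrimination: item discrimination (slope)
    thresholds:     increasing item thresholds b_1 < b_2 < ... < b_{K-1}
    """
    if any(b2 <= b1 for b1, b2 in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")

    # P(X >= k) for k = 0..K, with the boundary cases fixed at 1 and 0.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)

    # P(X = k) = P(X >= k) - P(X >= k + 1)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Hypothetical item with three ordered response categories.
    probs = grm_category_probs(theta=0.0, discrimination=1.0, thresholds=[-1.0, 1.0])
    print([round(p, 3) for p in probs])
```

    Items whose discrimination and thresholds are close produce nearly identical category curves, which is the intuition behind treating items in the same mixture component as interchangeable.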

  13. The role of attention in item-item binding in visual working memory.

    Science.gov (United States)

    Peterson, Dwight J; Naveh-Benjamin, Moshe

    2017-09-01

    An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  14. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  15. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  16. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  17. Testing the Item-Order Account of Design Effects Using the Production Effect

    Science.gov (United States)

    Jonker, Tanya R.; Levene, Merrick; MacLeod, Colin M.

    2014-01-01

    A number of memory phenomena evident in recall in within-subject, mixed-lists designs are reduced or eliminated in between-subject, pure-list designs. The item-order account (McDaniel & Bugg, 2008) proposes that differential retention of order information might underlie this pattern. According to this account, order information may be encoded…

  18. Effects of Learning Experience on Forgetting Rates of Item and Associative Memories

    Science.gov (United States)

    Yang, Jiongjiong; Zhan, Lexia; Wang, Yingying; Du, Xiaoya; Zhou, Wenxi; Ning, Xueling; Sun, Qing; Moscovitch, Morris

    2016-01-01

    Are associative memories forgotten more quickly than item memories, and does the level of original learning differentially influence forgetting rates? In this study, we addressed these questions by having participants learn single words and word pairs once (Experiment 1), three times (Experiment 2), and six times (Experiment 3) in a massed…

  19. HIV/AIDS knowledge among men who have sex with men: applying the item response theory.

    Science.gov (United States)

    Gomes, Raquel Regina de Freitas Magalhães; Batista, José Rodrigues; Ceccato, Maria das Graças Braga; Kerr, Lígia Regina Franco Sansigolo; Guimarães, Mark Drew Crosland

    2014-04-01

    To evaluate the level of HIV/AIDS knowledge among men who have sex with men in Brazil using the latent trait model estimated by Item Response Theory. Multicenter, cross-sectional study, carried out in ten Brazilian cities between 2008 and 2009. Adult men who have sex with men were recruited (n = 3,746) through Respondent Driven Sampling. HIV/AIDS knowledge was ascertained through ten statements by face-to-face interview and latent scores were obtained through two-parameter logistic modeling (difficulty and discrimination) using Item Response Theory. Differential item functioning was used to examine each item characteristic curve by age and schooling. Overall, the HIV/AIDS knowledge scores using Item Response Theory did not exceed 6.0 (scale 0-10), with mean and median values of 5.0 (SD = 0.9) and 5.3, respectively, with 40.7% of the sample with knowledge levels below the average. Some beliefs still exist in this population regarding the transmission of the virus by insect bites, by using public restrooms, and by sharing utensils during meals. With regard to the difficulty and discrimination parameters, eight items were located below the mean of the scale and were considered very easy, and four items presented very low discrimination parameter (items contributed to the inaccuracy of the measurement of knowledge among those with median level and above. Item Response Theory analysis, which focuses on the individual properties of each item, allows measures to be obtained that do not vary or depend on the questionnaire, which provides better ascertainment and accuracy of knowledge scores. Valid and reliable scales are essential for monitoring HIV/AIDS knowledge among the men who have sex with men population over time and in different geographic regions, and this psychometric model brings this advantage.
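
    The two-parameter logistic model referred to in this record (difficulty and discrimination) can be sketched as follows. The sketch assumes the standard logistic form of the item characteristic curve; the function name and the parameter values are hypothetical and are not estimates from the study.

```python
import math


def icc_2pl(theta: float, discrimination: float, difficulty: float) -> float:
    """Probability of a correct response under the two-parameter logistic model.

    theta:          latent knowledge level of the respondent
    discrimination: slope of the curve at its steepest point
    difficulty:     trait level at which the probability equals 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


if __name__ == "__main__":
    # Hypothetical items: one easy with low discrimination, one harder and steeper.
    for theta in (-2.0, 0.0, 2.0):
        easy_flat = icc_2pl(theta, discrimination=0.4, difficulty=-1.5)
        hard_steep = icc_2pl(theta, discrimination=1.8, difficulty=0.5)
        print(theta, round(easy_flat, 3), round(hard_steep, 3))
```

    An item with a low discrimination value has a flat curve, so it separates respondents poorly across the trait range, which is consistent with the abstract's remark that such items add little measurement accuracy.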

  20. Item Response Theory analysis of Fagerström Test for Cigarette Dependence.

    Science.gov (United States)

    Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl

    2018-02-01

    The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI via Item Response Theory. The study was a secondary analysis of data collected in 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Redefining diagnostic symptoms of depression using Rasch analysis: testing an item bank suitable for DSM-V and computer adaptive testing.

    Science.gov (United States)

    Mitchell, Alex J; Smith, Adam B; Al-salihy, Zerak; Rahim, Twana A; Mahmud, Mahmud Q; Muhyaldin, Asma S

    2011-10-01

    We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V. Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects) living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank, and differential item functioning (DIF) was explored for known subgroups. A symptom-level Rasch analysis reduced the original 45 items to 24 after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items, accuracy was still high (AUC = 0.976; sensitivity 93%, specificity 96%). An item bank of 17 items may be useful in computer adaptive testing and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.

  2. An NCME Instructional Module on Polytomous Item Response Theory Models

    Science.gov (United States)

    Penfield, Randall David

    2014-01-01

    A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

  3. Loglinear multidimensional IRT models for polytomously scored items

    NARCIS (Netherlands)

    Kelderman, Henk

    1988-01-01

    A loglinear item response theory (IRT) model is proposed that relates polytomously scored item responses to a multidimensional latent space. Each item may have a different response function where each item response may be explained by one or more latent traits. Item response functions may follow a

  4. A strategy for optimizing item-pool management

    NARCIS (Netherlands)

    Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

    2006-01-01

    Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item

  5. An Examination of the Instructional Sensitivity of the TIMSS Math Items: A Hierarchical Differential Item Functioning Approach

    Science.gov (United States)

    Li, Hongli; Qin, Qi; Lei, Pui-Wa

    2017-01-01

    In recent years, students' test scores have been used to evaluate teachers' performance. The assumption underlying this practice is that students' test performance reflects teachers' instruction. However, this assumption is generally not empirically tested. In this study, we examine the effect of teachers' instruction on test performance at the…

  6. Evaluation of the Multiple Sclerosis Walking Scale-12 (MSWS-12) in a Dutch sample: Application of item response theory.

    Science.gov (United States)

    Mokkink, Lidwine Brigitta; Galindo-Garre, Francisca; Uitdehaag, Bernard Mj

    2016-12-01

    The Multiple Sclerosis Walking Scale-12 (MSWS-12) measures walking ability from the patients' perspective. We examined the quality of the MSWS-12 using an item response theory model, the graded response model (GRM). A total of 625 unique Dutch multiple sclerosis (MS) patients were included. After testing for unidimensionality, monotonicity, and absence of local dependence, a GRM was fit and item characteristics were assessed. Differential item functioning (DIF) for the variables gender, age, duration of MS, type of MS and severity of MS, reliability, total test information, and standard error of the trait level (θ) were investigated. Confirmatory factor analysis showed a unidimensional structure of the 12 items of the scale, explaining 88% of the variance. Item 2 did not fit the GRM. Reliability was 0.93. Items 8 and 9 (of the 11- and 12-item versions, respectively) showed DIF on the variable severity, based on the Expanded Disability Status Scale (EDSS). However, the EDSS is strongly related to the content of both items. Our results confirm the good quality of the MSWS-12. The trait level (θ) scores and item parameters of both the 12- and 11-item versions were highly comparable, although we do not suggest changing the content of the MSWS-12. © The Author(s), 2016.

  7. Counterfeit and Fraudulent Items - Mitigating the risk

    International Nuclear Information System (INIS)

    Tannenbaum, Marc

    2011-01-01

    This presentation (slides) provides an overview of the industry's challenges and activities. Firstly, it outlines the differences between counterfeit, fraudulent, suspect, and also substandard items. Notice is given that items could be found not to meet the standard, but the difference in the intent to deceive with counterfeit and fraudulent items is the critical element. Examples from other industries are used which also rely heavily on the assurance of quality for safety. It also informs that EPRI has just completed a report in October 2009 in coordination with other US government agencies and industry organizations; this report, entitled Counterfeit, Substandard and Fraudulent Items, number 1019163, is available for free on the EPRI web site. As a follow-up to this report, EPRI is developing a CFSI Database; any country interested in a collaborative agreement is invited to use and contribute to the database information. Finally, it stresses the importance of the oversight of contractors, training to raise the awareness of the employees and the inspectors, and having a response plan for identified items

  8. Using Explanatory Item Response Models to Evaluate Complex Scientific Tasks Designed for the Next Generation Science Standards

    Science.gov (United States)

    Chiu, Tina

    This dissertation includes three studies that analyze a new set of assessment tasks developed by the Learning Progressions in Middle School Science (LPS) Project. These assessment tasks were designed to measure science content knowledge on the structure of matter domain and scientific argumentation, while following the goals from the Next Generation Science Standards (NGSS). The three studies focus on the evidence available for the success of this design and its implementation, generally labelled as "validity" evidence. I use explanatory item response models (EIRMs) as the overarching framework to investigate these assessment tasks. These models can be useful when gathering validity evidence for assessments as they can help explain student learning and group differences. In the first study, I explore the dimensionality of the LPS assessment by comparing the fit of unidimensional, between-item multidimensional, and Rasch testlet models to see which is most appropriate for this data. By applying multidimensional item response models, multiple relationships can be investigated, and in turn, allow for a more substantive look into the assessment tasks. The second study focuses on person predictors through latent regression and differential item functioning (DIF) models. Latent regression models show the influence of certain person characteristics on item responses, while DIF models test whether one group is differentially affected by specific assessment items, after conditioning on latent ability. Finally, the last study applies the linear logistic test model (LLTM) to investigate whether item features can help explain differences in item difficulties.
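
    As a rough illustration of the two model families named in this abstract, the textbook forms can be written as below. This is a sketch under stated assumptions, not the dissertation's own specification: the symbols \theta_p (person ability), \beta_i (item difficulty), g_p (a 0/1 group indicator), \gamma_i (uniform DIF effect for item i), q_{ik} (weight of feature k in item i), and \eta_k (difficulty contribution of feature k) are notation chosen here.

```latex
% Rasch model with a uniform DIF term: the item is harder by \gamma_i
% for members of the focal group (g_p = 1) at the same ability \theta_p.
\Pr(X_{pi} = 1 \mid \theta_p, g_p)
  = \frac{\exp(\theta_p - \beta_i - \gamma_i g_p)}
         {1 + \exp(\theta_p - \beta_i - \gamma_i g_p)}

% Linear logistic test model (LLTM): item difficulty is decomposed
% into a weighted sum of item-feature effects.
\beta_i = \sum_{k=1}^{K} q_{ik}\, \eta_k
```

    Under this notation, an item shows no uniform DIF when \gamma_i = 0, and the LLTM reduces to the Rasch model when every item has its own single feature.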

  9. Conjunctive and Disjunctive Item Response Functions.

    Science.gov (United States)

    1984-10-01

    ...fixed set of values of a, b, A1, B1, A2, B2, A3, and B3, the f's, g's, and h's in (7) are fixed. Equation (7) must still hold for ... Thus...for item i is ... where a and pi are arbitrary constants. These constants must be the same for all items in a given...

  10. Development of a Short Version of MSQOL-54 Using Factor Analysis and Item Response Theory.

    Directory of Open Access Journals (Sweden)

    Rosalba Rosato

    The Multiple Sclerosis Quality of Life-54 (MSQOL-54, 52 items grouped in 12 subscales plus two single items) is the most used MS-specific health-related quality of life inventory. The aim was to develop a shortened version of the MSQOL-54. MSQOL-54 dimensionality and metric properties were investigated by confirmatory factor analysis (CFA) and Rasch modelling (Partial Credit Model, PCM) on MSQOL-54s completed by 473 MS patients. Their mean age was 41 years, 65% were women, and median Expanded Disability Status Scale (EDSS) score was 2.0 (range 0-9.5). Differential item functioning (DIF) was evaluated for gender, age and EDSS. Dimensionality of the resulting short version was assessed by exploratory factor analysis (EFA) and CFA. Cognitive debriefing of the short instrument (vs. the original) was then performed on 12 MS patients. CFA of MSQOL-54 subscales showed that the data fitted the overall model well. Two subscales (Role Limitations--Physical, Role Limitations--Emotional) did not fit the PCM, and were removed; two other subscales (Health Perceptions, Social Function) did not fit the model, but were retained as single items. Sexual Satisfaction (single-item subscale) was also removed. The resulting MSQOL-29 consisted of 25 items grouped in 7 subscales, plus 4 single items. PCM fit statistics were within the acceptability range for all MSQOL-29 items except one, which had significant DIF by age. EFA and CFA indicated adequate fit to the original two-factor (Physical and Mental Health Composites) hypothesis. Cognitive debriefing confirmed that MSQOL-29 was acceptable and had lost no key items. The proposed MSQOL-29 is 50% shorter than MSQOL-54, yet preserves key quality of life dimensions. Prospective validation on a large, independent MS patient sample is ongoing.

  11. Lawton IADL scale in dementia: can item response theory make it more informative?

    Science.gov (United States)

    McGrory, Sarah; Shenkin, Susan D; Austin, Elizabeth J; Starr, John M

    2014-07-01

    Impairment of functional abilities represents a crucial component of dementia diagnosis. Current functional measures rely on the traditional aggregate method of summing raw scores. While this summary score provides a quick representation of a person's ability, it disregards useful information on the item level. The aim was to use item response theory (IRT) methods to increase the interpretive power of the Lawton Instrumental Activities of Daily Living (IADL) scale by establishing a hierarchy of item 'difficulty' and 'discrimination'. This cross-sectional study applied IRT methods to the analysis of IADL outcomes. Participants were 202 members of the Scottish Dementia Research Interest Register (mean age = 76.39, range = 56-93, SD = 7.89 years) with complete itemised data available. A Mokken scale with good reliability (Molenaar Sijtsma statistic 0.79) was obtained, satisfying the IRT assumption that the items comprise a single unidimensional scale. The eight items in the scale could be placed on a hierarchy of 'difficulty' (H coefficient = 0.55), with 'Shopping' being the most 'difficult' item and 'Telephone use' being the least 'difficult' item. 'Shopping' was the most discriminatory item, differentiating well between patients of different levels of ability. IRT methods are capable of providing more information about functional impairment than a summed score. 'Shopping' and 'Telephone use' were identified as items that reveal key information about a patient's level of ability, and could be useful screening questions for clinicians.

  12. Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

    Science.gov (United States)

    Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

    2018-04-01

    The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory factor analysis was completed to examine the dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items of the ESS, and differential item functioning (DIF) analyses were used to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that, overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and, to a lesser extent, Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses in German and English were very similar, indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.

  13. A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

    Science.gov (United States)

    Khawaja, Nigar G.; Yu, Lai Ngo Heidi

    2010-01-01

    The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…

  14. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    Science.gov (United States)

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  15. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Science.gov (United States)

    Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.
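
    To make the graded response model mentioned in this abstract more concrete, the following is a minimal sketch of how category probabilities are obtained for one polytomous item in the standard (Samejima) form. The function name and the parameter values are hypothetical and chosen only for illustration; they are not the calibrated PROMIS parameters.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- latent trait value of the respondent
    a          -- item discrimination (slope)
    thresholds -- ordered threshold parameters b_1 < ... < b_{K-1}
    Returns a list of K probabilities, one per response category.
    """
    # Cumulative probabilities P(X >= k); by definition P(X >= 0) = 1
    # and P(X >= K) = 0, which bracket the logistic terms.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category item (illustrative values only).
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
```

    With ordered thresholds the cumulative curves never cross, so every category probability is positive and the probabilities sum to one.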

  16. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.

  17. Constraint Differentiation

    DEFF Research Database (Denmark)

    Mödersheim, Sebastian Alexander; Basin, David; Viganò, Luca

    2010-01-01

    We introduce constraint differentiation, a powerful technique for reducing search when model-checking security protocols using constraint-based methods. Constraint differentiation works by eliminating certain kinds of redundancies that arise in the search space when using constraints to represent...... results show that constraint differentiation substantially reduces search and considerably improves the performance of OFMC, enabling its application to a wider class of problems....

  18. Differential manifolds

    CERN Document Server

    Kosinski, Antoni A

    2007-01-01

    The concepts of differential topology form the center of many mathematical disciplines such as differential geometry and Lie group theory. Differential Manifolds presents to advanced undergraduates and graduate students the systematic study of the topological structure of smooth manifolds. Author Antoni A. Kosinski, Professor Emeritus of Mathematics at Rutgers University, offers an accessible approach to both the h-cobordism theorem and the classification of differential structures on spheres. "How useful it is," noted the Bulletin of the American Mathematical Society, "to have a single, sho

  19. 47 CFR 32.7600 - Extraordinary items.

    Science.gov (United States)

    2010-10-01

    ... FOR TELECOMMUNICATIONS COMPANIES Instructions For Other Income Accounts § 32.7600 Extraordinary items... extraordinary. Extraordinary events and transactions are distinguished by both their unusual nature and by the infrequency of their occurrence, taking into account the environment in which the company operates. This...

  20. Soviet Cybernetics: Recent News Items, Number Thirteen.

    Science.gov (United States)

    Holland, Wade B.

    An issue of "Soviet Cybernetics: Recent News Items" consists of English translations of the leading recent Soviet contributions to the study of cybernetics. Articles deal with cybernetics in the 21st Century; the Soviet State Committee on Science and Technology; economic reforms in Rudnev's ministry; an interview with Rudnev; Dnepr-2; Dnepr-2…

  1. Random Item Generation Is Affected by Age

    Science.gov (United States)

    Multani, Namita; Rudzicz, Frank; Wong, Wing Yiu Stephanie; Namasivayam, Aravind Kumar; van Lieshout, Pascal

    2016-01-01

    Purpose: Random item generation (RIG) involves central executive functioning. Measuring aspects of random sequences can therefore provide a simple method to complement other tools for cognitive assessment. We examine the extent to which RIG relates to specific measures of cognitive function, and whether those measures can be estimated using RIG…

  2. In-Process Items on LCS.

    Science.gov (United States)

    Russell, Thyra K.

    Morris Library at Southern Illinois University computerized its technical processes using the Library Computer System (LCS), which was implemented in the library to streamline order processing by: (1) providing up-to-date online files to track in-process items; (2) encouraging quick, efficient accessing of information; (3) reducing manual files;…

  3. Algorithmic test design using classical item parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.; Adema, Jos J.

    Two optimization models for the construction of tests with a maximal value of coefficient alpha are given. Both models have a linear form and can be solved by using a branch-and-bound algorithm. The first model assumes an item bank calibrated under the Rasch model and can be used, for instance,

  4. Item Effects in Recognition Memory for Words

    Science.gov (United States)

    Freeman, Emily; Heathcote, Andrew; Chalmers, Kerry; Hockley, William

    2010-01-01

    We investigate the effects of word characteristics on episodic recognition memory using analyses that avoid Clark's (1973) "language-as-a-fixed-effect" fallacy. Our results demonstrate the importance of modeling word variability and show that episodic memory for words is strongly affected by item noise (Criss & Shiffrin, 2004), as measured by the…

  5. Extending Item Response Theory to Online Homework

    Science.gov (United States)

    Kortemeyer, Gerd

    2014-01-01

    Item response theory (IRT) becomes an increasingly important tool when analyzing "big data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for…

  6. Item Response Theory: A Basic Concept

    Science.gov (United States)

    Mahmud, Jumailiyah

    2017-01-01

    With advances in computing technology, item response theory (IRT) has developed rapidly and has become a user-friendly tool in psychometrics. The limitations of classical test theory are one factor that encourages the use of IRT. In this study, the basic concept of IRT will be discussed. In addition, it will briefly review the ability…

  7. Item Response Theory for Peer Assessment

    Science.gov (United States)

    Uto, Masaki; Ueno, Maomi

    2016-01-01

    As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…

  8. 77 FR 59339 - Acquisition of Commercial Items

    Science.gov (United States)

    2012-09-27

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System 48 CFR Part 212 Acquisition of Commercial Items CFR Correction 212.504 [Corrected] In Title 48 of the Code of Federal Regulations, Chapter 2 (Parts 201--299), revised as of October 1, 2011, on page 73, in section 212.504, paragraph (a) is...

  9. Bayesian item selection criteria for adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1996-01-01

    R.J. Owen (1975) proposed an approximate empirical Bayes procedure for item selection in adaptive testing. The procedure replaces the true posterior by a normal approximation with closed-form expressions for its first two moments. This approximation was necessary to minimize the computational

  10. Aging and Confidence Judgments in Item Recognition

    Science.gov (United States)

    Voskuilen, Chelsea; Ratcliff, Roger; McKoon, Gail

    2018-01-01

    We examined the effects of aging on performance in an item-recognition experiment with confidence judgments. A model for confidence judgments and response time (RTs; Ratcliff & Starns, 2013) was used to fit a large amount of data from a new sample of older adults and a previously reported sample of younger adults. This model of confidence…

  11. 10 CFR 74.55 - Item monitoring.

    Science.gov (United States)

    2010-01-01

    ... Quantities of Strategic Special Nuclear Material § 74.55 Item monitoring. (a) Licensees subject to § 74.51... quantitatively measured, the validity of that measurement independently confirmed, and that additionally have..., except for reactor components measuring at least one meter in length and weighing in excess of 30...

  12. Identify, Organize, and Retrieve Items Using Zotero

    Science.gov (United States)

    Clark, Brian; Stierman, John

    2009-01-01

    Librarians build collections. To do this they use tools that help them identify, organize, and retrieve items for the collection. Zotero (zoh-TAIR-oh) is such a tool that helps the user build a library of useful books, articles, web sites, blogs, etc., discovered while surfing online. A visit to Zotero's homepage, www.zotero.org, shows a number of…

  13. Development and psychometric characteristics of the SCI-QOL Ability to Participate and Satisfaction with Social Roles and Activities item banks and short forms.

    Science.gov (United States)

    Heinemann, Allen W; Kisala, Pamela A; Hahn, Elizabeth A; Tulsky, David S

    2015-05-01

    To develop a spinal cord injury (SCI)-focused version of PROMIS and Neuro-QOL social domain item banks; evaluate the psychometric properties of items developed for adults with SCI; and report information to facilitate clinical and research use. We used a mixed-methods design to develop and evaluate Ability to Participate in Social Roles and Activities and Satisfaction with Social Roles and Activities items. Focus groups helped define the constructs; cognitive interviews helped revise items; and confirmatory factor analysis and item response theory methods helped calibrate item banks and evaluate differential item functioning related to demographic and injury characteristics. Five SCI Model System sites and one Veterans Administration medical center. The calibration sample consisted of 641 individuals; a reliability sample consisted of 245 individuals residing in the community. A subset of 27 Ability to Participate and 35 Satisfaction items demonstrated good measurement properties and negligible differential item functioning related to demographic and injury characteristics. The SCI-specific measures correlate strongly with the PROMIS and Neuro-QOL versions. Ten item short forms correlate >0.96 with the full banks. Variable-length CATs with a minimum of 4 items, variable-length CATs with a minimum of 8 items, fixed-length CATs of 10 items, and the 10-item short forms demonstrate construct coverage and measurement error that is comparable to the full item bank. The Ability to Participate and Satisfaction with Social Roles and Activities CATs and short forms demonstrate excellent psychometric properties and are suitable for clinical and research applications.

  14. Differential equations

    CERN Document Server

    Barbu, Viorel

    2016-01-01

    This textbook is a comprehensive treatment of ordinary differential equations, concisely presenting basic and essential results in a rigorous manner. Including various examples from physics, mechanics, natural sciences, engineering and automatic theory, Differential Equations is a bridge between the abstract theory of differential equations and applied systems theory. Particular attention is given to the existence and uniqueness of the Cauchy problem, linear differential systems, stability theory and applications to first-order partial differential equations. Upper undergraduate students and researchers in applied mathematics and systems theory with a background in advanced calculus will find this book particularly useful. Supplementary topics are covered in an appendix enabling the book to be completely self-contained.

  15. The Longer We Have to Forget the More We Remember: The Ironic Effect of Postcue Duration in Item-Based Directed Forgetting

    Science.gov (United States)

    Bancroft, Tyler D.; Hockley, William E.; Farquhar, Riley

    2013-01-01

    The effects of the duration of remember and forget cues were examined to test the differential rehearsal account of item-based directed forgetting. In Experiments 1 and 2, cues were shown for 300, 600, or 900 ms, and a directed forgetting effect (better recognition of remember than forget items) was found at each duration. In addition, recognition…

  16. Complex versus Simple Modeling for DIF Detection: When the Intraclass Correlation Coefficient (ρ) of the Studied Item Is Less Than the ρ of the Total Score

    Science.gov (United States)

    Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon

    2014-01-01

    Previous research has demonstrated that differential item functioning (DIF) methods that do not account for multilevel data structure could result in too frequent rejection of the null hypothesis (i.e., no DIF) when the intraclass correlation coefficient (ρ) of the studied item was the same as the ρ of the total score. The current study extended…

  17. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    Science.gov (United States)

    Lynch, Mervin D.; Chaves, John

    Items from the Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs: idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  18. Evaluation of the Fecal Incontinence Quality of Life Scale (FIQL) using item response theory reveals limitations and suggests revisions.

    Science.gov (United States)

    Peterson, Alexander C; Sutherland, Jason M; Liu, Guiping; Crump, R Trafford; Karimuddin, Ahmer A

    2018-06-01

    The Fecal Incontinence Quality of Life Scale (FIQL) is a commonly used patient-reported outcome measure for fecal incontinence, often used in clinical trials, yet has not been validated in English since its initial development. This study uses modern methods to thoroughly evaluate the psychometric characteristics of the FIQL and its potential for differential functioning by gender. This study analyzed prospectively collected patient-reported outcome data from a sample of patients prior to colorectal surgery. Patients were recruited from 14 general and colorectal surgeons in Vancouver Coastal Health hospitals in Vancouver, Canada. Confirmatory factor analysis was used to assess construct validity. Item response theory was used to evaluate test reliability, describe item-level characteristics, identify local item dependence, and test for differential functioning by gender. 236 patients were included for analysis, with mean age 58 and approximately half female. Factor analysis failed to identify the lifestyle, coping, depression, and embarrassment domains, suggesting lack of construct validity. Items demonstrated low difficulty, indicating that the test has the highest reliability among individuals who have low quality of life. Five items are suggested for removal or replacement. Differential test functioning was minimal. This study has identified specific improvements that can be made to each domain of the Fecal Incontinence Quality of Life Scale and to the instrument overall. Formatting, scoring, and instructions may be simplified, and items with higher difficulty developed. The lifestyle domain can be used as is. The embarrassment domain should be significantly revised before use.

  19. Calibration of Automatically Generated Items Using Bayesian Hierarchical Modeling.

    Science.gov (United States)

    Johnson, Matthew S.; Sinharay, Sandip

    For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…

  20. Applying Hierarchical Model Calibration to Automatically Generated Items.

    Science.gov (United States)

    Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I.

    This study explored the application of hierarchical model calibration as a means of reducing, if not eliminating, the need for pretesting of automatically generated items from a common item model prior to operational use. Ultimately the successful development of automatic item generation (AIG) systems capable of producing items with highly similar…

  1. 10 CFR 835.605 - Labeling items and containers.

    Science.gov (United States)

    2010-01-01

    ... Section 835.605... items and containers. Except as provided at § 835.606, each item or container of radioactive material... information to permit individuals handling, using, or working in the vicinity of the items or containers to...

  2. 41 CFR 101-27.404 - Review of items.

    Science.gov (United States)

    2010-07-01

    ... Public Contracts and Property Management Federal Property Management...-Elimination of Items From Inventory § 101-27.404 Review of items. Except for standby or reserve stocks, items...

  3. Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.

    Science.gov (United States)

    Commons, C., Ed.; Martin, P., Ed.

    Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…

  4. ACER Chemistry Test Item Collection. ACER Chemtic Year 12.

    Science.gov (United States)

    Australian Council for Educational Research, Hawthorn.

    The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; an answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…

  5. Utilizing Response Time Distributions for Item Selection in CAT

    Science.gov (United States)

    Fan, Zhewen; Wang, Chun; Chang, Hua-Hua; Douglas, Jeffrey

    2012-01-01

    Traditional methods for item selection in computerized adaptive testing only focus on item information without taking into consideration the time required to answer an item. As a result, some examinees may receive a set of items that take a very long time to finish, and information is not accrued as efficiently as possible. The authors propose two…
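
    The "traditional" rule this abstract contrasts against can be sketched as maximum-information item selection. The example below assumes a two-parameter logistic (2PL) model and a hypothetical four-item bank; it deliberately ignores response time, which is the limitation the abstract points out, and it is not the authors' proposed method (the abstract is truncated before describing it).

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta_hat, item_bank, administered):
    """Return the index of the unused item with maximal information at theta_hat."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta_hat, *item_bank[i]))


# Hypothetical item bank of (a, b) pairs; values are illustrative only.
bank = [(1.0, -1.0), (1.2, 0.1), (0.8, 1.5), (2.0, 0.0)]
first_item = select_next_item(0.0, bank, administered=set())
next_item = select_next_item(0.0, bank, administered={first_item})
```

    A time-aware variant would need an additional per-item time model, which is outside what this truncated abstract specifies.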

  6. A Review of Classical Methods of Item Analysis.

    Science.gov (United States)

    French, Christine L.

    Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…
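
    Two of the classical statistics named in this abstract, item difficulty and item discrimination, can be sketched as follows. This is a minimal illustration with made-up 0/1 response data: difficulty is taken as the proportion correct, and discrimination as the correlation between the item score and the rest score (total score minus the item). Distractor analysis is not covered, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns None when either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def item_analysis(responses):
    """responses: one list of 0/1 item scores per examinee."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Rest score excludes the studied item so it does not correlate with itself.
        rest = [t - x for t, x in zip(totals, item)]
        results.append({
            "difficulty": sum(item) / len(item),     # proportion correct
            "discrimination": pearson(item, rest),   # item-rest correlation
        })
    return results


# Made-up responses of four examinees to three items.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
results = item_analysis(responses)
```

    Note that a higher "difficulty" value here means an easier item, since the statistic is the proportion of examinees answering correctly.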

  7. The Protective Behavioral Strategies for Marijuana Scale: Further examination using item response theory.

    Science.gov (United States)

    Pedersen, Eric R; Huang, Wenjing; Dvorak, Robert D; Prince, Mark A; Hummer, Justin F

    2017-08-01

    Given recent state legislation legalizing marijuana for recreational purposes and majority popular opinion favoring these laws, we developed the Protective Behavioral Strategies for Marijuana scale (PBSM) to identify strategies that may mitigate the harms related to marijuana use among those young people who choose to use the drug. In the current study, we expand on the initial exploratory study of the PBSM to further validate the measure with a large and geographically diverse sample (N = 2,117; 60% women, 30% non-White) of college students from 11 different universities across the United States. We sought to develop a psychometrically sound item bank for the PBSM and to create a short assessment form that minimizes respondent burden and time. Quantitative item analyses, including exploratory and confirmatory factor analyses with item response theory (IRT) and evaluation of differential item functioning (DIF), revealed an item bank of 36 items that was examined for unidimensionality and good content coverage, as well as a short form of 17 items that is free of bias in terms of gender (men vs. women), race (White vs. non-White), ethnicity (Hispanic vs. non-Hispanic), and recreational marijuana use legal status (state recreational marijuana was legal for 25.5% of participants). We also provide a scoring table for easy transformation from sum scores to IRT scale scores. The PBSM item bank and short form associated strongly and negatively with past month marijuana use and consequences. The measure may be useful to researchers and clinicians conducting intervention and prevention programs with young adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  8. A signal detection-item response theory model for evaluating neuropsychological measures.

    Science.gov (United States)

    Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

    2018-02-05

    Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory, which permits the modeling of item difficulty and examinee ability, and from signal detection theory, which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the

  9. Evaluating HIV Knowledge Questionnaires Among Men Who Have Sex with Men: A Multi-Study Item Response Theory Analysis.

    Science.gov (United States)

    Janulis, Patrick; Newcomb, Michael E; Sullivan, Patrick; Mustanski, Brian

    2018-01-01

    Knowledge about the transmission, prevention, and treatment of HIV remains a critical element in psychosocial models of HIV risk behavior and is commonly used as an outcome in HIV prevention interventions. However, most HIV knowledge questions have not undergone rigorous psychometric testing such as using item response theory. The current study used data from six studies of men who have sex with men (MSM; n = 3565) to (1) examine the item properties of HIV knowledge questions, (2) test for differential item functioning on commonly studied characteristics (i.e., age, race/ethnicity, and HIV risk behavior), (3) select items with the optimal item characteristics, and (4) leverage this combined dataset to examine the potential moderating effect of age on the relationship between condomless anal sex (CAS) and HIV knowledge. Findings indicated that existing questions tend to poorly differentiate those with higher levels of HIV knowledge, but items were relatively robust across diverse individuals. Furthermore, age moderated the relationship between CAS and HIV knowledge with older MSM having the strongest association. These findings suggest that additional items are required in order to capture a more nuanced understanding of HIV knowledge and that the association between CAS and HIV knowledge may vary by age.

  10. Item analysis and evaluation in the examinations in the faculty of ...

    African Journals Online (AJOL)

    2014-11-05

    Nov 5, 2014 ... Key words: Classical test theory, item analysis, item difficulty, item discrimination, item response theory, reliability ... the probability of answering an item correctly or of attaining ..... A Monte Carlo comparison of item and person.

  11. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

    Science.gov (United States)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

    2017-11-01

    The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim of enhancing measurement precision. Here we present the results of the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

  12. Development of the Open Items Tracking System

    International Nuclear Information System (INIS)

    Riggi, V.

    1994-01-01

    The West Valley Demonstration Project, located on the site of the only commercial nuclear fuel reprocessing facility to have operated in the USA, has the directed objectives of solidifying the high-level radioactive waste into a durable, solid form for shipment; decontaminating and decommissioning the tanks and facilities; and disposing of the resulting low-level and transuranic wastes. Since an escalating trend of open work items was noticed in the fall of 1988, and there was no control mechanism for tracking and closing the open items, a Work Control System was developed for this purpose. It is a self-contained system on a mainframe ARTEMIS 9000 that tracks, monitors, and closes out external commitments in a timely manner. Audits, surveillances, site appraisals, preventive maintenance, instrument calibration recall, and scheduling are covered.

  13. Item calibration in incomplete testing designs

    Directory of Open Access Journals (Sweden)

    Norman D. Verhelst

    2011-01-01

    This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs. Mislevy and Sheehan (1989) have shown that in incomplete designs the justifiability of MML can be deduced from Rubin's (1976) general theory on inference in the presence of missing data. Their results are recapitulated and extended for more situations. In this study it is shown that for CML estimation the justification must be established in an alternative way, by considering the neglected part of the complete likelihood. The problems with incomplete designs are not generally recognized in practical situations. This is due to the stochastic nature of the incomplete designs, which is not taken into account in standard computer algorithms. For that reason, incorrect uses of standard MML- and CML-algorithms are discussed.

  14. Effect of study context on item recollection.

    Science.gov (United States)

    Skinner, Erin I; Fernandes, Myra A

    2010-07-01

    We examined how visual context information provided during encoding, and unrelated to the target word, affected later recollection for words presented alone using a remember-know paradigm. Experiments 1A and 1B showed that participants had better overall memory (specifically, recollection) for words studied with pictures of intact faces than for words studied with pictures of scrambled or inverted faces. Experiment 2 replicated these results and showed that recollection was higher for words studied with pictures of faces than when no image accompanied the study word. In Experiment 3 participants showed equivalent memory for words studied with unique faces as for those studied with a repeatedly presented face. Results suggest that recollection benefits when visual context information high in meaningful content accompanies study words and that this benefit is not related to the uniqueness of the context. We suggest that participants use elaborative processes to integrate item and meaningful contexts into ensemble information, improving subsequent item recollection.

  15. Differential games

    CERN Document Server

    Friedman, Avner

    2006-01-01

    This volume lays the mathematical foundations for the theory of differential games, developing a rigorous mathematical framework with existence theorems. It begins with a precise definition of a differential game and advances to considerations of games of fixed duration, games of pursuit and evasion, the computation of saddle points, games of survival, and games with restricted phase coordinates. Final chapters cover selected topics (including capturability and games with delayed information) and N-person games. Geared toward graduate students, Differential Games will be of particular interest

  16. Differential Geometry

    CERN Document Server

    Stoker, J J

    2011-01-01

    This classic work is now available in an unabridged paperback edition. Stoker makes this fertile branch of mathematics accessible to the nonspecialist by the use of three different notations: vector algebra and calculus, tensor calculus, and the notation devised by Cartan, which employs invariant differential forms as elements in an algebra due to Grassmann, combined with an operation called exterior differentiation. Assumed are a passing acquaintance with linear algebra and the basic elements of analysis.

  17. Assessing Psychopathy Among Justice Involved Adolescents with the PCL: YV: An Item Response Theory Examination Across Gender

    Science.gov (United States)

    Tsang, Siny; Schmidt, Karen M.; Vincent, Gina M.; Salekin, Randall T.; Moretti, Marlene M.; Odgers, Candice L.

    2014-01-01

    This study used an item response theory (IRT) model and a large adolescent sample of justice involved youth (N = 1,007, 38% female) to examine the item functioning of the Psychopathy Checklist – Youth Version (PCL: YV). Items that were most discriminating (or most sensitive to changes) of the latent trait (thought to be psychopathy) among adolescents included “Glibness/superficial charm”, “Lack of remorse”, and “Need for stimulation”, whereas items that were least discriminating included “Pathological lying”, “Failure to accept responsibility”, and “Lacks goals.” The items “Impulsivity” and “Irresponsibility” were the most likely to be rated high among adolescents, whereas “Parasitic lifestyle”, and “Glibness/superficial charm” were the most likely to be rated low. Evidence of differential item functioning (DIF) on four of the 13 items was found between boys and girls. “Failure to accept responsibility” and “Impulsivity” were endorsed more frequently to describe adolescent girls than boys at similar levels of the latent trait, and vice versa for “Grandiose sense of self-worth” and “Lacks goals.” The DIF findings suggest that four PCL: YV items function differently between boys and girls. PMID:25580672

  18. Item response theory applied to factors affecting the patient journey towards hearing rehabilitation

    Directory of Open Access Journals (Sweden)

    Michelene Chenault

    2016-11-01

    To develop a tool for use in hearing screening and to evaluate the patient journey towards hearing rehabilitation, responses to three scales of the hearing aid rehabilitation questionnaire were evaluated with item response theory: aid stigma, pressure, and aid unwanted, addressing respectively hearing aid stigma, experienced pressure from others, and perceived hearing aid benefit. The sample comprised 212 persons aged 55 years or more; 63 were hearing aid users, and 64 persons with and 85 persons without hearing impairment were classified according to guidelines for hearing aid reimbursement in the Netherlands. Bias was investigated relative to hearing aid use and hearing impairment within the differential test functioning framework. Items compromising model fit or demonstrating differential item functioning were dropped. The aid stigma scale was reduced from 6 to 4 items, the pressure scale from 7 to 4, and the aid unwanted scale from 5 to 4. This procedure resulted in bias-free scales ready for screening purposes and for application to further understand the help-seeking process of the hearing impaired.

  19. Item Response Theory Applied to Factors Affecting the Patient Journey Towards Hearing Rehabilitation

    Science.gov (United States)

    Chenault, Michelene; Berger, Martijn; Kremer, Bernd; Anteunis, Lucien

    2016-01-01

    To develop a tool for use in hearing screening and to evaluate the patient journey towards hearing rehabilitation, responses to three scales of the hearing aid rehabilitation questionnaire were evaluated with item response theory: aid stigma, pressure, and aid unwanted, addressing respectively hearing aid stigma, experienced pressure from others, and perceived hearing aid benefit. The sample comprised 212 persons aged 55 years or more; 63 were hearing aid users, and 64 persons with and 85 persons without hearing impairment were classified according to guidelines for hearing aid reimbursement in the Netherlands. Bias was investigated relative to hearing aid use and hearing impairment within the differential test functioning framework. Items compromising model fit or demonstrating differential item functioning were dropped. The aid stigma scale was reduced from 6 to 4 items, the pressure scale from 7 to 4, and the aid unwanted scale from 5 to 4. This procedure resulted in bias-free scales ready for screening purposes and for application to further understand the help-seeking process of the hearing impaired. PMID:28028428

  20. CTTITEM: SAS macro and SPSS syntax for classical item analysis.

    Science.gov (United States)

    Lei, Pui-Wa; Wu, Qiong

    2007-08-01

    This article describes the functions of a SAS macro and an SPSS syntax that produce common statistics for conventional item analysis including Cronbach's alpha, item difficulty index (p-value or item mean), and item discrimination indices (D-index, point biserial and biserial correlations for dichotomous items and item-total correlation for polytomous items). These programs represent an improvement over the existing SAS and SPSS item analysis routines in terms of completeness and user-friendliness. To promote routine evaluations of item qualities in instrument development of any scale, the programs are available at no charge for interested users. The program codes along with a brief user's manual that contains instructions and examples are downloadable from suen.ed.psu.edu/-pwlei/plei.htm.
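
    As a rough illustration of the statistics named in this abstract, the sketch below computes Cronbach's alpha, item difficulty (item mean), and a corrected item-total correlation in plain Python. It is not the CTTITEM macro or its SPSS counterpart; the function names and the respondents-by-items list layout are assumptions made for this example only.

```python
from statistics import mean, pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table (list of lists)."""
    k = len(scores[0])
    # Population variances are used for items and totals alike, so the
    # ratio is the same as it would be with sample variances.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_difficulty(scores):
    """Item mean; for 0/1 items this is the proportion answering correctly."""
    k = len(scores[0])
    return [mean(row[j] for row in scores) for j in range(k)]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_rest_correlations(scores):
    """Correlation of each item with the total of the remaining items.

    For dichotomous items this is a point-biserial correlation computed
    against the rest score rather than the full total.
    """
    k = len(scores[0])
    result = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(item, rest))
    return result


if __name__ == "__main__":
    # Hypothetical responses: 5 respondents, 4 dichotomous items.
    responses = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(cronbach_alpha(responses))
    print(item_difficulty(responses))
    print(item_rest_correlations(responses))
```

    The item-rest form is used here because correlating an item with a total that includes the item itself inflates the coefficient on short tests; an item with zero variance would need separate handling.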

  1. Pengendalian Persediaan Primary Items dalam Logistik Konstruksi [Inventory Control of Primary Items in Construction Logistics]

    Directory of Open Access Journals (Sweden)

    Lady Lisya

    2016-09-01

    Construction logistics comprises the ordering, storage, and transportation of materials for construction projects. Material storage is the logistics activity that ensures the availability of materials at the project site, and it is generally carried out on site. Construction logistics aims to support project activities whose completion schedule has already been set. The central issue is determining the schedule for ordering materials so that the project can be implemented on schedule. The purpose of this research is to determine the optimum ordering period for the primary items in the construction of the main building structure and to design inventory control cards as a mechanism for monitoring the procurement of materials. The research obtained the optimal ordering period for the primary items of the main building structure, by work element, using the Fixed Period Requirement method. The resulting inventories met the material requirement of each period. Materials were managed using a grouping approach with 31 groups. In addition, the research proposes inventory control cards as an instrument for monitoring material procurement. The implication of the inventory control cards is that contracting parties coordinate with vendors to plan the replenishment of materials to meet the work schedule. Further research could address other aspects, such as an integrated material ordering system between contractors and vendors that considers safety stock. In addition, an information system for material planning is an important consideration for large-scale construction projects, so that companies can plan the inventory of primary items and other materials needed for project completion more easily, quickly, and accurately.
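
    The abstract names the Fixed Period Requirement method without detailing it. In its common textbook form, each order covers the material requirements of a fixed number of consecutive periods. The sketch below shows that form only; the function name, the per-period requirement figures, and the choice to place each order at the start of its block are assumptions for illustration, not details taken from the study.

```python
def fixed_period_orders(requirements, periods_per_order):
    """Return the order quantity placed in each period.

    Each order is placed at the start of a block of `periods_per_order`
    consecutive periods and equals the total requirement of that block.
    Periods inside a block receive no order (quantity 0).
    """
    if periods_per_order < 1:
        raise ValueError("periods_per_order must be at least 1")
    orders = [0] * len(requirements)
    for start in range(0, len(requirements), periods_per_order):
        orders[start] = sum(requirements[start:start + periods_per_order])
    return orders


if __name__ == "__main__":
    # Hypothetical weekly requirements for one material, in units.
    weekly_need = [10, 20, 0, 30, 5]
    print(fixed_period_orders(weekly_need, periods_per_order=2))
```

    Lengthening the ordering period lowers the number of orders but raises the stock held on site, which is the trade-off an optimum ordering period has to balance.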

  2. The staging area concept for item control

    International Nuclear Information System (INIS)

    Williams, R.A.

    1984-01-01

    Accounting for special nuclear material contained in fabricated nuclear fuel rod items has been completely automated at the Westinghouse Nuclear Fuel Division facility in Columbia, South Carolina. Experience with the automated system has shown substantial difficulty in maintaining current knowledge of the precise locations of rods pulled out of the "normal" processing cycle. This has been resolved by creation of two tightly controlled staging areas for handling and distribution of all "deviant" rods by two specially trained expeditors. Thus, coupling automated data collection with centralized expert handling and distribution has created a viable system for control of large numbers of fuel rods in a major fabrication plant.

  3. Evaluation of the Psychometric Properties of the Asian Adolescent Depression Scale and Construction of a Short Form: An Item Response Theory Analysis.

    Science.gov (United States)

    Lo, Barbara Chuen Yee; Zhao, Yue; Kwok, Alice Wai Yee; Chan, Wai; Chan, Calais Kin Yuen

    2017-07-01

    The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.

  4. Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form.

    Science.gov (United States)

    Kisala, Pamela A; Victorson, David; Pace, Natalie; Heinemann, Allen W; Choi, Seung W; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. A total of 716 individuals with SCI completed the trauma items. The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available.

  5. Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form.

    Science.gov (United States)

    Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible, with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  6. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    Science.gov (United States)

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  7. The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.

    Science.gov (United States)

    Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D

    2016-12-01

    The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87% of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5). Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12.

  8. Item response theory analysis applied to the Spanish version of the Personal Outcomes Scale.

    Science.gov (United States)

    Guàrdia-Olmos, J; Carbó-Carreté, M; Peró-Cebollero, M; Giné, C

    2017-11-01

    The study of measurements of quality of life (QoL) is one of the great challenges of modern psychology and psychometric approaches. This issue has greater importance when examining QoL in populations that were historically treated on the basis of their deficiency, and recently, the focus has shifted to what each person values and desires in their life, as in cases of people with intellectual disability (ID). Many studies of QoL scales applied in this area have attempted to improve the validity and reliability of their components by incorporating various sources of information to achieve consistency in the data obtained. The adaptation of the Personal Outcomes Scale (POS) in Spanish has shown excellent psychometric attributes, and its administration has three sources of information: self-assessment, practitioner and family. The study of possible congruence or incongruence of observed distributions of each item between sources is therefore essential to ensure a correct interpretation of the measure. The aim of this paper was to analyse the observed distribution of items and dimensions from the three Spanish POS information sources cited earlier, using item response theory. We studied a sample of 529 people with ID and their respective practitioners and family members, and in each case, we analysed items and factors using Samejima's model for polytomous ordinal scales. The results indicated an important number of items with differential effects regarding sources, and in some cases, they indicated significant differences in the distribution of items, factors and sources of information. As a result of this analysis, we must affirm that the administration of the POS, considering three sources of information, was adequate overall, but a correct interpretation of the results requires that it obtain much more information to consider, as well as some specific items in specific dimensions. The overall ratings, if these comments are considered, could result in bias. © 2017

  9. A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

    Science.gov (United States)

    Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

    2018-02-23

    The validity of the CAGE has not yet been examined using item response theory (IRT) in the older adult population. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity, and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA) and to fit dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed the overall scalability H index was 0.459, indicating a medium performing instrument. All items were found to be homogeneous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and those with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for two items across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and on the precision of the cut-off scores in the older adult population.
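
    To make the discrimination and difficulty parameters reported above concrete, the sketch below evaluates the standard 2-parameter logistic (2PL) item response function and its item information; the Rasch model is the special case with discrimination fixed at 1. The function names and the parameter values in the demo are illustrative assumptions, not estimates from the study.

```python
import math


def prob_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model.

    theta: respondent's latent trait level
    a:     item discrimination
    b:     item difficulty (trait level at which the probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def prob_rasch(theta, b):
    """Rasch model: the 2PL with discrimination fixed at 1."""
    return prob_2pl(theta, 1.0, b)


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical item: discrimination 2.0, difficulty 0.5.
    for theta in (-1.0, 0.5, 2.0):
        print(theta, prob_2pl(theta, 2.0, 0.5), item_information(theta, 2.0, 0.5))
```

    Information peaks at theta equal to the item difficulty, where it equals a squared over 4, so a highly discriminating item measures precisely only within a narrow band of the trait around its own difficulty.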

  10. Negative effects of item repetition on source memory.

    Science.gov (United States)

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L; Johnson, Marcia K

    2012-08-01

    In the present study, we explored how item repetition affects source memory for new item-feature associations (picture-location or picture-color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item repetition also had a negative effect on source memory when different source dimensions were used in Phases 1 and 2 (Experiment 3) and when participants were explicitly instructed to learn source information in Phase 2 (Experiments 4 and 5). Importantly, when the order between Phases 1 and 2 was reversed, such that item repetition occurred after the encoding of critical item-source combinations, item repetition no longer affected source memory (Experiment 6). Overall, our findings did not support predictions based on item predifferentiation, within-dimension source interference, or general interference from multiple traces of an item. Rather, the findings were consistent with the idea that prior item repetition reduces attention to subsequent presentations of the item, decreasing the likelihood that critical item-source associations will be encoded.

  11. The Role of Medial Temporal Lobe Regions in Incidental and Intentional Retrieval of Item and Relational Information in Aging.

    Science.gov (United States)

    Wang, Wei-Chun; Giovanello, Kelly S

    2016-06-01

    Considerable neuropsychological and neuroimaging work indicates that the medial temporal lobes are critical for both item and relational memory retrieval. However, there remain outstanding issues in the literature, namely the extent to which medial temporal lobe regions are differentially recruited during incidental and intentional retrieval of item and relational information, and the extent to which aging may affect these neural substrates. The current fMRI study sought to address these questions; participants incidentally encoded word pairs embedded in sentences and incidental item and relational retrieval were assessed through speeded reading of intact, rearranged, and new word-pair sentences, while intentional item and relational retrieval were assessed through old/new associative recognition of a separate set of intact, rearranged, and new word pairs. Results indicated that, in both younger and older adults, anterior hippocampus and perirhinal cortex indexed incidental and intentional item retrieval in the same manner. In contrast, posterior hippocampus supported incidental and intentional relational retrieval in both age groups and an adjacent cluster in posterior hippocampus was recruited during both forms of relational retrieval for older, but not younger, adults. Our findings suggest that while medial temporal lobe regions do not differentiate between incidental and intentional forms of retrieval, there are distinct roles for anterior and posterior medial temporal lobe regions during retrieval of item and relational information, respectively, and further indicate that posterior regions may, under certain conditions, be over-recruited in healthy aging. © 2016 Wiley Periodicals, Inc.

  12. Cash Impact of the Consumable Item Transfer, Phase II

    National Research Council Canada - National Science Library

    1998-01-01

    ...). This report is the third in a series of reports regarding the consumable item transfer (CIT), phase II. The Deputy Secretary of Defense directed the transfer of the management of consumable items to Defense Logistics Agency...

  13. Negative affect impairs associative memory but not item memory.

    OpenAIRE

    Bisby, J. A.; Burgess, N.

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how they interact with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 ...

  14. 26 CFR 301.6501(o)-3 - Partnership items.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 18 2010-04-01 2010-04-01 false Partnership items. 301.6501(o)-3 Section 301... § 301.6501(o)-3 Partnership items. (a) Partnership item defined. For purposes of section 6501(o) (as it..., and § 301.6511(g)-1, the term “partnership item” means— (1) Any item required to be taken into account...

  15. On multidimensional item response theory -- a coordinate free approach

    OpenAIRE

    Antal, Tamás

    2007-01-01

    A coordinate system free definition of complex structure multidimensional item response theory (MIRT) for dichotomously scored items is presented. The point of view taken emphasizes the possibilities and subtleties of understanding MIRT as a multidimensional extension of the "classical" unidimensional item response theory models. The main theorem of the paper is that every monotonic MIRT model looks the same; they are all trivial extensions of univariate item response theory.

  16. Measurement equivalence of the KINDL questionnaire across child self-reports and parent proxy-reports: a comparison between item response theory and ordinal logistic regression.

    Science.gov (United States)

    Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara

    2014-06-01

    Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50%. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
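
    The reported agreement rate can be reproduced from the counts in the abstract. The sketch below rebuilds the 2x2 cross-classification of the two methods from the stated totals; the function and variable names are illustrative, not the authors' code.

```python
def dif_agreement(n_items, flagged_a, flagged_b, flagged_both):
    """Cross-classify two DIF methods and return the 2x2 cells and agreement rate."""
    only_a = flagged_a - flagged_both          # flagged by method A only
    only_b = flagged_b - flagged_both          # flagged by method B only
    neither = n_items - flagged_both - only_a - only_b
    rate = (flagged_both + neither) / n_items  # share of items classified the same way
    return {"both": flagged_both, "only_a": only_a,
            "only_b": only_b, "neither": neither, "rate": rate}

# Counts from the abstract: 24 items, 12 flagged by GRM, 14 by OLR, 7 by both.
table = dif_agreement(n_items=24, flagged_a=12, flagged_b=14, flagged_both=7)
```

    The derived "neither" cell equals the five DIF-free items that the abstract says were common to both methods, and (7 + 5) / 24 gives the stated 50%.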

  17. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    Science.gov (United States)

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good, as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Tailored Cloze: Improved with Classical Item Analysis Techniques.

    Science.gov (United States)

    Brown, James Dean

    1988-01-01

    The reliability and validity of a cloze procedure used as an English-as-a-second-language (ESL) test in China were improved by applying traditional item analysis and selection techniques. The 'best' test items were chosen on the basis of item facility and discrimination indices, and were administered as a 'tailored cloze.' 29 references listed.…

  19. Electronics. Criterion-Referenced Test (CRT) Item Bank.

    Science.gov (United States)

    Davis, Diane, Ed.

    This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…

  20. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  1. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  2. 41 CFR 101-27.209-1 - GSA stock items.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true GSA stock items. 101-27.209-1 Section 101-27.209-1 Public Contracts and Property Management Federal Property Management...-Management of Shelf-Life Materials § 101-27.209-1 GSA stock items. Shelf-life items that meet the criteria...

  3. ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).

    Science.gov (United States)

    Australian Council for Educational Research, Hawthorn.

    This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…

  4. Computerized adaptive testing item selection in computerized adaptive learning systems

    NARCIS (Netherlands)

    Eggen, Theodorus Johannes Hendrikus Maria; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item selection methods traditionally developed for computerized adaptive testing (CAT) are explored for their usefulness in item-based computerized adaptive learning (CAL) systems. While in CAT Fisher information-based selection is optimal, for recovering learning populations in CAL systems item

  5. 12 CFR 210.8 - Presenting noncash items for acceptance.

    Science.gov (United States)

    2010-01-01

    ... for acceptance. (a) A Reserve Bank or a subsequent collecting bank may, if instructed by the sender, present a noncash item for acceptance in any manner authorized by law if— (1) The item provides that it... 12 Banks and Banking 2 2010-01-01 2010-01-01 false Presenting noncash items for acceptance. 210.8...

  6. Writing, Evaluating and Assessing Data Response Items in Economics.

    Science.gov (United States)

    Trotman-Dickenson, D. I.

    1989-01-01

    Describes some of the problems in writing data response items in economics for use by A Level and General Certificate of Secondary Education (GCSE) students. Examines the experience of two series of workshops on writing items, evaluating them and assessing responses from schools. Offers suggestions for producing packages of data response items as…

  7. Item Response Theory Models for Performance Decline during Testing

    Science.gov (United States)

    Jin, Kuan-Yu; Wang, Wen-Chung

    2014-01-01

    Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

  8. Vegetable parenting practices scale: Item response modeling analyses

    Science.gov (United States)

    Our objective was to evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We al...

  9. A simple and fast item selection procedure for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn; Berger, Martijn P.F.

    1994-01-01

    Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item

  10. Optimal item discrimination and maximum information for logistic IRT models

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn P.F.; Berger, Martijn

    1999-01-01

    Items with the highest discrimination parameter values in a logistic item response theory model do not necessarily give maximum information. This paper derives discrimination parameter values, as functions of the guessing parameter and distances between person parameters and item difficulty, that
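
    The opening claim of this abstract can be checked numerically. The sketch below uses the standard Fisher information formula for the three-parameter logistic model (with the scaling constant D omitted); the specific parameter values are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta (scaling constant omitted)."""
    p_star = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # logistic part without guessing
    p = c + (1.0 - c) * p_star                         # probability of a correct response
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# Hypothetical item: difficulty 0, guessing 0.2, person two units above the difficulty.
b, c, theta = 0.0, 0.2, 2.0
info_low_a = info_3pl(theta, a=1.0, b=b, c=c)   # moderately discriminating item
info_high_a = info_3pl(theta, a=3.0, b=b, c=c)  # highly discriminating item
```

    Under these assumed values the item with a = 1 carries more information than the item with a = 3, because the steeper curve has already flattened out two units away from the item difficulty.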

  11. Psychometric validation of the Persian nine-item Internet Gaming Disorder Scale – Short Form: Does gender and hours spent online gaming affect the interpretations of item descriptions?

    Science.gov (United States)

    Wu, Tzu-Yi; Lin, Chung-Ying; Årestedt, Kristofer; Griffiths, Mark D.; Broström, Anders; Pakpour, Amir H.

    2017-01-01

    Background and aims: The nine-item Internet Gaming Disorder Scale – Short Form (IGDS-SF9) is brief and effective to evaluate Internet Gaming Disorder (IGD) severity. Although its scores show promising psychometric properties, less is known about whether different groups of gamers interpret the items similarly. This study aimed to verify the construct validity of the Persian IGDS-SF9 and examine the scores in relation to gender and hours spent online gaming among 2,363 Iranian adolescents. Methods: Confirmatory factor analysis (CFA) and Rasch analysis were used to examine the construct validity of the IGDS-SF9. The effects of gender and time spent online gaming per week were investigated by multigroup CFA and Rasch differential item functioning (DIF). Results: The unidimensionality of the IGDS-SF9 was supported in both CFA and Rasch. However, Item 4 (fail to control or cease gaming activities) displayed DIF (DIF contrast = 0.55) slightly over the recommended cutoff in Rasch but was invariant in multigroup CFA across gender. Items 4 (DIF contrast = −0.67) and 9 (jeopardize or lose an important thing because of gaming activity; DIF contrast = 0.61) displayed DIF in Rasch and were non-invariant in multigroup CFA across time spent online gaming. Conclusions: Given the Persian IGDS-SF9 was unidimensional, it is concluded that the instrument can be used to assess IGD severity. However, users of the instrument are cautioned concerning the comparisons of the sum scores of the IGDS-SF9 across gender and across adolescents spending different amounts of time online gaming. PMID:28571474
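
    A minimal sketch of how the reported DIF contrasts could be screened follows. The abstract does not state the cutoff it used; the value of 0.5 logits below is an assumption (a commonly used threshold in Rasch DIF analysis), and the function name is illustrative.

```python
ASSUMED_CUTOFF = 0.5  # logits; assumed here, not stated in the abstract

def flag_dif(contrasts, cutoff=ASSUMED_CUTOFF):
    """Flag items whose absolute DIF contrast exceeds the cutoff."""
    return {item: abs(value) > cutoff for item, value in contrasts.items()}

# DIF contrasts as reported in the abstract.
by_gender = flag_dif({"Item 4": 0.55})
by_gaming_time = flag_dif({"Item 4": -0.67, "Item 9": 0.61})
```

    The sign of a contrast only indicates which group finds the item harder to endorse, so the screening uses the absolute value.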

  12. Psychometric validation of the Persian nine-item Internet Gaming Disorder Scale - Short Form: Does gender and hours spent online gaming affect the interpretations of item descriptions?

    Science.gov (United States)

    Wu, Tzu-Yi; Lin, Chung-Ying; Årestedt, Kristofer; Griffiths, Mark D; Broström, Anders; Pakpour, Amir H

    2017-06-01

    Background and aims: The nine-item Internet Gaming Disorder Scale - Short Form (IGDS-SF9) is brief and effective to evaluate Internet Gaming Disorder (IGD) severity. Although its scores show promising psychometric properties, less is known about whether different groups of gamers interpret the items similarly. This study aimed to verify the construct validity of the Persian IGDS-SF9 and examine the scores in relation to gender and hours spent online gaming among 2,363 Iranian adolescents. Methods: Confirmatory factor analysis (CFA) and Rasch analysis were used to examine the construct validity of the IGDS-SF9. The effects of gender and time spent online gaming per week were investigated by multigroup CFA and Rasch differential item functioning (DIF). Results: The unidimensionality of the IGDS-SF9 was supported in both CFA and Rasch. However, Item 4 (fail to control or cease gaming activities) displayed DIF (DIF contrast = 0.55) slightly over the recommended cutoff in Rasch but was invariant in multigroup CFA across gender. Items 4 (DIF contrast = -0.67) and 9 (jeopardize or lose an important thing because of gaming activity; DIF contrast = 0.61) displayed DIF in Rasch and were non-invariant in multigroup CFA across time spent online gaming. Conclusions: Given the Persian IGDS-SF9 was unidimensional, it is concluded that the instrument can be used to assess IGD severity. However, users of the instrument are cautioned concerning the comparisons of the sum scores of the IGDS-SF9 across gender and across adolescents spending different amounts of time online gaming.

  13. Solving Differential Equations Analytically. Elementary Differential Equations. Modules and Monographs in Undergraduate Mathematics and Its Applications Project. UMAP Unit 335.

    Science.gov (United States)

    Goldston, J. W.

    This unit introduces analytic solutions of ordinary differential equations. The objective is to enable the student to decide whether a given function solves a given differential equation. Examples of problems from biology and chemistry are covered. Problem sets, quizzes, and a model exam are included, and answers to all items are provided. The…

  14. Brief Sensation Seeking Scale: Latent structure of 8-item and 4-item versions in Peruvian adolescents.

    Science.gov (United States)

    Merino-Soto, Cesar; Salas Blas, Edwin

    2018-01-01

    This research aimed to validate two brief sensation seeking scales with Peruvian adolescents: the eight-item scale (BSSS8; Hoyle, Stephenson, Palmgreen, Lorch, & Donohew, 2002) and the four-item scale (BSSS4; Stephenson, Hoyle, Slater, & Palmgreen, 2003). Questionnaires were administered to 618 voluntary participants, with an average age of 13.6 years, from different high school grade levels in state and private schools in a district in the south of Lima. The internal structure of both short versions was analyzed using three models: a) unidimensional (M1), b) oblique or related dimensions (M2), and c) the bifactor model (M3). Results show that a single dimension best represents the variability of the items in both instruments, a fact that can be explained both by the complexity of the concept and by the small number of items representing each factor, which is more noticeable in the BSSS4. Reliability is within the levels found by previous studies: alpha of .745 in the BSSS8 and .643 in the BSSS4; omega coefficient of .747 in the BSSS8 and .651 in the BSSS4. These are considered suitable for the type of instruments studied. Based on the correlation between the two instruments, it was found that there are satisfactory levels of equivalence between the BSSS8 and BSSS4. However, it is recommended that the BSSS4 be used mainly for research and for the purpose of describing populations.

  15. Non-ignorable missingness item response theory models for choice effects in examinee-selected items.

    Science.gov (United States)

    Liu, Chen-Wei; Wang, Wen-Chung

    2017-11-01

    Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.

  16. An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

    Science.gov (United States)

    Ames, Allison J.; Penfield, Randall D.

    2015-01-01

    Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…

  17. Methods for Assessing Item, Step, and Threshold Invariance in Polytomous Items Following the Partial Credit Model

    Science.gov (United States)

    Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W.

    2008-01-01

    Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…

  18. Random selection of items. Selection of n1 samples among N items composing a stratum

    International Nuclear Information System (INIS)

    Jaech, J.L.; Lemaire, R.J.

    1987-02-01

    STR-224 provides generalized procedures to determine required sample sizes, for instance in the course of a Physical Inventory Verification at Bulk Handling Facilities. The present report describes procedures to generate random numbers and select groups of items to be verified in a given stratum through each of the measurement methods involved in the verification. (author). 3 refs
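
    The report's own procedures are not reproduced in this abstract. As a rough sketch of the general idea only, the following draws a simple random sample of n items without replacement from a stratum of N numbered items, using Python's standard library; the function name, stratum size, sample size and seed are all hypothetical.

```python
import random

def select_items(stratum_size, sample_size, seed=None):
    """Return sorted item numbers (1..N) of a simple random sample without replacement."""
    if not 0 <= sample_size <= stratum_size:
        raise ValueError("sample_size must be between 0 and stratum_size")
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible for audit
    return sorted(rng.sample(range(1, stratum_size + 1), sample_size))

# Hypothetical stratum of 200 items from which 12 are selected for verification.
chosen = select_items(stratum_size=200, sample_size=12, seed=1987)
```

    Recording the seed alongside the selection lets the same list be regenerated later, which is useful when a verification has to be documented.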

  19. Dependability of technical items: Problems of standardization

    Science.gov (United States)

    Fedotova, G. A.; Voropai, N. I.; Kovalev, G. F.

    2016-12-01

    This paper is concerned with problems that arose in the development of a new version of the Interstate Standard GOST 27.002 "Industrial product dependability. Terms and definitions". This Standard covers a wide range of technical items and is used in numerous regulations, specifications, and standards and technical documentation. The currently available State Standard GOST 27.002-89 was introduced in 1990. Its development involved scientists and experts from different technical areas, and its draft was debated before different audiences and constantly refined, so it was a high-quality document. However, after 25 years of application it has become necessary to develop a new version of the Standard that reflects the current understanding of industrial dependability and accounts for the changes taking place in Russia in the production, management and development of various technical systems and facilities. The development of a new version of the Standard makes it possible to generalize, on a terminological level, the knowledge and experience in the area of reliability of technical items accumulated over a quarter of a century in different industries and reliability research schools, and to account for domestic and foreign experience of standardization. Working on the new version of the Standard, we faced a number of issues and problems in harmonizing it with the International Standard IEC 60050-192, caused first of all by different approaches to the use of terms and differences in the mentalities of experts from different countries. The paper focuses on the problems related to the chapter "Maintenance, restoration and repair", where the developers found it difficult to harmonize term definitions both with experts and with the International Standard, mainly because of differences between the Russian concept and practice of maintenance and repair and foreign ones.

  20. Are great apes able to reason from multi-item samples to populations of food items?

    Science.gov (United States)

    Eckert, Johanna; Rakoczy, Hannes; Call, Josep

    2017-10-01

    Inductive learning from limited observations is a cognitive capacity of fundamental importance. In humans, it is underwritten by our intuitive statistics, the ability to draw systematic inferences from populations to randomly drawn samples and vice versa. According to recent research in cognitive development, human intuitive statistics develops early in infancy. Recent work in comparative psychology has produced first evidence for analogous cognitive capacities in great apes who flexibly drew inferences from populations to samples. In the present study, we investigated whether great apes (Pongo abelii, Pan troglodytes, Pan paniscus, Gorilla gorilla) also draw inductive inferences in the opposite direction, from samples to populations. In two experiments, apes saw an experimenter randomly drawing one multi-item sample from each of two populations of food items. The populations differed in their proportion of preferred to neutral items (24:6 vs. 6:24) but apes saw only the distribution of food items in the samples that reflected the distribution of the respective populations (e.g., 4:1 vs. 1:4). Based on this observation they were then allowed to choose between the two populations. Results show that apes seemed to make inferences from samples to populations and thus chose the population from which the more favorable (4:1) sample was drawn in Experiment 1. In this experiment, the more attractive sample not only contained proportionally but also absolutely more preferred food items than the less attractive sample. Experiment 2, however, revealed that when absolute and relative frequencies were disentangled, apes performed at chance level. Whether these limitations in apes' performance reflect true limits of cognitive competence or merely performance limitations due to accessory task demands is still an open question. © 2017 Wiley Periodicals, Inc.
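
    The inference the apes were asked to make can be quantified with a hypergeometric likelihood. The sketch below assumes a sample of five items drawn without replacement, which is an illustrative reading of the "4:1" example; the actual sample sizes are not given in this abstract.

```python
from math import comb

def hypergeom_pmf(k, n_draws, n_preferred, n_total):
    """Probability of k preferred items in n_draws taken without replacement."""
    return (comb(n_preferred, k) * comb(n_total - n_preferred, n_draws - k)
            / comb(n_total, n_draws))

# Likelihood of observing a 4:1 sample under each population (30 items each).
p_from_rich = hypergeom_pmf(k=4, n_draws=5, n_preferred=24, n_total=30)  # 24:6 population
p_from_poor = hypergeom_pmf(k=4, n_draws=5, n_preferred=6, n_total=30)   # 6:24 population
likelihood_ratio = p_from_rich / p_from_poor
```

    Under these assumptions a 4:1 sample is far more likely to have come from the 24:6 population, which is the statistical basis for choosing the population behind the more favorable sample.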

  1. Differential discriminator

    International Nuclear Information System (INIS)

    Dukhanov, V.I.; Mazurov, I.B.

    1981-01-01

    A principal flowsheet of a differential discriminator intended for operation in a spectrometric circuit with statistical time distribution of pulses is described. The differential discriminator includes four integrated discriminators and a channel of piled-up signal rejection. The presence of the rejection channel enables the discriminator to operate effectively at loads of 14×10³ pulse/s. The temperature instability of the discrimination thresholds equals 250 μV/°C. The discrimination level changes within 0.1-5 V, the level shift constitutes 0.5% for the filling ratio of 1:10. The rejection coefficient is not less than 90%. Alpha spectrum of the ²²⁸Th source is presented to evaluate the discriminator operation with the rejector. The rejector provides 50 ns time resolution.

  2. Differential topology

    CERN Document Server

    Margalef-Roig, J

    1992-01-01

    ...there are reasons enough to warrant a coherent treatment of the main body of differential topology in the realm of Banach manifolds, which is at the same time correct and complete. This book fills the gap: whenever possible the manifolds treated are Banach manifolds with corners. Corners add to the complications and the authors have carefully fathomed the validity of all main results at corners. Even in finite dimensions some results at corners are more complete and better thought out here than elsewhere in the literature. The proofs are correct and with all details. I see this book as a reliable monograph of a well-defined subject; the possibility to fall back to it adds to the feeling of security when climbing in the more dangerous realms of infinite dimensional differential geometry. Peter W. Michor

  3. Differential belongings

    DEFF Research Database (Denmark)

    Oldrup, Helene

    2014-01-01

    This paper explores suburban middle-class residents’ narratives about housing choice, everyday life and belonging in residential areas of Greater Copenhagen, Denmark, to understand how residential processes of social differentiation are constituted. Using Savage et al.’s concepts of discursive...... and not only to the area itself. In addition, rather than seeing suburban residential areas as homogenous, greater attention should be paid to differences within such areas....

  4. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    Science.gov (United States)

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  5. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  6. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  7. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  8. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  9. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  10. Criteria for eliminating items of a Test of Figural Analogies

    Directory of Open Access Journals (Sweden)

    Diego Blum

    2013-12-01

    This paper describes the steps taken to eliminate two of the items in a Test of Figural Analogies (TFA). The main guidelines of psychometric analysis concerning Classical Test Theory (CTT) and Item Response Theory (IRT) are explained. The item elimination process was based on both the study of the CTT difficulty and discrimination index, and the unidimensionality analysis. The a, b, and c parameters of the Three Parameter Logistic Model of IRT were also considered for this purpose, as well as the assessment of each item fitting this model. The unfavourable characteristics of a group of TFA items are detailed, and decisions leading to their possible elimination are discussed.
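
    The a, b, and c parameters mentioned in this abstract can be illustrated with a minimal Python sketch of the three-parameter logistic item response function. The function name and the example parameter values are hypothetical, and the scaling constant D = 1.7 that some formulations include is omitted here.

```python
import math


def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: pseudo-guessing parameter
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: when ability equals difficulty, the probability lies
# halfway between the guessing floor c and 1.
p_at_difficulty = three_pl_probability(theta=0.0, a=1.2, b=0.0, c=0.2)
```

    An item with a low a (flat curve) or a high c (easy to guess) contributes little information, which is one reason such parameters may be consulted when deciding whether to drop an item.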

  11. Item response modeling: a psychometric assessment of the children's fruit, vegetable, water, and physical activity self-efficacy scales among Chinese children.

    Science.gov (United States)

    Wang, Jing-Jing; Chen, Tzu-An; Baranowski, Tom; Lau, Patrick W C

    2017-09-16

    This study aimed to evaluate the psychometric properties of four self-efficacy scales (i.e., self-efficacy for fruit (FSE), vegetable (VSE), and water (WSE) intakes, and physical activity (PASE)) and to investigate their differences in item functioning across sex, age, and body weight status groups using item response modeling (IRM) and differential item functioning (DIF). Four self-efficacy scales were administered to 763 Hong Kong Chinese children (55.2% boys) aged 8-13 years. Classical test theory (CTT) was used to examine the reliability and factorial validity of scales. IRM was conducted and DIF analyses were performed to assess the characteristics of item parameter estimates on the basis of children's sex, age and body weight status. All self-efficacy scales demonstrated adequate to excellent internal consistency reliability (Cronbach's α: 0.79-0.91). One FSE misfit item and one PASE misfit item were detected. Small DIF was found for all the scale items across children's age groups. Items with medium to large DIF were detected in different sex and body weight status groups, which will require modification. A Wright map revealed that items covered the range of the distribution of participants' self-efficacy for each scale except VSE. Several self-efficacy scales' items functioned differently by children's sex and body weight status. Additional research is required to modify the four self-efficacy scales to minimize these moderating influences for application.

  12. Sources of interference in item and associative recognition memory.

    Science.gov (United States)

    Osth, Adam F; Dennis, Simon

    2015-04-01

    A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved.

  13. Constructing the 32-item Fitness-to-Drive Screening Measure.

    Science.gov (United States)

    Medhizadah, Shabnam; Classen, Sherrilene; Johnson, Andrew M

    2018-04-01

    The Fitness-to-Drive Screening Measure © (FTDS) enables proxies to identify at-risk older drivers via 54 driving-related items, but may be too lengthy for widespread uptake. We reduced the number of items in the FTDS and validated the shorter measure, using 200 caregiver responses. Exploratory factor analysis and classical test theory techniques were used to determine the most interpretable factor model and the minimum number of items to be used for predicting fitness to drive. The extent to which the shorter FTDS predicted the results of the 54-item FTDS was evaluated through correlational analysis. A three-factor model best represented the empirical data. Classical test theory techniques led to the development of the 32-item FTDS. The 32-item FTDS was highly correlated (r = .99, p = .05) with the FTDS. The 32-item FTDS may provide raters with a faster and more efficient way to identify at-risk older drivers.

  14. Feed mechanism and method for feeding minute items

    Science.gov (United States)

    Stringer, Timothy Kent [Bucyrus, KS]; Yerganian, Simon Scott [Lee's Summit, MO]

    2009-10-20

    A feeding mechanism and method for feeding minute items, such as capacitors, resistors, or solder preforms. The mechanism is adapted to receive a plurality of the randomly-positioned and randomly-oriented extremely small or minute items, and to isolate, orient, and position one or more of the items in a specific repeatable pickup location wherefrom they may be removed for use by, for example, a computer-controlled automated assembly machine. The mechanism comprises a sliding shelf adapted to receive and support the items; a wiper arm adapted to achieve a single even layer of the items; and a pushing arm adapted to push the items into the pickup location. The mechanism can be adapted for providing the items with a more exact orientation, and can also be adapted for use in a liquid environment.

  15. Negative affect impairs associative memory but not item memory.

    Science.gov (United States)

    Bisby, James A; Burgess, Neil

    2013-12-17

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 demonstrated that item memory was facilitated by emotional affect, whereas memory for an associated context was reduced. In Experiment 2, arousal was manipulated independently of the memoranda, by a threat of shock, whereby encoding trials occurred under conditions of threat or safety. Memory for context was equally impaired by the presence of negative affect, whether induced by threat of shock or a negative item, relative to retrieval of the context of a neutral item in safety. In Experiment 3, participants were presented with neutral and negative items as paired associates, including all combinations of neutral and negative items. The results showed both above effects: compared to a neutral item, memory for the associate of a negative item (a second item here, context in Experiments 1 and 2) is impaired, whereas retrieval of the item itself is enhanced. Our findings suggest that negative affect impairs associative memory while recognition of a negative item is enhanced. They support dual-processing models in which negative affect or stress impairs hippocampal-dependent associative memory while the storage of negative sensory/perceptual representations is spared or even strengthened.

  16. Distinguishing Differential Testlet Functioning from Differential Bundle Functioning Using the Multilevel Measurement Model

    Science.gov (United States)

    Beretvas, S. Natasha; Walker, Cindy M.

    2012-01-01

    This study extends the multilevel measurement model to handle testlet-based dependencies. A flexible two-level testlet response model (the MMMT-2 model) for dichotomous items is introduced that permits assessment of differential testlet functioning (DTLF). A distinction is made between this study's conceptualization of DTLF and that of…

  17. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    Science.gov (United States)

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  18. Does remembering emotional items impair recall of same-emotion items?

    Science.gov (United States)

    Sison, Jo Ann G; Mather, Mara

    2007-04-01

    In the part-set cuing effect, cuing a subset of previously studied items impairs recall of the remaining noncued items. This experiment reveals that cuing participants with previously-studied emotional pictures (e.g., fear-evoking pictures of people) can impair recall of pictures involving the same emotion but different content (e.g., fear-evoking pictures of animals). This indicates that new events can be organized in memory using emotion as a grouping function to create associations. However, whether new information is organized in memory along emotional or nonemotional lines appears to be a flexible process that depends on people's current focus. Mentioning in the instructions that the pictures were either amusement- or fear-related led to memory impairment for pictures with the same emotion as cued pictures, whereas mentioning that the pictures depicted either animals or people led to memory impairment for pictures with the same type of actor.

  19. Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

    Science.gov (United States)

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

    2018-03-12

    Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.

  20. Reading ability and print exposure: item response theory analysis of the author recognition test.

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.
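
    The standard scoring rule described in this abstract (names selected minus foils selected) and a heavier foil penalty can be sketched in Python as follows. The function name, the placeholder names, and the penalty value of 2.0 are illustrative assumptions; the abstract does not state the penalty the authors used.

```python
def art_score(selected, authors, foils, foil_penalty=1.0):
    """Score an author recognition test response.

    selected: names the participant marked as authors
    authors: the real author names on the test
    foils: the non-author names on the test
    foil_penalty: weight subtracted per foil selected (1.0 = standard scoring)
    """
    hits = len(set(selected) & set(authors))
    false_alarms = len(set(selected) & set(foils))
    return hits - foil_penalty * false_alarms


# Hypothetical response: two real authors and one foil selected.
authors = {"author_1", "author_2", "author_3"}
foils = {"foil_1", "foil_2"}
selected = ["author_1", "author_2", "foil_1"]

standard = art_score(selected, authors, foils)                   # 2 - 1
stricter = art_score(selected, authors, foils, foil_penalty=2.0)  # 2 - 2
```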

  1. Differential geometry

    CERN Document Server

    Ciarlet, Philippe G

    2007-01-01

    This book gives the basic notions of differential geometry, such as the metric tensor, the Riemann curvature tensor, the fundamental forms of a surface, covariant derivatives, and the fundamental theorem of surface theory in a selfcontained and accessible manner. Although the field is often considered a classical one, it has recently been rejuvenated, thanks to the manifold applications where it plays an essential role. The book presents some important applications to shells, such as the theory of linearly and nonlinearly elastic shells, the implementation of numerical methods for shells, and

  2. Differential equations

    CERN Document Server

    Tricomi, FG

    2013-01-01

    Based on his extensive experience as an educator, F. G. Tricomi wrote this practical and concise teaching text to offer a clear idea of the problems and methods of the theory of differential equations. The treatment is geared toward advanced undergraduates and graduate students and addresses only questions that can be resolved with rigor and simplicity. Starting with a consideration of the existence and uniqueness theorem, the text advances to the behavior of the characteristics of a first-order equation, boundary problems for second-order linear equations, asymptotic methods, and diff

  3. Differential topology

    CERN Document Server

    Guillemin, Victor

    2010-01-01

    Differential Topology provides an elementary and intuitive introduction to the study of smooth manifolds. In the years since its first publication, Guillemin and Pollack's book has become a standard text on the subject. It is a jewel of mathematical exposition, judiciously picking exactly the right mixture of detail and generality to display the richness within. The text is mostly self-contained, requiring only undergraduate analysis and linear algebra. By relying on a unifying idea, transversality, the authors are able to avoid the use of big machinery or ad hoc techniques to establish the main

  4. Item and test analysis to identify quality multiple choice questions (MCQs) from an assessment of medical students of Ahmedabad, Gujarat

    Directory of Open Access Journals (Sweden)

    Sanju Gajjar

    2014-01-01

    Background: Multiple choice questions (MCQs) are frequently used to assess students in different educational streams for their objectivity and wide reach of coverage in less time. However, the MCQs to be used must be of quality, which depends upon their difficulty index (DIF I), discrimination index (DI) and distractor efficiency (DE). Objective: To evaluate MCQs or items and develop a pool of valid items by assessing with DIF I, DI and DE, and also to revise, store or discard items based on obtained results. Settings: Study was conducted in a medical school of Ahmedabad. Materials and Methods: An internal examination in Community Medicine was conducted after 40 hours of teaching during 1st MBBS, which was attended by 148 out of 150 students. A total of 50 MCQs or items and 150 distractors were analyzed. Statistical Analysis: Data was entered and analyzed in MS Excel 2007; simple proportions, mean, standard deviations and coefficient of variation were calculated, and an unpaired t test was applied. Results: Out of 50 items, 24 had "good to excellent" DIF I (31 - 60%) and 15 had "good to excellent" DI (> 0.25). Mean DE was 88.6%, considered as ideal/acceptable, and non-functional distractors (NFD) were only 11.4%. Mean DI was 0.14. Poor DI (< 0.15) with negative DI in 10 items indicates poor preparedness of students and some issues with framing of at least some of the MCQs. An increased proportion of NFDs (incorrect alternatives selected by < 5% of students) in an item decreases DE and makes the item easier. There were 15 items with 17 NFDs, while the remaining items had no NFDs, giving them a mean DE of 100%. Conclusion: Study emphasizes the selection of quality MCQs which truly assess the knowledge and are able to differentiate the students of different abilities in correct manner.
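
    The three indices named in this abstract can be sketched in Python as follows. The abstract does not give the exact formulas, so this sketch assumes common conventions: the difficulty index is the percentage of all examinees answering correctly, the discrimination index is the difference in proportion correct between the top and bottom 27% of examinees by total score, and distractor efficiency is the percentage of distractors that are functional. Only the 5% threshold for a non-functional distractor comes from the abstract; the function name and the data are hypothetical.

```python
def item_analysis(responses, key, options, total_scores,
                  group_fraction=0.27, nfd_threshold=0.05):
    """Return (difficulty %, discrimination, non-functional distractors, DE %)."""
    n = len(responses)
    correct = [r == key for r in responses]

    # Difficulty index: percentage of examinees who answered correctly.
    difficulty = 100.0 * sum(correct) / n

    # Discrimination index: upper-group minus lower-group proportion correct.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    g = max(1, round(group_fraction * n))
    upper, lower = order[:g], order[-g:]
    discrimination = (sum(correct[i] for i in upper)
                      - sum(correct[i] for i in lower)) / g

    # Non-functional distractors: wrong options chosen by < 5% of examinees.
    distractors = [o for o in options if o != key]
    nfd = [d for d in distractors if responses.count(d) / n < nfd_threshold]
    efficiency = 100.0 * (len(distractors) - len(nfd)) / len(distractors)
    return difficulty, discrimination, nfd, efficiency


# Hypothetical item answered by 20 examinees, listed from highest to lowest score.
responses = ["A"] * 10 + ["B"] * 5 + ["C"] * 5
scores = list(range(20, 0, -1))
difficulty, discrimination, nfd, efficiency = item_analysis(
    responses, key="A", options=["A", "B", "C", "D"], total_scores=scores)
```

    In this made-up example option D is never chosen, so it counts as a non-functional distractor and the item's distractor efficiency falls to two out of three.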

  5. Editorial Changes and Item Performance: Implications for Calibration and Pretesting

    Directory of Open Access Journals (Sweden)

    Heather Stoffel

    2014-11-01

    Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.

  6. A note on monotonicity of item response functions for ordered polytomous item response theory models.

    Science.gov (United States)

    Kang, Hyeon-Ah; Su, Ya-Hui; Chang, Hua-Hua

    2018-03-08

    A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales. © 2018 The British Psychological Society.

  7. Efficient Algorithms for Segmentation of Item-Set Time Series

    Science.gov (United States)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
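
    The dynamic programming scheme outlined in this abstract can be sketched in Python as follows. The abstract does not give the paper's measure functions or its exact definition of segment difference, so this sketch assumes the union as the measure function and counts, for each time point, the items of the segment's set that the time point lacks. It also recomputes each segment difference naively rather than using the efficient algorithms the authors describe. All names are hypothetical.

```python
def segment_difference(series, i, j):
    """Assumed segment difference for time points i..j-1 (union measure)."""
    segment_set = set().union(*series[i:j])
    return sum(len(segment_set ^ point) for point in series[i:j])


def optimal_segmentation(series, k):
    """Split the series into k contiguous segments with minimum total difference."""
    n = len(series)
    inf = float("inf")
    # cost[m][j]: best total difference for series[:j] split into m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = cost[m - 1][i] + segment_difference(series, i, j)
                if candidate < cost[m][j]:
                    cost[m][j] = candidate
                    back[m][j] = i
    # Walk the back-pointers to recover (start, end) pairs, end exclusive.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    return cost[k][n], bounds[::-1]


# Hypothetical item-set time series with four time points.
series = [{"a", "b"}, {"a", "b"}, {"c"}, {"c", "d"}]
total, segments = optimal_segmentation(series, k=2)
```

    For this made-up series the best two-segment split groups the first two time points together and the last two together, since any other boundary merges unrelated item sets.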

  8. Binary classification of items of interest in a repeatable process

    Science.gov (United States)

    Abell, Jeffrey A.; Spicer, John Patrick; Wincek, Michael Anthony; Wang, Hui; Chakraborty, Debejyo

    2014-06-24

    A system includes host and learning machines in electrical communication with sensors positioned with respect to an item of interest, e.g., a weld, and memory. The host executes instructions from memory to predict a binary quality status of the item. The learning machine receives signals from the sensor(s), identifies candidate features, and extracts features from the candidates that are more predictive of the binary quality status relative to other candidate features. The learning machine maps the extracted features to a dimensional space that includes most of the items from a passing binary class and excludes all or most of the items from a failing binary class. The host also compares the received signals for a subsequent item of interest to the dimensional space to thereby predict, in real time, the binary quality status of the subsequent item of interest.

  9. The basics of item response theory using R

    CERN Document Server

    Baker, Frank B

    2017-01-01

    This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...

  10. Attention restores discrete items to visual short-term memory.

    Science.gov (United States)

    Murray, Alexandra M; Nobre, Anna C; Clark, Ian A; Cravo, André M; Stokes, Mark G

    2013-04-01

    When a memory is forgotten, is it lost forever? Our study shows that selective attention can restore forgotten items to visual short-term memory (VSTM). In our two experiments, all stimuli presented in a memory array were designed to be equally task relevant during encoding. During the retention interval, however, participants were sometimes given a cue predicting which of the memory items would be probed at the end of the delay. This shift in task relevance improved recall for that item. We found that this type of cuing improved recall for items that otherwise would have been irretrievable, providing critical evidence that attention can restore forgotten information to VSTM. Psychophysical modeling of memory performance has confirmed that restoration of information in VSTM increases the probability that the cued item is available for recall but does not improve the representational quality of the memory. We further suggest that attention can restore discrete items to VSTM.

  11. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test

    Science.gov (United States)

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi

    2018-01-01

    Objective: The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods: The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results: The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion: The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with

  12. Differential Item Functioning in the Assessment of ADHD Symptoms Based on Gender and Rating Format

    OpenAIRE

    Arias Martínez, Benito; Universidad de Valladolid; Arias González, Víctor B.; Facultad de Psicología Universidad de Talca Chile; Gómez Sánchez, Laura Elísabet; Universidad de Oviedo; Calleja González, María Angélica Inmaculada; Universidad de Valladolid

    2012-01-01

    The aim of this study was to test the invariance of attention-deficit/hyperactivity disorder (ADHD) symptomatology across gender in a sample of 634 children. First, the fit of five factor models was checked by means of confirmatory factor analysis, and ordinal logistic regression was used as the method for estimating differential item functioning (DIF), both uniform and non-uniform. The results...

  13. Consolidation differentially modulates schema effects on memory for items and associations

    NARCIS (Netherlands)

    van Kesteren, Marlieke T R; Rijpkema, Mark; Ruiter, Dirk J; Fernández, Guillén

    2013-01-01

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated to more efficient encoding and consolidation mechanisms. However, this effect is not always

  14. Consolidation differentially modulates schema effects on memory for items and associations

    NARCIS (Netherlands)

    Kesteren, M.T. van; Rijpkema, M.J.P.; Ruiter, D.J.; Fernandez, G.S.E.

    2013-01-01

    Newly learned information that is congruent with a preexisting schema is often better remembered than information that is incongruent. This schema effect on memory has previously been associated to more efficient encoding and consolidation mechanisms. However, this effect is not always consistently

  15. Exploring Plausible Causes of Differential Item Functioning in the PISA Science Assessment: Language, Curriculum or Culture

    Science.gov (United States)

    Huang, Xiaoting; Wilson, Mark; Wang, Lei

    2016-01-01

    In recent years, large-scale international assessments have been increasingly used to evaluate and compare the quality of education across regions and countries. However, measurement variance between different versions of these assessments often poses threats to the validity of such cross-cultural comparisons. In this study, we investigated the…

  16. Differential Item Functioning and Educational Risk Factors in Guatemalan Reading Assessment

    Directory of Open Access Journals (Sweden)

    Alvaro M. Fortin Morales

    2013-01-01

    We examined indicators of differential item functioning (DIF) associated with four variables that have repeatedly been shown to be risk factors for school achievement. These factors are being over-age for the grade of enrollment, urban/rural area of residence, ethnicity, and gender. For this study we used data from the national third-grade assessments. Because the literature frequently reports that DIF indicators are unstable, we used three different methods to estimate DIF (chi-square, Rasch, and logistic regression) and evaluated their consistency across data from three different assessment years. We found evidence of DIF. However, removing the items with DIF did not change the between-group differences found in the assessment scores. The findings suggest that educational risk factors act jointly in this Guatemalan population and that there is some interaction among these risk factors in generating bias. We conclude that it will be beneficial to take multiple context variables associated with educational risk into account simultaneously when analyzing DIF and when developing assessments.

  17. Gender Differences in Mathematics Achievement in Jordan: A Differential Item Functioning Analysis of the 2015 TIMSS

    Science.gov (United States)

    Innabi, Hanan; Dodeen, Hamzeh

    2018-01-01

    This study is within the framework of the United Nations sustainable development goals related to equitable quality education. The total score on the 2015 Trends in International Mathematics and Science Study that indicated eighth-grade girls in Jordan significantly outperformed boys is hiding many details related to the quality of mathematics…

  18. Hazardous metals in yellow items used in RCAs

    International Nuclear Information System (INIS)

    Brown, K.F.; Rankin, W.N.

    1992-01-01

    Yellow items used in Radiologically Controlled Areas (RCAs) that could contain hazardous metals were identified. X-ray fluorescence analyses indicated that thirty of the fifty-two items do contain hazardous metals. It is important to minimize the hazardous metals put into the wastes. The authors recommend that the specifications for all yellow items stocked in Stores be changed to specify that they contain no hazardous metals.

  19. Safety Evaluation for Packaging (onsite) T Plant Canyon Items

    International Nuclear Information System (INIS)

    OBRIEN, J.H.

    2000-01-01

    This safety evaluation for packaging (SEP) evaluates and documents the ability to safely ship mostly unique inventories of miscellaneous T Plant canyon waste items (T-P Items) encountered during the canyon deck clean off campaign. In addition, this SEP addresses contaminated items and material that may be shipped in a strong tight package (STP). The shipments meet the criteria for onsite shipments as specified by Fluor Hanford in HNF-PRO-154, Responsibilities and Procedures for all Hazardous Material Shipments.

  20. Safety Evaluation for Packaging (onsite) T Plant Canyon Items

    Energy Technology Data Exchange (ETDEWEB)

    OBRIEN, J.H.

    2000-07-14

    This safety evaluation for packaging (SEP) evaluates and documents the ability to safely ship mostly unique inventories of miscellaneous T Plant canyon waste items (T-P Items) encountered during the canyon deck clean off campaign. In addition, this SEP addresses contaminated items and material that may be shipped in a strong tight package (STP). The shipments meet the criteria for onsite shipments as specified by Fluor Hanford in HNF-PRO-154, Responsibilities and Procedures for all Hazardous Material Shipments.

  1. An item response theory analysis of the Executive Interview and development of the EXIT8: A Project FRONTIER Study.

    Science.gov (United States)

    Jahn, Danielle R; Dressel, Jeffrey A; Gavett, Brandon E; O'Bryant, Sid E

    2015-01-01

    The Executive Interview (EXIT25) is an effective measure of executive dysfunction, but may be inefficient due to the time it takes to complete 25 interview-based items. The current study aimed to examine psychometric properties of the EXIT25, with a specific focus on determining whether a briefer version of the measure could comprehensively assess executive dysfunction. The current study applied a graded response model (a type of item response theory model for polytomous categorical data) to identify items that were most closely related to the underlying construct of executive functioning and best discriminated between varying levels of executive functioning. Participants were 660 adults ages 40 to 96 years living in West Texas, who were recruited through an ongoing epidemiological study of rural health and aging, called Project FRONTIER. The EXIT25 was the primary measure examined. Participants also completed the Trail Making Test and Controlled Oral Word Association Test, among other measures, to examine the convergent validity of a brief form of the EXIT25. Eight items were identified that provided the majority of the information about the underlying construct of executive functioning; total scores on these items were associated with total scores on other measures of executive functioning and were able to differentiate between cognitively healthy, mildly cognitively impaired, and demented participants. In addition, cutoff scores were recommended based on sensitivity and specificity of scores. A brief, eight-item version of the EXIT25 may be an effective and efficient screening for executive dysfunction among older adults.

  2. Three controversies over item disclosure in medical licensure examinations

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2015-09-01

    In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations: (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure, by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large-scale testing. These issues require ongoing and careful consideration.

  3. Negative effects of item repetition on source memory

    OpenAIRE

    Kim, Kyungmi; Yi, Do-Joon; Raye, Carol L.; Johnson, Marcia K.

    2012-01-01

    In the present study, we explored how item repetition affects source memory for new item–feature associations (picture–location or picture–color). We presented line drawings varying numbers of times in Phase 1. In Phase 2, each drawing was presented once with a critical new feature. In Phase 3, we tested memory for the new source feature of each item from Phase 2. Experiments 1 and 2 demonstrated and replicated the negative effects of item repetition on incidental source memory. Prior item re...

  4. Quantum partial search for uneven distribution of multiple target items

    Science.gov (United States)

    Zhang, Kun; Korepin, Vladimir

    2018-06-01

    The quantum partial search algorithm is an approximate search. It aims to find a target block (one that contains target items). It runs a little faster than the full Grover search. In this paper, we consider the quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different numbers of target items). The algorithm we describe can locate one of the target blocks. The efficiency of the algorithm is measured by the number of queries to the oracle. We optimize the algorithm in order to improve efficiency. Using a perturbation method, we find that the algorithm runs fastest when target items are evenly distributed in the database.

  5. Data Visualization of Item-Total Correlation by Median Smoothing

    Directory of Open Access Journals (Sweden)

    Chong Ho Yu

    2016-02-01

    This paper aims to illustrate how data visualization could be utilized to identify errors prior to modeling, using an example with multi-dimensional item response theory (MIRT). MIRT combines item response theory and factor analysis to identify a psychometric model that investigates two or more latent traits. While it may seem convenient to accomplish two tasks by employing one procedure, users should be cautious of problematic items that affect both factor analysis and IRT. When sample sizes are extremely large, reliability analyses can misidentify even random numbers as meaningful patterns. Data visualization, such as median smoothing, can be used to identify problematic items in preliminary data cleaning.

  6. A photographic method to measure food item intake. Validation in geriatric institutions.

    Science.gov (United States)

    Pouyet, Virginie; Cuvelier, Gérard; Benattar, Linda; Giboreau, Agnès

    2015-01-01

    From both a clinical and research perspective, measuring food intake is an important issue in geriatric institutions. However, weighing food in this context can be complex, particularly when the items remaining on a plate (side dish, meat or fish and sauce) need to be weighed separately following consumption. A method based on photography that involves taking photographs after a meal to determine food intake consequently seems to be a good alternative. This method enables the storage of raw data so that unhurried analyses can be performed to distinguish the food items present in the images. Therefore, the aim of this paper was to validate a photographic method to measure food intake in terms of differentiating food item intake in the context of a geriatric institution. Sixty-six elderly residents took part in this study, which was performed in four French nursing homes. Four dishes of standardized portions were offered to the residents during 16 different lunchtimes. Three non-trained assessors then independently estimated both the total and specific food item intakes of the participants using images of their plates taken after the meal (photographic method) and a reference image of one plate taken before the meal. Total food intakes were also recorded by weighing the food. To test the reliability of the photographic method, agreements between different assessors and agreements among various estimates made by the same assessor were evaluated. To test the accuracy and specificity of this method, food intake estimates for the four dishes were compared with the food intakes determined using the weighed food method. To illustrate the added value of the photographic method, food consumption differences between the dishes were explained by investigating the intakes of specific food items. 
Although they were not specifically trained for this purpose, the results demonstrated that the assessor estimates agreed between assessors and among various estimates made by the same

  7. Quantitative Analysis of Complex Multiple-Choice Items in Science Technology and Society: Item Scaling

    Directory of Open Access Journals (Sweden)

    Ángel Vázquez Alonso

    2005-05-01

    The scarce attention to assessment and evaluation in science education research has been especially harmful for Science-Technology-Society (STS) education, due to the dialectic, tentative, value-laden, and controversial nature of most STS topics. To overcome the methodological pitfalls of the STS assessment instruments used in the past, an empirically developed instrument (VOSTS, Views on Science-Technology-Society) has been suggested. Some methodological proposals, namely the multiple response models and the computing of a global attitudinal index, were suggested to improve the item implementation. The final step of these methodological proposals requires the categorization of STS statements. This paper describes the process of categorization through a scaling procedure ruled by a panel of experts, acting as judges, according to the body of knowledge from history, epistemology, and sociology of science. The statement categorization allows for the sound foundation of STS items, which is useful in educational assessment and science education research, and may also increase teachers’ self-confidence in the development of the STS curriculum for science classrooms.

  8. Maintenance of item and order information in verbal working memory.

    Science.gov (United States)

    Camos, Valérie; Lagner, Prune; Loaiza, Vanessa M

    2017-09-01

    Although verbal recall of item and order information is well-researched in short-term memory paradigms, there is relatively little research concerning item and order recall from working memory. The following study examined whether manipulating the opportunity for attentional refreshing and articulatory rehearsal in a complex span task differently affected the recall of item- and order-specific information of the memoranda. Five experiments varied the opportunity for articulatory rehearsal and attentional refreshing in a complex span task, but the type of recall was manipulated between experiments (item and order, order only, and item only recall). The results showed that impairing attentional refreshing and articulatory rehearsal similarly affected recall regardless of whether the scoring procedure (Experiments 1 and 4) or recall requirements (Experiments 2, 3, and 5) reflected item- or order-specific recall. This implies that both mechanisms sustain the maintenance of item and order information, and suggests that the common cumulative functioning of these two mechanisms to maintain items could be at the root of order maintenance.

  9. Group differences in the heritability of items and test scores

    NARCIS (Netherlands)

    Wicherts, J.M.; Johnson, W.

    2009-01-01

    It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups

  10. The development of a single-item Food Choice Questionnaire

    NARCIS (Netherlands)

    Onwezen, M.C.; Reinders, M.J.; Verain, M.C.D.; Snoek, H.M.

    2019-01-01

    Based on the multi-item Food Choice Questionnaire (FCQ) originally developed by Steptoe and colleagues (1995), the current study developed a single-item FCQ that provides an acceptable balance between practical needs and psychometric concerns. Studies 1 (N = 1851) and 2 (2a (N = 3290), 2b (N =

  11. 41 CFR 109-1.5109 - Control of sensitive items.

    Science.gov (United States)

    2010-07-01

    ... administrative control of sensitive items assigned for general use within an organizational unit as appropriate... 41 Public Contracts and Property Management 3 2010-07-01 2010-07-01 false Control of sensitive...-INTRODUCTION 1.51-Personal Property Management Standards and Practices § 109-1.5109 Control of sensitive items...

  12. 17 CFR 229.1010 - (Item 1010) Financial statements.

    Science.gov (United States)

    2010-04-01

    ....1010 (Item 1010) Financial statements. (a) Financial information. Furnish the following financial information: (1) Audited financial statements for the two fiscal years required to be filed with the company's... 17 Commodity and Securities Exchanges 2 2010-04-01 2010-04-01 false (Item 1010) Financial...

  13. Item Construction and Psychometric Models Appropriate for Constructed Responses

    Science.gov (United States)

    1991-08-01

    which involve only one attribute per item. This is especially true when we are dealing with constructed-response items, we have to measure much more...

  14. 17 CFR 229.406 - (Item 406) Code of ethics.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 2 2010-04-01 2010-04-01 false (Item 406) Code of ethics. 229... 406) Code of ethics. (a) Disclose whether the registrant has adopted a code of ethics that applies to... code of ethics, explain why it has not done so. (b) For purposes of this Item 406, the term code of...

  15. Mathematical-programming approaches to test item pool design

    NARCIS (Netherlands)

    Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

    2002-01-01

    This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming

  16. Elu kui näitemäng / Helju Koger

    Index Scriptorium Estoniae

    Koger, Helju, 1943-

    2007-01-01

    On the 6th parish days in Juuru. The ensemble Resonabilis performed at St. Michael's Church in Juuru. At the conference, Järlepa manor was discussed, and Anu Allikvee gave the presentation "August von Kotzebue elu nagu näitemäng" ("August von Kotzebue's life as a play"), among others. The play "Pärmi Jaagu unenägu" was performed by local amateurs

  17. Effects of Aging and IQ on Item and Associative Memory

    Science.gov (United States)

    Ratcliff, Roger; Thapar, Anjali; McKoon, Gail

    2011-01-01

    The effects of aging and IQ on performance were examined in 4 memory tasks: item recognition, associative recognition, cued recall, and free recall. For item and associative recognition, accuracy and the response time (RT) distributions for correct and error responses were explained by Ratcliff's (1978) diffusion model at the level of individual…

  18. The Influence of Item Properties on Association-Memory

    Science.gov (United States)

    Madan, Christopher R.; Glaholt, Mackenzie G.; Caplan, Jeremy B.

    2010-01-01

    Word properties like imageability and word frequency improve cued recall of verbal paired-associates. We asked whether these enhancements follow simply from prior effects on item-memory, or also strengthen associations between items. Participants studied word pairs varying in imageability or frequency: pairs were "pure" (high-high, low-low) or…

  19. 31 CFR 50.14 - Separate line item.

    Science.gov (United States)

    2010-07-01

    ....14 Money and Finance: Treasury Office of the Secretary of the Treasury TERRORISM RISK INSURANCE PROGRAM Disclosures as Conditions for Federal Payment § 50.14 Separate line item. An insurer is deemed to be in compliance with the requirement of providing disclosure on a “separate line item in the policy...

  20. Procedures for Selecting Items for Computerized Adaptive Tests.

    Science.gov (United States)

    Kingsbury, G. Gage; Zara, Anthony R.

    1989-01-01

    Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)

  1. Negative Affect Impairs Associative Memory but Not Item Memory

    Science.gov (United States)

    Bisby, James A.; Burgess, Neil

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine…

  2. Item response theory at subject- and group-level

    NARCIS (Netherlands)

    Tobi, Hilde

    1990-01-01

    This paper reviews the literature about item response models for the subject level and aggregated level (group level). Group-level item response models (IRMs) are used in the United States in large-scale assessment programs such as the National Assessment of Educational Progress and the California

  3. Repair systems with exchangeable items and the longest queue mechanism

    NARCIS (Netherlands)

    Ravid, R.; Boxma, O.J.; Perry, D.

    2013-01-01

    We consider a repair facility consisting of one repairman and two arrival streams of failed items, from bases 1 and 2. The arrival processes are independent Poisson processes, and the repair times are independent and identically exponentially distributed. The item types are exchangeable, and a

  4. Repair systems with exchangeable items and the longest queue mechanism

    NARCIS (Netherlands)

    Ravid, R.; Boxma, O.J.; Perry, D.

    2011-01-01

    We consider a repair facility consisting of one repairman and two arrival streams of failed items, from bases 1 and 2. The arrival processes are independent Poisson processes, and the repair times are independent and identically exponentially distributed. The item types are exchangeable, and a

  5. The Role of Item Feedback in Self-Adapted Testing.

    Science.gov (United States)

    Roos, Linda L.; And Others

    1997-01-01

    The importance of item feedback in self-adapted testing was studied by comparing feedback and no feedback conditions for computerized adaptive tests and self-adapted tests taken by 363 college students. Results indicate that item feedback is not necessary to realize score differences between self-adapted and computerized adaptive testing. (SLD)

  6. Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

    Science.gov (United States)

    Yang, Ji Seung; Hansen, Mark; Cai, Li

    2012-01-01

    Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…

  7. Practical Guide to Conducting an Item Response Theory Analysis

    Science.gov (United States)

    Toland, Michael D.

    2014-01-01

    Item response theory (IRT) is a psychometric technique used in the development, evaluation, improvement, and scoring of multi-item scales. This pedagogical article provides the necessary information needed to understand how to conduct, interpret, and report results from two commonly used ordered polytomous IRT models (Samejima's graded…

  8. Item Response Theory Modeling of the Philadelphia Naming Test

    Science.gov (United States)

    Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D.

    2015-01-01

    Purpose: In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating…

  9. 48 CFR 53.212 - Acquisition of commercial items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 2 2010-10-01 2010-10-01 false Acquisition of commercial... (CONTINUED) CLAUSES AND FORMS FORMS Prescription of Forms 53.212 Acquisition of commercial items. SF 1449 (Rev. 3/2005), Solicitation/Contract/Order for Commercial Items. SF 1449 is prescribed for use in...

  10. 48 CFR 52.212-2 - Evaluation-Commercial Items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 2 2010-10-01 2010-10-01 false Evaluation-Commercial....212-2 Evaluation—Commercial Items. As prescribed in 12.301(c), the Contracting Officer may insert a provision substantially as follows: Evaluation—Commercial Items (JAN 1999) (a) The Government will award a...

  11. 48 CFR 46.202-1 - Contracts for commercial items.

    Science.gov (United States)

    2010-10-01

    ... 48 Federal Acquisition Regulations System 1 2010-10-01 2010-10-01 false Contracts for commercial... CONTRACT MANAGEMENT QUALITY ASSURANCE Contract Quality Requirements 46.202-1 Contracts for commercial items. When acquiring commercial items (see part 12), the Government shall rely on contractors' existing...

  12. Dissociation between source and item memory in Parkinson's disease

    Institute of Scientific and Technical Information of China (English)

    Hu Panpan; Li Youhai; Ma Huijuan; Xi Chunhua; Chen Xianwen; Wang Kai

    2014-01-01

    Background: Episodic memory includes information about item memory and source memory. Many studies support the hypothesis that these two memory systems are implemented by different brain structures. The aim of this study was to investigate the characteristics of item memory and source memory processing in patients with Parkinson's disease (PD), and to further verify the hypothesis of a dual-process model of source and item memory. Methods: We established a neuropsychological battery to measure the performance of item memory and source memory. A total of 35 PD individuals and 35 matched healthy controls (HC) were administered the battery. The item memory task consisted of the learning and recognition of high-frequency national Chinese characters; the source memory task consisted of the learning and recognition of three modes (character, picture, and image) of objects. Results: Compared with the controls, the idiopathic PD patients were impaired in source memory (PD vs. HC: 0.65±0.06 vs. 0.72±0.09, P=0.001), but not in item memory (PD vs. HC: 0.65±0.07 vs. 0.67±0.08, P=0.240). Conclusions: The present experiment provides evidence for dissociation between item and source memory in PD patients, thereby strengthening the claim that item and source memory rely on different brain structures. PD patients show poor source memory, in which dopamine plays a critical role.

  13. Semiparametric Item Response Functions in the Context of Guessing

    Science.gov (United States)

    Falk, Carl F.; Cai, Li

    2016-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…

  14. Loglinear multidimensional IRT models for polytomously scored items

    NARCIS (Netherlands)

    Kelderman, Henk; Rijkes, Carl P.M.; Rijkes, Carl

    1994-01-01

    A loglinear IRT model is proposed that relates polytomously scored item responses to a multidimensional latent space. The analyst may specify a response function for each response, indicating which latent abilities are necessary to arrive at that response. Each item may have a different number of

  15. Graphical modeling for item difficulty in medical faculty exams

    African Journals Online (AJOL)

    . Conclusion: The ... difficulty criteria. Key words: Item difficulty, quality control, statistical process control, variable control charts ..... assumed that 68% of the values fall in the interval ± 1.S; .... The balance of the construction of items of exam has ...

  16. A person fit test for IRT models for polytomous items

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Dagohoy, A.V.

    2007-01-01

    A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability

  17. Bayes factor covariance testing in item response models

    NARCIS (Netherlands)

    Fox, J.P.; Mulder, J.; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  18. Bayes Factor Covariance Testing in Item Response Models

    NARCIS (Netherlands)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  19. Optimizing incomplete sample designs for item response model parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.

    Several models for optimizing incomplete sample designs with respect to information on the item parameters are presented. The following cases are considered: (1) known ability parameters; (2) unknown ability parameters; (3) item sets with multiple ability scales; and (4) response models with

  20. Algorithms for computerized test construction using classical item parameters

    NARCIS (Netherlands)

    Adema, Jos J.; van der Linden, Willem J.

    1989-01-01

    Recently, linear programming models for test construction were developed. These models were based on the information function from item response theory. In this paper another approach is followed. Two 0-1 linear programming models for the construction of tests using classical item and test

  1. AVOID BECOMING A VICTIM OF COUNTERFEIT ITEMS

    Energy Technology Data Exchange (ETDEWEB)

    WARRINER RD

    2011-07-13

    In today's globalized economy, we cannot live without imported products. Most people do not realize how thin the safety net of regulation and inspection really is. Less than three percent of imported products receive any form of government inspection prior to sale. Avoid flea markets, street vendors and deep discount stores. The sellers of counterfeit wares know where to market their products. They look for individuals who are hungry for a brand name item but do not want to pay a brand name price for it. The internet provides anonymity to the sellers of counterfeit products. Unlike Europe, U.S. law does not hold internet-marketing organizations responsible for the quality of the products sold on their websites. These organizations will remove an individual vendor when a sufficient number of complaints are lodged, but they will not take responsibility for the counterfeit products you may have purchased. EBay has a number of counterfeit product guides to help you avoid being a victim of the sellers of these products. Ten percent of all medications taken worldwide are counterfeit. If you do buy medications on-line, be sure that the National Association of Boards of Pharmacy Verified Internet Pharmacy Practice Sites (VIPPS) recommends the pharmacy you choose to use. Inspect all medication purchases and report any change in color, shape, imprinting or odor to your pharmacist. If you take generic medications these attributes may change from one manufacturer to another. Your pharmacist should inform you of any changes when you refill your prescription. If they do not, get clarification prior to taking the medication. Please note that the Food and Drug Administration (FDA) does not regulate supplements. The FDA only steps in when a specific supplement proves to cause physical harm or contains a regulated ingredient. Due to counterfeiting, Underwriters Laboratories (UL) changed their label design three times since 1996. The new gold label should be attached to the cord

  2. Teaching children with autism spectrum disorders to mand for the removal of stimuli that prevent access to preferred items.

    Science.gov (United States)

    Shillingsburg, M Alice; Powell, Nicole M; Bowen, Crystal N

    2013-01-01

    Mand training is often a primary focus in early language instruction and typically includes mands that are positively reinforced. However, mands maintained by negative reinforcement are also important skills to teach. These include mands to escape aversive demands or unwanted items. Another type of negatively reinforced mand important to teach involves the removal of a stimulus that prevents access to a preferred activity. We taught 5 participants diagnosed with autism spectrum disorders to mand for the removal of a stimulus in order to access a preferred item that had been blocked. An evaluation was conducted to determine if participants responded differentially when the establishing operations for the preferred item were present versus absent. All participants learned to mand for the removal of the stimulus exclusively under conditions when the establishing operation was present.

  3. Optimization approach of background value and initial item for improving prediction precision of GM(1,1) model

    Institute of Scientific and Technical Information of China (English)

    Yuhong Wang; Qin Liu; Jianrong Tang; Wenbin Cao; Xiaozhong Li

    2014-01-01

    A combination method of optimization of the background value and optimization of the initial item is proposed. The sequences of the unbiased exponential distribution are simulated and predicted through the optimization of the background value in grey differential equations. The principle of the new information priority in the grey system theory and the rationality of the initial item in the original GM(1,1) model are fully expressed through the improvement of the initial item in the proposed time response function. A numerical example is employed to illustrate that the proposed method is able to simulate and predict sequences of raw data with the unbiased exponential distribution and has better simulation performance and prediction precision than the original GM(1,1) model.
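
    For orientation, the original GM(1,1) model that this record takes as its baseline can be sketched as follows. This is a minimal textbook version, assuming the usual mean-of-neighbours background value and the first observation as the initial item; it is not the optimized method proposed by the authors, and the function names gm11_fit and gm11_predict are illustrative only.

```python
import math

def gm11_fit(x0):
    """Estimate the development coefficient a and grey input b of GM(1,1)."""
    # 1-AGO: accumulated generating sequence x1
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: mean of consecutive accumulated values
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Least squares for the grey differential equation y = -a*z + b
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b

def gm11_predict(x0, a, b, steps):
    """Time response function with x0[0] as the initial item."""
    c = x0[0] - b / a
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(steps)]
    # Inverse accumulation restores the original-scale sequence
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]

# Illustrative data: a pure exponential sequence with growth factor 1.1
series = [2 * 1.1 ** k for k in range(6)]
a, b = gm11_fit(series)
fitted = gm11_predict(series, a, b, len(series))
```

    On a pure exponential sequence the fitted values are close to, but not exactly equal to, the data, which is the kind of residual bias that optimizing the background value and the initial item is meant to reduce.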

  4. Remembered but Unused: The Accessory Items in Working Memory that Do Not Guide Attention

    Science.gov (United States)

    Peters, Judith C.; Goebel, Rainer; Roelfsema, Pieter R.

    2009-01-01

    If we search for an item, a representation of this item in our working memory guides attention to matching items in the visual scene. We can hold multiple items in working memory. Do all these items guide attention in parallel? We asked participants to detect a target object in a stream of objects while they maintained a second item in memory for…

  5. More is not Always Better: The Relation between Item Response and Item Response Time in Raven’s Matrices

    Directory of Open Access Journals (Sweden)

    Frank Goldhammer

    2015-03-01

    The role of response time in completing an item can have very different interpretations. Responding more slowly could be positively related to success as the item is answered more carefully. However, the association may be negative if working faster indicates higher ability. The objective of this study was to clarify the validity of each assumption for reasoning items considering the mode of processing. A total of 230 persons completed a computerized version of Raven’s Advanced Progressive Matrices test. Results revealed that response time overall had a negative effect. However, this effect was moderated by items and persons. For easy items and able persons the effect was strongly negative, for difficult items and less able persons it was less negative or even positive. The number of rules involved in a matrix problem proved to explain item difficulty significantly. Most importantly, a positive interaction effect between the number of rules and item response time indicated that the response time effect became less negative with an increasing number of rules. Moreover, exploratory analyses suggested that the error type influenced the response time effect.

  6. The medial temporal lobes distinguish between within-item and item-context relations during autobiographical memory retrieval.

    Science.gov (United States)

    Sheldon, Signy; Levine, Brian

    2015-12-01

    During autobiographical memory retrieval, the medial temporal lobes (MTL) relate together multiple event elements, including object (within-item relations) and context (item-context relations) information, to create a cohesive memory. There is consistent support for a functional specialization within the MTL according to these relational processes, much of which comes from recognition memory experiments. In this study, we compared brain activation patterns associated with retrieving within-item relations (i.e., associating conceptual and sensory-perceptual object features) and item-context relations (i.e., spatial relations among objects) with respect to naturalistic autobiographical retrieval. We developed a novel paradigm that cued participants to retrieve information about past autobiographical events, non-episodic within-item relations, and non-episodic item-context relations with the perceptuomotor aspects of retrieval equated across these conditions. We used multivariate analysis techniques to extract common and distinct patterns of activity among these conditions within the MTL and across the whole brain, both in terms of spatial and temporal patterns of activity. The anterior MTL (perirhinal cortex and anterior hippocampus) was preferentially recruited for generating within-item relations later in retrieval whereas the posterior MTL (posterior parahippocampal cortex and posterior hippocampus) was preferentially recruited for generating item-context relations across the retrieval phase. These findings provide novel evidence for functional specialization within the MTL with respect to naturalistic memory retrieval. © 2015 Wiley Periodicals, Inc.

  7. 41 CFR 101-26.605 - Items other than petroleum products and electronic items available from the Defense Logistics...

    Science.gov (United States)

    2010-07-01

    ... petroleum products and electronic items available from the Defense Logistics Agency. 101-26.605 Section 101... available from the Defense Logistics Agency. Agencies required to use GSA supply sources should also use... Logistics Agency, the catalog will contain only those items in Federal supply classification classes which...

  8. Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

    Science.gov (United States)

    Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

    2012-01-01

    Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

  9. The Piper Fatigue Scale-12 (PFS-12): psychometric findings and item reduction in a cohort of breast cancer survivors.

    Science.gov (United States)

    Reeve, Bryce B; Stover, Angela M; Alfano, Catherine M; Smith, Ashley Wilder; Ballard-Barbash, Rachel; Bernstein, Leslie; McTiernan, Anne; Baumgartner, Kathy B; Piper, Barbara F

    2012-11-01

    Brief, valid measures of fatigue, a prevalent and distressing cancer symptom, are needed for use in research. This study's primary aim was to create a shortened version of the revised Piper Fatigue Scale (PFS-R) based on data from a diverse cohort of breast cancer survivors. A secondary aim was to determine whether the PFS captured multiple distinct aspects of fatigue (a multidimensional model) or a single overall fatigue factor (a unidimensional model). Breast cancer survivors (n = 799; stages in situ through IIIa; ages 29-86 years) were recruited through three SEER registries (New Mexico, Western Washington, and Los Angeles, CA) as part of the Health, Eating, Activity, and Lifestyle (HEAL) study. Fatigue was measured approximately 3 years post-diagnosis using the 22-item PFS-R that has four subscales (Behavior, Affect, Sensory, and Cognition). Confirmatory factor analysis was used to compare unidimensional and multidimensional models. Six criteria were used to make item selections to shorten the PFS-R: scale's content validity, items' relationship with fatigue, content redundancy, differential item functioning by race and/or education, scale reliability, and literacy demand. Factor analyses supported the original 4-factor structure. There was also evidence from the bi-factor model for a dominant underlying fatigue factor. Six items tested positive for differential item functioning between African-American and Caucasian survivors. Four additional items either showed poor association, local dependence, or content validity concerns. After removing these 10 items, the reliability of the PFS-12 subscales ranged from 0.87 to 0.89, compared to 0.90-0.94 prior to item removal. The newly developed PFS-12 can be used to assess fatigue in African-American and Caucasian breast cancer survivors and reduces response burden without compromising reliability or validity. This is the first study to determine PFS literacy demand and to compare PFS-R responses in African

  10. Model EPQ Multi Item yang Dimodifikasi untuk Dua Permintaan secara Simultan

    Directory of Open Access Journals (Sweden)

    Taufiq Rahman

    2017-05-01

    Inventory is one of many factors of business operation that industries need to control in order to improve efficiency, enhance productivity, and decrease the holding cost. The holding cost of inventories in the supply chain contributes 20% - 40% of the product value. It can be controlled by applying an appropriate inventory model, such as EPQ (Economic Production Quantity) or EOQ (Economic Order Quantity). EPQ is an inventory model used to determine the optimum production lot size by balancing the production setup cost and the holding cost. Although the classic EPQ has been widely applied in industry, researchers differ on whether the model should assume continuous or discrete demand, because multi-delivery (discrete) demand is the case most industries face. Even so, there are industries that face both continuous and discrete demand simultaneously. Previous research produced an advanced EPQ model that accommodates both assumptions simultaneously, but it still addressed only the single-item problem. Since almost all industries produce multiple items, that model lacks applicability. Therefore, this research proposes a multi-item EPQ model that accommodates continuous and discrete demand simultaneously. The solution procedure used in the proposed model combines the classical (differential) calculus method and a simultaneous approach. A numerical example is given to show the effectiveness of the proposed approach based on data from the literature.
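
    The balance between setup cost and holding cost mentioned in this record can be illustrated with the classic single-item EPQ formula, Q* = sqrt(2DK / (h(1 - D/P))). The sketch below assumes continuous demand and a single item, so it is the textbook baseline rather than the modified multi-item model of the paper; the function names and the numbers in the example are hypothetical.

```python
import math

def epq_lot_size(demand_rate, production_rate, setup_cost, holding_cost):
    """Classic single-item EPQ lot size under continuous demand."""
    if production_rate <= demand_rate:
        raise ValueError("production rate must exceed demand rate")
    utilisation_gap = 1 - demand_rate / production_rate
    return math.sqrt(2 * demand_rate * setup_cost / (holding_cost * utilisation_gap))

def epq_cost_terms(lot_size, demand_rate, production_rate, setup_cost, holding_cost):
    """Return (setup cost per period, holding cost per period) for a lot size."""
    setup = setup_cost * demand_rate / lot_size
    holding = holding_cost * lot_size / 2 * (1 - demand_rate / production_rate)
    return setup, holding

# Hypothetical inputs: D = 1000 units/year, P = 2000 units/year,
# K = 50 per setup, h = 2 per unit per year
q_star = epq_lot_size(1000, 2000, 50, 2)
setup, holding = epq_cost_terms(q_star, 1000, 2000, 50, 2)
```

    At the optimal lot size the two cost terms are equal, which is the sense in which the model "balances" setup and holding cost.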

  11. Instemmingsgeneigdheid en verskillende item- en responsformate in 'n gesommeerde selfbeoordelingskaal

    Directory of Open Access Journals (Sweden)

    Nadene Hanekom

    1998-06-01

    This study examines the degree of acquiescence present when the item and response formats of a summated rating scale are varied. It is often recommended that acquiescence response bias in rating scales may be controlled by using both positively and negatively worded items. Such items are generally worded in the Likert-type format of statements. The purpose of the study was to establish whether items in question format would result in a smaller degree of acquiescence than items worded as statements. The response format was also varied (five- and seven-point options) to determine whether this would influence the reliability and degree of acquiescence in the scales. A twenty-item Locus of Control (LC) questionnaire was used, but each item was complemented by its opposite, resulting in 40 items. The subjects, divided randomly into two groups, were second-year students who had to complete four versions of the questionnaire, plus a shortened version of Bass's scale for measuring acquiescence. The LC versions were questions or statements, each combined with a five- or seven-point response format. Partial counterbalancing was introduced by testing on two separate occasions, presenting the tests to the two groups in the opposite order. The degree of acquiescence was assessed by correlating the items with their opposites, and by correlating scores on each version with scores on the acquiescence questionnaire. No major differences were found between the various item and response formats in relation to acquiescence. Summary (translated from the Afrikaans): This study was carried out to determine whether the degree of acquiescence is influenced by the item and response format of a summated self-rating scale. It is often recommended that the use of both positively and negatively worded items in a questionnaire limits acquiescence. Such items are usually formulated as statements in the traditional Likert format. The purpose of the study was to determine whether items

  12. Macrostructural Treatment of Multi-word Lexical Items

    Directory of Open Access Journals (Sweden)

    Alenka Vrbinc

    2011-05-01

    The paper discusses the macrostructural treatment of multi-word lexical items in mono- and bilingual dictionaries. First, the classification of multi-word lexical items is presented, and special attention is paid to the discussion of compounds – a specific group of multi-word lexical items that is most commonly afforded headword status but whose inclusion in the headword list may also depend on spelling. Then the inclusion of multi-word lexical items in monolingual dictionaries is dealt with in greater detail, while the results of a short survey on the inclusion of five randomly chosen multi-word lexical items in seven English monolingual dictionaries are presented. The proposals as to how to treat these five multi-word lexical items in bilingual dictionaries are presented in the section about the inclusion of multi-word lexical items in bilingual dictionaries. The conclusion is that it is most important to take the users’ needs into consideration and to make any dictionary as user friendly as possible.

  13. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    Science.gov (United States)

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  14. Mediate gamma radiation effects on some packaged food items

    International Nuclear Information System (INIS)

    Inamura, Patricia Y.; Uehara, Vanessa B.; Teixeira, Christian A.H.M.; Mastro, Nelida L. del

    2012-01-01

    For most prepackaged foods a 10 kGy radiation dose is considered the maximum dose needed; however, the commercially available and practically accepted packaging materials must be suitable for such application. This work describes the application of ionizing radiation on several packaged food items, using 5 dehydrated food items, 5 ready-to-eat meals and 5 ready-to-eat food items irradiated in a 60Co gamma source with a 3 kGy dose. The quality evaluation of the irradiated samples was performed 2 and 8 months after irradiation. Microbiological analysis (bacteria, fungus and yeast load) was performed. The sensory characteristics of appearance, aroma, texture and flavor were also established. From these data, the acceptability of all irradiated items was obtained. All ready-to-eat food items assayed like manioc flour, some pâtés and blocks of raw brown sugar and most of ready-to-eat meals like sausages and chicken with legumes were considered acceptable for microbial and sensory characteristics. On the other hand, the dehydrated food items chosen for this study, such as dehydrated bacon potatoes or pea soups were not accepted by the sensory analysis. A careful dose choice and special irradiation conditions must be used in order to achieve sensory acceptability needed for the commercialization of specific irradiated food items. - Highlights: ► We applied gamma radiation on several kinds of packaged food items. ► Microbiological and sensory analyses were performed 2 and 8 months after irradiation. ► All ready-to-eat food items assayed were approved for microbial and sensory characteristics. ► Most ready-to-eat meals like sausages and chicken with legumes were also acceptable. ► Dehydrated bacon potatoes or pea soups were considered not acceptable.

  15. Identifying predictors of physics item difficulty: A linear regression approach

    Science.gov (United States)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge

  16. Identifying predictors of physics item difficulty: A linear regression approach

    Directory of Open Access Journals (Sweden)

    Hasnija Muratovic

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal

  17. Item-Based Top-N Recommendation Algorithms

    Science.gov (United States)

    2003-01-20

    basket of items, utilized by many e-commerce sites, cannot take advantage of pre-computed user-to-user similarities. Finally, even though the...not discriminate between items that are present in frequent itemsets and items that are not, while still maintaining the computational advantages of...453219 0.02% 7.74 ccard 42629 68793 398619 0.01% 9.35 ecommerce 6667 17491 91222 0.08% 13.68 em 8002 1648 769311 5.83% 96.14 ml 943 1682 100000 6.31

  18. Use of commercial grade item dedication to reduce procurement costs

    International Nuclear Information System (INIS)

    Rosch, F.

    1995-01-01

    In the mid-1980s, the Nuclear Regulatory Commission (NRC) began inspecting utility practices of procuring and dedicating commercial grade items intended for plant safety-related applications. As a result of the industry efforts to address NRC concerns, nuclear utilities have enhanced existing programs and procedures for dedication of commercial grade items. Though these programs were originally enhanced to meet NRC concerns, utilities have discovered that the dedication of commercial grade items can also reduce overall procurement costs. This paper discusses the enhancement of utility dedication programs and demonstrates how utilities have used them to reduce procurement costs.

  19. QA in the procurement of items and services

    International Nuclear Information System (INIS)

    Wilhelm, H.

    1980-01-01

    Procurement of items and services is one of the important elements during the design and construction of Nuclear Power Plants. The purchaser has to establish and implement controls over the procurement process to ensure that the quality criteria, quality level and other quality requirements specified for the particular item or service are taken into account. The effect on safety of an error in service or the malfunction of an item is the most important factor to be considered in determining the extent of quality assurance efforts. A typical example of a procurement process will be demonstrated for safety related mechanical components. (orig./RW)

  20. Measurement invariance across educational levels and gender in 12-item Zarit Burden Interview (ZBI) on caregivers of people with dementia.

    Science.gov (United States)

    Lin, Chung-Ying; Ku, Li-Jung Elizabeth; Pakpour, Amir H

    2017-11-01

    The Zarit Burden Interview (ZBI) is a commonly used self-report to assess caregiver burden. A 12-item short form of the ZBI has been developed; however, its measurement invariance has not been examined across some different demographics. It is unclear whether different genders and educational levels of a population interpret the ZBI items similarly. Therefore, this study aimed to examine the measurement invariance of the 12-item ZBI across gender and educational levels in a Taiwanese sample. Caregivers who had a family member with dementia (n = 270) completed the ZBI through telephone interviews. Three confirmatory factor analysis (CFA) models were conducted: Model 1 was the configural model, Model 2 constrained all factor loadings, Model 3 constrained all factor loadings and item intercepts. Multiple group CFAs and the differential item functioning (DIF) contrast under Rasch analyses were used to detect measurement invariance across males (n = 100) and females (n = 170) and across educational levels of junior high schools and below (n = 86) and senior high schools and above (n = 183). The fit index differences between models supported the measurement invariance across gender and across educational levels (∆ comparative fit index (CFI) = -0.010 and 0.003; ∆ root mean square error of approximation (RMSEA) = -0.006 to 0.004). No substantial DIF contrast was found across gender and educational levels (value = -0.36 to 0.29). The ZBI is appropriate for combined use and for comparisons in caregivers across gender and different educational levels in Taiwan.