WorldWideScience

Sample records for irt item parameter

  1. IRT Item Parameter Recovery with Marginal Maximum Likelihood Estimation Using Loglinear Smoothing Models

    Science.gov (United States)

    Casabianca, Jodi M.; Lewis, Charles

    2015-01-01

    Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…

  2. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

    Science.gov (United States)

    Michaelides, Michalis P

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  3. A review of the effects on IRT item parameter estimates with a focus on misbehaving common items in test equating

    Directory of Open Access Journals (Sweden)

    Michalis P Michaelides

    2010-10-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of Item Response Theory. Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  4. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  5. TWO-PARAMETER IRT MODEL APPLICATION TO ASSESS PROBABILISTIC CHARACTERISTICS OF PROHIBITED ITEMS DETECTION BY AVIATION SECURITY SCREENERS

    Directory of Open Access Journals (Sweden)

    Alexander K. Volkov

    2017-01-01

    The modern approaches to the aviation security screeners’ efficiency have been analyzed, and certain drawbacks have been considered. The main drawback is the complexity of implementing the ICAO recommendations on accounting for shadow X-ray image complexity factors during the preparation and evaluation of prohibited items detection efficiency by aviation security screeners. X-ray image based factors are the specific properties of the X-ray image that influence the ability of aviation security screeners to detect prohibited items. The most important complexity factors are: geometric characteristics of a prohibited item; view difficulty of prohibited items; superposition of prohibited items by other objects in the bag; bag content complexity; and the color similarity of prohibited and usual items in the luggage. A one-dimensional two-parameter IRT model and a related criterion of aviation security screeners’ qualification have been suggested. Within the suggested model, the probabilistic detection characteristics of aviation security screeners are considered as functions of such parameters as the difference between the level of qualification and the level of X-ray image complexity, and also between the aviation security screeners’ responsibility and the structure of their professional knowledge. On the basis of the given model it is possible to consider two characteristic functions: first, the characteristic function of qualification level, which describes the multi-complexity level of X-ray image interpretation competency of the aviation security screener; second, the characteristic function of X-ray image complexity, which describes the range of X-ray image interpretation competency of aviation security screeners having various training levels to interpret an X-ray image of a certain level of complexity. The suggested complex criterion to assess the level of the aviation security screener qualification allows to evaluate his or

  6. Robust Scale Transformation Methods in IRT True Score Equating under Common-Item Nonequivalent Groups Design

    Science.gov (United States)

    He, Yong

    2013-01-01

    Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…

  7. Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

    Science.gov (United States)

    He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei

    2013-01-01

    Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

  8. Loglinear multidimensional IRT models for polytomously scored items

    NARCIS (Netherlands)

    Kelderman, Henk

    1988-01-01

    A loglinear item response theory (IRT) model is proposed that relates polytomously scored item responses to a multidimensional latent space. Each item may have a different response function where each item response may be explained by one or more latent traits. Item response functions may follow a

  9. Item selection via Bayesian IRT models.

    Science.gov (United States)

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
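
    The cumulative-logit structure of the graded response model mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard logistic parameterization with one discrimination and a set of ordered thresholds per item; the function name, argument names, and numeric values are hypothetical and are not taken from the paper, and the mixture prior on the item parameters is not modelled here.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait of the respondent
    a          -- item discrimination (hypothetical name)
    thresholds -- increasing difficulties b_1 < ... < b_{K-1} (hypothetical name)
    """
    # Cumulative probabilities P(Y >= k) for k = 1..K-1,
    # padded with P(Y >= 0) = 1 and P(Y >= K) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative ones.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Illustrative values only: a four-category item answered at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

    Two items with the same discrimination and thresholds produce identical category probabilities for every respondent, which is the sense in which items in one mixture component can be treated as equivalent.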

  10. Loglinear multidimensional IRT models for polytomously scored items

    NARCIS (Netherlands)

    Kelderman, Henk; Rijkes, Carl P.M.

    1994-01-01

    A loglinear IRT model is proposed that relates polytomously scored item responses to a multidimensional latent space. The analyst may specify a response function for each response, indicating which latent abilities are necessary to arrive at that response. Each item may have a different number of

  11. A Comparison of Item Fit Statistics for Mixed IRT Models

    Science.gov (United States)

    Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B.

    2010-01-01

    In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…

  12. IRT-Estimated Reliability for Tests Containing Mixed Item Formats

    Science.gov (United States)

    Shu, Lianghua; Schwarz, Richard D.

    2014-01-01

    As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…

  13. Effect of Item Response Theory (IRT) Model Selection on Testlet-Based Test Equating. Research Report. ETS RR-14-19

    Science.gov (United States)

    Cao, Yi; Lu, Ru; Tao, Wei

    2014-01-01

    The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…

  14. Optimal item discrimination and maximum information for logistic IRT models

    NARCIS (Netherlands)

    Veerkamp, Wim J.J.; Berger, Martijn P.F.

    1999-01-01

    Items with the highest discrimination parameter values in a logistic item response theory model do not necessarily give maximum information. This paper derives discrimination parameter values, as functions of the guessing parameter and distances between person parameters and item difficulty, that
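
    The first claim in this abstract can be checked numerically. The sketch below assumes the usual three-parameter logistic item information function with the scaling constant set to 1; the function names, the grid, and the numeric values are illustrative assumptions and do not reproduce the paper's derivation.

```python
import math


def prob_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model (scaling D = 1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c):
    """Fisher information of one 3PL item at ability theta."""
    p = prob_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


# Illustrative setting: the person is one unit above the item difficulty
# and the guessing parameter is 0.2.
distance, guessing = 1.0, 0.2
grid = [0.5 + 0.1 * i for i in range(56)]  # candidate discriminations 0.5 .. 6.0
best_a = max(grid, key=lambda a: info_3pl(distance, a, 0.0, guessing))

for a in (1.0, 2.0, 4.0, 6.0):
    print(f"a = {a:.1f}  information = {info_3pl(distance, a, 0.0, guessing):.4f}")
print(f"grid value with maximum information: a = {best_a:.1f}")
```

    On this grid the maximum falls at an interior value rather than at the largest discrimination, because a very steep item carries little information away from its own difficulty.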

  15. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  16. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  17. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Science.gov (United States)

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…

  18. Using the Item Response Theory (IRT) for Educational Evaluation through Games

    Science.gov (United States)

    Euzébio Batista, Marcelo Henrique; Victória Barbosa, Jorge Luis; da Rosa Tavares, João Elison; Hackenhaar, Jonathan Luis

    2013-01-01

    This article shows the application of Item Response Theory (IRT) for educational evaluation using games. The article proposes a computational model to create user profiles, called Psychometric Profile Generator (PPG). PPG uses the IRT mathematical model for exploring the levels of skills and behaviors in the form of items and/or stimuli. The model…

  19. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  20. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    Science.gov (United States)

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  1. A person fit test for IRT models for polytomous items

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Dagohoy, A.V.

    2007-01-01

    A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability

  2. Analysis Test of Understanding of Vectors with the Three-Parameter Logistic Model of Item Response Theory and Item Response Curves Technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-01-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming…

  3. Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients - a simulation study

    Directory of Open Access Journals (Sweden)

    Boyer François

    2010-03-01

    Background: Patient-reported outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main types of analytical strategies can be found for these data: classical test theory (CTT) based on the observed scores and models coming from Item Response Theory (IRT). However, whether IRT or CTT would be the most appropriate method to analyse PRO data remains unknown. The statistical properties of CTT and IRT, regarding power and corresponding effect sizes, were compared. Methods: Two-group cross-sectional studies were simulated for the comparison of PRO data using IRT or CTT-based analysis. For IRT, different scenarios were investigated according to whether items or person parameters were assumed to be known, to a certain extent for item parameters, from good to poor precision, or unknown and therefore had to be estimated. The powers obtained with IRT or CTT were compared and parameters having the strongest impact on them were identified. Results: When person parameters were assumed to be unknown and item parameters to be either known or not, the power achieved using IRT or CTT was similar and always lower than the expected power using the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods. Conclusion: Without any missing data, IRT and CTT seem to provide comparable power. The classical sample size formula for CTT seems to be adequate under some conditions but is not appropriate for IRT. In IRT, it seems important to take account of the number of items to obtain an accurate formula.

  4. Fitting Diffusion Item Response Theory Models for Responses and Response Times Using the R Package diffIRT

    Directory of Open Access Journals (Sweden)

    Dylan Molenaar

    2015-08-01

    In the psychometric literature, item response theory models have been proposed that explicitly take the decision process underlying the responses of subjects to psychometric test items into account. Application of these models is however hampered by the absence of general and flexible software to fit these models. In this paper, we present diffIRT, an R package that can be used to fit item response theory models that are based on a diffusion process. We discuss parameter estimation and model fit assessment, show the viability of the package in a simulation study, and illustrate the use of the package with two datasets pertaining to extraversion and mental rotation. In addition, we illustrate how the package can be used to fit the traditional diffusion model (as it has been originally developed in experimental psychology) to data.

  5. An Investigation of Invariance Properties of One, Two and Three Parameter Logistic Item Response Theory Models

    Directory of Open Access Journals (Sweden)

    O.A. Awopeju

    2017-12-01

    The study investigated the invariance properties of one, two and three parameter logistic item response theory models. It examined the best fit among one-parameter logistic (1PL), two-parameter logistic (2PL) and three-parameter logistic (3PL) IRT models for SSCE, 2008 in Mathematics. It also investigated the degree of invariance of the IRT models based item difficulty parameter estimates in SSCE in Mathematics across different samples of examinees and examined the degree of invariance of the IRT models based item discrimination estimates in SSCE in Mathematics across different samples of examinees. In order to achieve the set objectives, 6000 students (3000 males and 3000 females) were drawn from the population of 35262 who wrote the 2008 paper 1 Senior Secondary Certificate Examination (SSCE) in Mathematics organized by National Examination Council (NECO). The item difficulty and item discrimination parameter estimates from CTT and IRT were tested for invariance using BILOG-MG 3 and correlation analysis was achieved using SPSS version 20. The research findings were that two parameter model IRT item difficulty and discrimination parameter estimates exhibited invariance property consistently across different samples and that 2-parameter model was suitable for all samples of examinees unlike one-parameter model and 3-parameter model.

  6. Effectiveness of Item Response Theory (IRT) Proficiency Estimation Methods under Adaptive Multistage Testing. Research Report. ETS RR-15-11

    Science.gov (United States)

    Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry

    2015-01-01

    The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…

  7. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Directory of Open Access Journals (Sweden)

    Suttida Rakkapao

    2016-10-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test’s distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.

  8. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.

  9. equateIRT: An R Package for IRT Test Equating

    Directory of Open Access Journals (Sweden)

    Michela Battauz

    2015-12-01

    The R package equateIRT implements item response theory (IRT) methods for equating different forms composed of dichotomous items. In particular, the IRT models included are the three-parameter logistic model, the two-parameter logistic model, the one-parameter logistic model and the Rasch model. Forms can be equated when they present common items (direct equating) or when they can be linked through a chain of forms that present common items in pairs (indirect or chain equating). When two forms can be equated through different paths, a single conversion can be obtained by averaging the equating coefficients. The package calculates direct and chain equating coefficients. The averaging of direct and chain coefficients that link the same two forms is performed through the bisector method. Furthermore, the package provides analytic standard errors of direct, chain and average equating coefficients.
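
    The idea of direct and chain equating coefficients can be sketched in a few lines. The example below is written in Python rather than R and does not use or imitate the equateIRT API; it assumes the mean-sigma method as one common way to obtain a linear conversion from common-item difficulties, and all function names and numbers are hypothetical. The bisector averaging and the standard errors described in the abstract are not covered.

```python
from statistics import mean, stdev


def mean_sigma(b_from, b_to):
    """Direct equating coefficients (A, B) from common-item difficulties.

    Maps the scale of the first form onto the second: theta_to = A * theta_from + B.
    """
    A = stdev(b_to) / stdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def chain(first, second):
    """Compose two linear conversions: apply `first`, then `second`."""
    A1, B1 = first
    A2, B2 = second
    return A1 * A2, A2 * B1 + B2


def transform_item(a, b, coef):
    """Re-express a discrimination/difficulty pair on the target scale."""
    A, B = coef
    return a / A, A * b + B


# Hypothetical difficulties of three common items on forms X and Y.
b_x = [-1.0, 0.0, 1.0]
b_y = [-1.5, 0.5, 2.5]
x_to_y = mean_sigma(b_x, b_y)      # direct coefficients X -> Y
y_to_z = (0.5, 1.0)                # assumed already known for Y -> Z
x_to_z = chain(x_to_y, y_to_z)     # chain coefficients X -> Y -> Z
print(x_to_y, x_to_z, transform_item(1.2, 0.3, x_to_z))
```

    Composing the conversions pair by pair is what allows two forms with no items in common to be placed on one scale, provided a path of linked forms connects them.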

  10. An empirical comparison of Item Response Theory and Classical Test Theory

    Directory of Open Access Journals (Sweden)

    Špela Progar

    2008-11-01

    Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as much invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regards to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.

  11. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses

    Directory of Open Access Journals (Sweden)

    Knol Dirk L

    2011-09-01

    Background: For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF). Methods: Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation. Results: All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94. Conclusions: The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.

  12. Evaluation of the Hospital Anxiety and Depression Scale (HADS) in screening stroke patients for symptoms: Item Response Theory (IRT) analysis.

    Science.gov (United States)

    Ayis, Salma A; Ayerbe, Luis; Ashworth, Mark; DA Wolfe, Charles

    2018-03-01

    Variations have been reported in the number of underlying constructs and choice of thresholds that determine caseness of anxiety and/or depression using the Hospital Anxiety and Depression scale (HADS). This study examined the properties of each item of HADS as perceived by stroke patients, and assessed the information these items convey about anxiety and depression between 3 months and 5 years after stroke. The study included 1443 stroke patients from the South London Stroke Register (SLSR). The dimensionality of HADS was examined using factor analysis methods, and items' properties up to 5 years after stroke were tested using Item Response Theory (IRT) methods, including graded response models (GRMs). The presence of two dimensions of HADS (anxiety and depression) for stroke patients was confirmed. Items that accurately reflected the severity of anxiety and depression, and offered good discrimination of caseness, were identified as "I can laugh and see the funny side of things" (Q4) and "I get sudden feelings of panic" (Q13), with discrimination 2.44 (se = 0.26) and 3.34 (se = 0.35), respectively. Items that shared properties, and hence replicated inference, were: "I get a sort of frightened feeling as if something awful is about to happen" (Q3), "I get a sort of frightened feeling like butterflies in my stomach" (Q6), and "Worrying thoughts go through my mind" (Q9). Item properties were maintained over time. Approximately 20% of patients were lost to follow-up. A more concise selection of items, based on their properties, would provide a precise approach for screening patients and for an optimal allocation of patients into clinical trials. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Psychometric Consequences of Subpopulation Item Parameter Drift

    Science.gov (United States)

    Huggins-Manley, Anne Corinne

    2017-01-01

    This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

  14. Application of multidimensional IRT models to longitudinal data

    NARCIS (Netherlands)

    te Marvelde, J.M.; Glas, Cornelis A.W.; Van Landeghem, Georges; Van Damme, Jan

    2006-01-01

    The application of multidimensional item response theory (IRT) models to longitudinal educational surveys where students are repeatedly measured is discussed and exemplified. A marginal maximum likelihood (MML) method to estimate the parameters of a multidimensional generalized partial credit model

  15. Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

    Science.gov (United States)

    Suh, Youngsuk

    2016-01-01

    This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

  16. Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

    Czech Academy of Sciences Publication Activity Database

    Drabinová, Adéla; Martinková, Patrícia

    2017-01-01

    Vol. 54, No. 4 (2017), pp. 498-517. ISSN 0022-0655. R&D Projects: GA ČR GJ15-15856Y. Institutional support: RVO:67985807. Keywords: differential item functioning; non-linear regression; logistic regression; item response theory. Subject RIV: AM - Education. OECD field: Statistics and probability. Impact factor: 0.979, year: 2016

  17. An Investigation of Methods for Reducing Sampling Error in Certain IRT (Item Response Theory) Procedures.

    Science.gov (United States)

    1983-08-01

    [Table excerpt, OCR-damaged: standard errors for item b estimates under bell-shaped and rectangular distributions, with columns for n = 45 or 90 and N = 1500 or 6000; the first row begins -2.01, -1.75, 0.516, 0.466... Residue from the report's distribution list follows and is not recoverable.]

  18. Estimating Non-Normal Latent Trait Distributions within Item Response Theory Using True and Estimated Item Parameters

    Science.gov (United States)

    Sass, D. A.; Schmitt, T. A.; Walker, C. M.

    2008-01-01

    Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal…

  19. Development of a science literacy test on momentum and impulse using Item Response Theory (IRT) analysis

    Directory of Open Access Journals (Sweden)

    Della Apriyani Kusuma Putri

    2018-04-01

    Science literacy is an ability that allows a person to make decisions using their knowledge of scientific concepts and processes. The wide variety of problems arising in the era of globalization requires students not only to be proficient cognitively but also to be able to make decisions to solve problems, so science literacy is an important ability that students must have. An instrument is therefore needed to measure science literacy, and this need motivated the researchers to develop one. The aim of this study was to develop and characterize a physics science literacy test for senior high school students on momentum and impulse, based on the aspects of science literacy proposed by Gormally. The method applied was research and development (R&D), a research method used to produce a particular product and to test its effectiveness. Before the trial, the test was validated by three validators, who concluded that it was adequate and could be trialed. Analysis using Item Response Theory showed that the 3PL model is the model that fits the characteristics of the test. The test characteristics, comprising discrimination, difficulty, and the guessing factor, fell in the good category.

  20. Development of a scale to measure the entrepreneurial potential using the Item Response Theory (IRT)

    Directory of Open Access Journals (Sweden)

    Luciano Ricardo Rath Alves

    2011-01-01

    referenced in theories of entrepreneur's personality. The samples include 664 undergraduate and graduate students of Brazilian universities, and 100 entrepreneurs of the state of Alagoas. A two-parameter logistic IRT model was used. The parameter estimates were obtained from a sample of 764 people who responded to an instrument containing 103 items. The information and the standard error curves and the qualitative interpretation of the scale levels allowed us to determine the most appropriate range for the instrument use. The results showed that the scale is most adequate to evaluate individuals with low to moderately high entrepreneurial potential. Therefore, it is suggested that new items be incorporated into the instrument to measure and interpret even higher levels. The Item Response Theory allows the calibration of new items to measure entrepreneurs with high entrepreneurial potential using previously obtained data.

  1. Comparing Different Approaches of Bias Correction for Ability Estimation in IRT Models. Research Report. ETS RR-08-13

    Science.gov (United States)

    Lee, Yi-Hsuan; Zhang, Jinming

    2008-01-01

    The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…

  2. Conditioning factors of test-taking engagement in PIAAC: an exploratory IRT modelling approach considering person and item characteristics

    Directory of Open Access Journals (Sweden)

    Frank Goldhammer

    2017-11-01

    Background: A potential problem of low-stakes large-scale assessments such as the Programme for the International Assessment of Adult Competencies (PIAAC) is low test-taking engagement. The present study pursued two goals in order to better understand conditioning factors of test-taking disengagement: First, a model-based approach was used to investigate whether item indicators of disengagement constitute a continuous latent person variable by domain. Second, the effects of person and item characteristics were jointly tested using explanatory item response models. Methods: Analyses were based on the Canadian sample of Round 1 of the PIAAC, with N = 26,683 participants completing test items in the domains of literacy, numeracy, and problem solving. Binary item disengagement indicators were created by means of item response time thresholds. Results: The results showed that disengagement indicators define a latent dimension by domain. Disengagement increased with lower educational attainment, lower cognitive skills, and when the test language was not the participant’s native language. Gender did not exert any effect on disengagement, while age had a positive effect for problem solving only. An item’s location in the second of two assessment modules was positively related to disengagement, as was item difficulty. The latter effect was negatively moderated by cognitive skill, suggesting that poor test-takers are especially likely to disengage with more difficult items. Conclusions: The negative effect of cognitive skill, the positive effect of item difficulty, and their negative interaction effect support the assumption that disengagement is the outcome of individual expectations about success (informed disengagement).
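
    The construction of binary disengagement indicators from item response time thresholds can be sketched as below. The abstract does not state how the thresholds were chosen or the direction of the rule, so the convention used here (a response faster than the item's threshold is flagged as disengaged), the function name, and all numbers are assumptions for illustration only.

```python
import numpy as np

def disengagement_indicators(response_times, thresholds):
    """Flag responses faster than the item-specific time threshold.

    response_times: persons x items array of response times (seconds).
    thresholds: one threshold per item (hypothetical values).
    Returns an integer array with 1 = flagged as disengaged, 0 = not flagged.
    """
    rt = np.asarray(response_times, dtype=float)
    th = np.asarray(thresholds, dtype=float)
    # Broadcasting compares every person's time on item j with threshold j
    return (rt < th).astype(int)

# Two hypothetical test-takers, two items, a 5-second threshold per item
times = [[2.0, 30.0],
         [15.0, 4.0]]
flags = disengagement_indicators(times, [5.0, 5.0])
# Share of flagged items per person, a simple descriptive summary
person_rate = flags.mean(axis=1)
```

    The resulting 0/1 matrix is the kind of input an item response model for disengagement would take; the modelling step itself is not shown.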

  3. Statistical Indexes for Monitoring Item Behavior under Computer Adaptive Testing Environment.

    Science.gov (United States)

    Zhu, Renbang; Yu, Feng; Liu, Su

    A computerized adaptive test (CAT) administration usually requires a large supply of items with accurately estimated psychometric properties, such as item response theory (IRT) parameter estimates, to ensure the precision of examinee ability estimation. However, an estimated IRT model of a given item in any given pool does not always correctly…

  4. Explanatory IRT Analysis Using the SPIRIT Macro in SPSS

    Directory of Open Access Journals (Sweden)

    DiTrapani, Jack

    2018-04-01

    Item Response Theory (IRT) is a modeling framework that can be applied to a large variety of research questions spanning several disciplines. To make IRT models more accessible for the general researcher, a free tool has been created that can easily conduct one-parameter logistic IRT (1PL) analyses using the convenient point-and-click interface in SPSS without any required downloads or add-ons. This tool, the SPIRIT macro, can fit 1PL models with person and item covariates, DIF analyses, multidimensional models, multigroup models, rating scale models, and several other variations. Example explanatory models are presented with an applied dataset containing responses to an ADHD rating scale. Illustrations of how to fit basic 1PL models as well as two more complicated analyses using SPIRIT are given.

  5. IRT and educational measurement

    Directory of Open Access Journals (Sweden)

    Bartosz Kondratek

    2013-12-01

    The name "item response theory" covers a family of statistical tools used to model responses to test items and students' abilities. IRT models do this by introducing a parameterization that specifies the properties of the items and the distribution of student ability. The article gives a general description of the unidimensional IRT model, introduces the most commonly used models for dichotomously scored items (2PLM, 3PLM, 1PLM) and polytomously scored items (GPCM), and outlines the problem of ability estimation. The article aims to introduce the reader to the technical details of IRT modeling and to present selected practical applications in educational measurement. The applications discussed include the use of IRT in the analysis of complex research designs, the equating/linking of test scores, adaptive testing, and the construction of item maps.

  6. Differences in symptom expression between unipolar and bipolar spectrum depression: Results from a nationally representative sample using item response theory (IRT).

    Science.gov (United States)

    Hoertel, Nicolas; Blanco, Carlos; Peyre, Hugo; Wall, Melanie M; McMahon, Kibby; Gorwood, Philip; Lemogne, Cédric; Limosin, Frédéric

    2016-11-01

    The inclusion of subsyndromal forms of bipolarity in the fifth edition of the DSM has major implications for the way in which we approach the diagnosis of individuals with depressive symptoms. The aim of the present study was to use methods based on item response theory (IRT) to examine whether, when equating for levels of depression severity, there are differences in the likelihood of reporting DSM-IV symptoms of major depressive episode (MDE) between subjects with and without a lifetime history of manic symptoms. We conducted these analyses using a large, nationally representative sample from the USA (n=34,653), the second wave of the National Epidemiologic Survey on Alcohol and Related Conditions. The items sadness, appetite disturbance and psychomotor symptoms were better indicators of depression severity in participants without a lifetime history of manic symptoms, in a clinically meaningful way. DSM-IV symptoms of MDE were substantially less informative in participants with a lifetime history of manic symptoms than in those without such history. Clinical information on DSM-IV depressive and manic symptoms was based on retrospective self-report. The clinical presentation of depressive symptoms may substantially differ in individuals with and without a lifetime history of manic symptoms. These findings alert to the possibility of atypical symptomatic presentations among individuals with co-occurring symptoms or disorders and highlight the importance of continued research into specific pathophysiology differentiating unipolar and bipolar depression. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

    Science.gov (United States)

    Arce-Ferrer, Alvaro J.; Bulut, Okan

    2017-01-01

    This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…

  8. Examining sex differences in DSM-IV-TR narcissistic personality disorder symptom expression using Item Response Theory (IRT).

    Science.gov (United States)

    Hoertel, Nicolas; Peyre, Hugo; Lavaud, Pierre; Blanco, Carlos; Guerin-Langlois, Christophe; René, Margaux; Schuster, Jean-Pierre; Lemogne, Cédric; Delorme, Richard; Limosin, Frédéric

    2017-12-14

    The limited published literature on the subject suggests that there may be differences in how females and males experience narcissistic personality disorder (NPD) symptoms. The aim of this study was to use methods based on item response theory to examine whether, when equating for levels of NPD symptom severity, there are sex differences in the likelihood of reporting DSM-IV-TR NPD symptoms. We conducted these analyses using a large, nationally representative sample from the USA (n=34,653), the second wave of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). There were statistically and clinically significant sex differences for 2 out of the 9 DSM-IV-TR NPD symptoms. We found that males were more likely to endorse the item 'lack of empathy' at lower levels of narcissistic personality disorder severity than females. The item 'being envious' was a better indicator of NPD severity in males than in females. There were no clinically significant sex differences on the remaining NPD symptoms. Overall, our findings indicate substantial sex differences in narcissistic personality disorder symptom expression. Although our results may reflect sex-bias in diagnostic criteria, they are consistent with recent views suggesting that narcissistic personality disorder may be underpinned by shared and sex-specific mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Optimizing incomplete sample designs for item response model parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.

    Several models for optimizing incomplete sample designs with respect to information on the item parameters are presented. The following cases are considered: (1) known ability parameters; (2) unknown ability parameters; (3) item sets with multiple ability scales; and (4) response models with

  10. Algorithmic test design using classical item parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.; Adema, Jos J.

    Two optimalization models for the construction of tests with a maximal value of coefficient alpha are given. Both models have a linear form and can be solved by using a branch-and-bound algorithm. The first model assumes an item bank calibrated under the Rasch model and can be used, for instance,

  11. Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

    Science.gov (United States)

    Lee, Yi-Hsuan; Zhang, Jinming

    2017-01-01

    Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

  12. Are symptom features of depression during pregnancy, the postpartum period and outside the peripartum period distinct? Results from a nationally representative sample using item response theory (IRT).

    Science.gov (United States)

    Hoertel, Nicolas; López, Saioa; Peyre, Hugo; Wall, Melanie M; González-Pinto, Ana; Limosin, Frédéric; Blanco, Carlos

    2015-02-01

    Whether there are systematic differences in depression symptom expression during pregnancy, the postpartum period and outside these periods (i.e., outside the peripartum period) remains debated. The aim of this study was to use methods based on item response theory (IRT) to examine, after equating for depression severity, differences in the likelihood of reporting DSM-IV symptoms of major depressive episode (MDE) in women of childbearing age (i.e., aged 18-50) during pregnancy, the postpartum period and outside the peripartum period. We conducted these analyses using a large, nationally representative sample of women of childbearing age from the United States (n = 11,256) who participated in the second wave of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). The overall 12-month prevalence of all depressive criteria (except for worthlessness/guilt) was significantly lower in pregnant women than in women of childbearing age outside the peripartum period, whereas the prevalence of all symptoms (except for "psychomotor symptoms") was not significantly different between the postpartum and the nonperipartum group. There were no clinically significant differences in the endorsement rates of symptoms of MDE by pregnancy status when equating for levels of depression severity. This study suggests that the clinical presentation of depressive symptoms in women of childbearing age does not differ during pregnancy, the postpartum period and outside the peripartum period. These findings do not provide psychometric support for the inclusion of the peripartum onset specifier for major depressive disorder in the DSM-5. © 2014 Wiley Periodicals, Inc.

  13. Comparing the IRT Pre-equating and Section Pre-equating: A Simulation Study.

    Science.gov (United States)

    Hwang, Chi-en; Cleary, T. Anne

    The results obtained from two basic types of pre-equatings of tests were compared: the item response theory (IRT) pre-equating and section pre-equating (SPE). The simulated data were generated from a modified three-parameter logistic model with a constant guessing parameter. Responses of two replication samples of 3000 examinees on two 72-item…

  14. A simple and fast item selection procedure for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, W.J.J.; Veerkamp, Wim J.J.; Berger, Martijn; Berger, Martijn P.F.

    1994-01-01

    Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item
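
    The opening claim of this abstract can be checked numerically with the standard item information function of the three-parameter logistic model. The sketch below is an illustration, not the paper's procedure; it omits the scaling constant D, and the item parameters are hypothetical.

```python
import math

def info_3pl(theta, a, b, c):
    """Item information for the 3PL model (no scaling constant D).

    a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    p = c + (1.0 - c) * logistic          # probability of a correct response
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# Two hypothetical items with the same difficulty and guessing parameter
high_a = dict(a=2.0, b=0.0, c=0.2)
low_a = dict(a=1.0, b=0.0, c=0.2)

# Ability equal to the item difficulty: the more discriminating item wins
at_difficulty = (info_3pl(0.0, **high_a), info_3pl(0.0, **low_a))
# Ability two units below the difficulty: the less discriminating item wins
far_below = (info_3pl(-2.0, **high_a), info_3pl(-2.0, **low_a))
```

    With these values, the item with a = 2 is more informative at theta = b, while the item with a = 1 is more informative two units away, which is the kind of reversal the abstract refers to.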

  15. Finite Mixture Multilevel Multidimensional Ordinal IRT Models for Large Scale Cross-Cultural Research

    Science.gov (United States)

    de Jong, Martijn G.; Steenkamp, Jan-Benedict E. M.

    2010-01-01

    We present a class of finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Our model is proposed for confirmatory research settings. Our prior for item parameters is a mixture distribution to accommodate situations where different groups of countries have different measurement operations, while…

  16. Criteria for eliminating items of a Test of Figural Analogies

    Directory of Open Access Journals (Sweden)

    Diego Blum

    2013-12-01

    This paper describes the steps taken to eliminate two of the items in a Test of Figural Analogies (TFA). The main guidelines of psychometric analysis concerning Classical Test Theory (CTT) and Item Response Theory (IRT) are explained. The item elimination process was based on both the study of the CTT difficulty and discrimination index, and the unidimensionality analysis. The a, b, and c parameters of the Three Parameter Logistic Model of IRT were also considered for this purpose, as well as the assessment of each item fitting this model. The unfavourable characteristics of a group of TFA items are detailed, and decisions leading to their possible elimination are discussed.
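
    The two CTT indices mentioned here can be sketched for dichotomous (0/1) items as follows. The abstract does not say which discrimination index was used; the corrected item-rest correlation below is one common choice and is an assumption, as are the function name and the toy data.

```python
import numpy as np

def ctt_item_stats(responses):
    """Classical difficulty and discrimination for 0/1 scored items.

    responses: persons x items array of 0/1 scores.
    Returns (difficulty, discrimination), where difficulty is the
    proportion correct and discrimination is the correlation between
    the item and the total score on the remaining items.
    """
    x = np.asarray(responses, dtype=float)
    difficulty = x.mean(axis=0)
    total = x.sum(axis=1)
    discrimination = []
    for j in range(x.shape[1]):
        rest = total - x[:, j]            # exclude the item from its own total
        discrimination.append(np.corrcoef(x[:, j], rest)[0, 1])
    return difficulty, np.array(discrimination)

# Toy data: four hypothetical examinees, three items
scores = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
difficulty, discrimination = ctt_item_stats(scores)
```

    Items with very high or very low proportion correct, or with low item-rest correlation, are the usual candidates for review under CTT.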

  17. IRT-based test construction

    OpenAIRE

    van der Linden, Willem J.; Theunissen, T.J.J.M.; Boekkooi-Timminga, Ellen; Kelderman, Henk

    1987-01-01

    Four discussions of test construction based on item response theory (IRT) are presented. The first discussion, "Test Design as Model Building in Mathematical Programming" (T.J.J.M. Theunissen), presents test design as a decision process under certainty. A natural way of modeling this process leads to mathematical programming. General models of test construction are discussed, with information about algorithms and heuristics; ideas about the analysis and refinement of test constraints are also...

  18. Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

    Science.gov (United States)

    Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

    2013-01-01

    In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…

  19. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  20. A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)

    Science.gov (United States)

    Arenson, Ethan A.; Karabatsos, George

    2017-01-01

    Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…

  1. Extending item response theory to online homework

    Directory of Open Access Journals (Sweden)

    Gerd Kortemeyer

    2014-05-01

    Item response theory (IRT) becomes an increasingly important tool when analyzing “big data” gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over a wide range with respect to model assumptions and introduced noise. Item difficulty is also robust, but over a narrower range.

  2. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers

    Directory of Open Access Journals (Sweden)

    Stochl Jan

    2012-06-01

    Background: Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods: Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions: After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12), when binary scored, were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales). An illustration of ordinal item analysis

  3. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers.

    Science.gov (United States)

    Stochl, Jan; Jones, Peter B; Croudace, Tim J

    2012-06-11

    Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12)--when binary scored--were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech's "well-being" and "distress" clinical scales). An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental
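
    The abstract does not name the scalability statistic it relies on; Loevinger's H is the coefficient usually associated with Mokken scaling, and a minimal sketch for binary items is given below on that assumption. The function name and the toy response patterns are hypothetical, and this is not a substitute for dedicated Mokken software.

```python
import numpy as np
from itertools import combinations

def loevinger_h(responses):
    """Scale-level Loevinger's H for 0/1 items.

    H is the sum of observed inter-item covariances divided by the sum
    of the maximum covariances attainable given the item means.
    """
    x = np.asarray(responses, dtype=float)
    p = x.mean(axis=0)                    # item popularities
    observed = 0.0
    maximum = 0.0
    for i, j in combinations(range(x.shape[1]), 2):
        observed += np.mean(x[:, i] * x[:, j]) - p[i] * p[j]
        lo, hi = min(p[i], p[j]), max(p[i], p[j])
        # Largest possible covariance of two binary items with these means
        maximum += lo * (1.0 - hi)
    return observed / maximum

# A perfect Guttman pattern: every respondent who endorses a harder item
# also endorses all easier ones
guttman = [[1, 1, 1],
           [1, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
# Two items that are unrelated in the sample
unrelated = [[1, 1],
             [1, 0],
             [0, 1],
             [0, 0]]
h_guttman = loevinger_h(guttman)
h_unrelated = loevinger_h(unrelated)
```

    The perfect Guttman pattern gives H = 1 and the unrelated items give H = 0, the two reference points against which observed scalability is usually read.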

  4. Effects of Misbehaving Common Items on Aggregate Scores and an Application of the Mantel-Haenszel Statistic in Test Equating. CSE Report 688

    Science.gov (United States)

    Michaelides, Michalis P.

    2006-01-01

    Consistent behavior is a desirable characteristic that common items are expected to have when administered to different groups. Findings from the literature have established that items do not always behave in consistent ways; item indices and IRT item parameter estimates of the same items differ when obtained from different administrations.…

  5. The Impact of Three Factors on the Recovery of Item Parameters for the Three-Parameter Logistic Model

    Science.gov (United States)

    Kim, Kyung Yong; Lee, Won-Chan

    2017-01-01

    This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…

  6. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1996-01-01

    In this paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or C. R. Rao's efficient score test. The test is presented in the framework of a number of item response theory (IRT) models such as the Rasch model, the one-parameter logistic model, the

  7. Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

    Science.gov (United States)

    Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

    2016-03-12

    Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.

  8. Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly

    Science.gov (United States)

    Veldkamp, Bernard P.; Matteucci, Mariagiulia; de Jong, Martijn G.

    2013-01-01

    Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values,…

  9. A Note on the Item Information Function of the Four-Parameter Logistic Model

    Science.gov (United States)

    Magis, David

    2013-01-01

    This article focuses on the four-parameter logistic (4PL) model as an extension of the usual three-parameter logistic (3PL) model with an upper asymptote possibly different from 1. For a given item with fixed item parameters, Lord derived the value of the latent ability level that maximizes the item information function under the 3PL model. The…
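    The quantities named in this abstract can be illustrated with a minimal sketch. It assumes the common logistic parameterization with discrimination a, difficulty b, lower asymptote c, and upper asymptote d, and no scaling constant (D = 1). The function names are hypothetical and are not taken from the article. The closed-form 3PL maximizer is the standard result usually attributed to Lord; the article's own 4PL result is not reproduced here.

```python
import math

def p_4pl(theta, a, b, c=0.0, d=1.0):
    """4PL response probability with lower asymptote c and upper asymptote d."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def info_4pl(theta, a, b, c=0.0, d=1.0):
    """Item information (P')^2 / (P * (1 - P)) for the 4PL model."""
    p = p_4pl(theta, a, b, c, d)
    # Derivative of P with respect to theta, written in terms of P itself.
    dp = a * (p - c) * (d - p) / (d - c)
    return dp * dp / (p * (1.0 - p))

def theta_max_3pl(a, b, c):
    """Closed-form ability level maximizing 3PL information (the d = 1 case)."""
    return b + math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / a
```

    Setting c = 0 and d = 1 reduces the sketch to the 2PL model, for which information peaks at theta = b with value a^2 / 4. A nonzero lower asymptote shifts the maximizer above b.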

  10. Algorithms for computerized test construction using classical item parameters

    NARCIS (Netherlands)

    Adema, Jos J.; van der Linden, Willem J.

    1989-01-01

    Recently, linear programming models for test construction were developed. These models were based on the information function from item response theory. In this paper another approach is followed. Two 0-1 linear programming models for the construction of tests using classical item and test

  11. The Effect of Error in Item Parameter Estimates on the Test Response Function Method of Linking.

    Science.gov (United States)

    Kaskowitz, Gary S.; De Ayala, R. J.

    2001-01-01

    Studied the effect of item parameter estimation for computation of linking coefficients for the test response function (TRF) linking/equating method. Simulation results showed that linking was more accurate when there was less error in the parameter estimates, and that 15 or 25 common items provided better results than 5 common items under both…

  12. Computing replenishment cycle policy parameters for a perishable item

    NARCIS (Netherlands)

    Rossi, R.; Tarim, S.A.; Hnich, B.; Prestwich, S.

    2010-01-01

    In many industrial environments there is a significant class of problems for which the perishable nature of the inventory cannot be ignored in developing replenishment order plans. Food is the most salient example of a perishable inventory item. In this work, we consider the periodic-review,

  13. Bayesian Estimation of the Logistic Positive Exponent IRT Model

    Science.gov (United States)

    Bolfarine, Heleno; Bazan, Jorge Luis

    2010-01-01

    A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…

  14. A MATLAB Package for Markov Chain Monte Carlo with a Multi-Unidimensional IRT Model

    Directory of Open Access Journals (Sweden)

    Yanyan Sheng

    2008-11-01

    Full Text Available Unidimensional item response theory (IRT) models are useful when each item is designed to measure some facet of a unified latent trait. In practical applications, items are not necessarily measuring the same underlying trait, and hence the more general multi-unidimensional model should be considered. This paper provides the requisite information and description of software that implements the Gibbs sampler for such models with two item parameters and a normal ogive form. The software developed is written in the MATLAB package IRTmu2no. The package is flexible enough to allow a user the choice to simulate binary response data with multiple dimensions, set the number of total or burn-in iterations, specify starting values or prior distributions for model parameters, check convergence of the Markov chain, as well as obtain Bayesian fit statistics. Illustrative examples are provided to demonstrate and validate the use of the software package.

  15. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  16. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  17. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1998-01-01

    Abstract: In the present paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or Rao’s efficient score test. The test is presented in the framework of a number of IRT models such as the Rasch model, the OPLM, the 2-parameter logistic model, the

  18. Item Response Theory: A Basic Concept

    Science.gov (United States)

    Mahmud, Jumailiyah

    2017-01-01

    With developments in computing technology, item response theory (IRT) has developed rapidly and has become a user-friendly application in the psychometrics world. The limitations of classical theory are one aspect that encourages the use of IRT. In this study, the basic concept of IRT will be discussed. In addition, it will briefly review the ability…

  19. Non-ignorable missingness item response theory models for choice effects in examinee-selected items.

    Science.gov (United States)

    Liu, Chen-Wei; Wang, Wen-Chung

    2017-11-01

    Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.

  20. Measurement Invariance in Careers Research: Using IRT to Study Gender Differences in Medical Students' Specialization Decisions

    Science.gov (United States)

    Behrend, Tara S.; Thompson, Lori Foster; Meade, Adam W.; Newton, Dale A.; Grayson, Martha S.

    2008-01-01

    The current study demonstrates the use of item response theory (IRT) to conduct measurement invariance analyses in careers research. A self-report survey was used to assess the importance 1,363 fourth-year medical students placed on opportunities to provide comprehensive patient care when choosing a career specialty. IRT analyses supported…

  1. Measuring individual significant change on the Beck Depression Inventory-II through IRT-based statistics.

    NARCIS (Netherlands)

    Brouwer, D.; Meijer, R.R.; Zevalkink, D.J.

    2013-01-01

    Several researchers have emphasized that item response theory (IRT)-based methods should be preferred over classical approaches in measuring change for individual patients. In the present study we discuss and evaluate the use of IRT-based statistics to measure statistically significant individual

  2. The Prediction of Item Parameters Based on Classical Test Theory and Latent Trait Theory

    Science.gov (United States)

    Anil, Duygu

    2008-01-01

    In this study, the predictive power of experts' predictions of item characteristics, for conditions in which try-out practices cannot be applied, was examined for item characteristics computed under classical test theory and the two-parameter logistic model of latent trait theory. The study was carried out on 9914 randomly selected students…

  3. A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

    Science.gov (United States)

    Yao, Lihua; Schwarz, Richard D.

    2006-01-01

    Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…

  4. Application of Item Response Theory to Tests of Substance-related Associative Memory

    Science.gov (United States)

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WAT, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051

  5. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
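    For orientation, the one-, two- and three-parameter logistic models mentioned in this abstract are commonly written as below. The notation is an assumption and is not taken from the paper: theta_j is the ability of person j, b_i the difficulty of item i, a_i its discrimination, and c_i its lower asymptote (pseudo-guessing). Some texts also include a scaling constant D = 1.7 in the exponent, which is omitted here.

```latex
\begin{align}
\text{1PL:}\quad P(U_{ij} = 1 \mid \theta_j) &= \frac{1}{1 + e^{-(\theta_j - b_i)}} \\
\text{2PL:}\quad P(U_{ij} = 1 \mid \theta_j) &= \frac{1}{1 + e^{-a_i(\theta_j - b_i)}} \\
\text{3PL:}\quad P(U_{ij} = 1 \mid \theta_j) &= c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta_j - b_i)}}
\end{align}
```

    Each model nests the previous one: setting c_i = 0 in the 3PL gives the 2PL, and additionally setting a_i = 1 gives the 1PL.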

  6. Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

    Science.gov (United States)

    Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

    2018-02-01

    Objectives (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design IRT evaluation of prospective cohort data. Setting Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items and were removed in stages, creating an 8- and 3-item Inner EAR scale for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.

  7. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    Science.gov (United States)

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
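    The original Lord and Wingersky recursion for number-correct scores, which this article generalizes, can be sketched as follows. This is a minimal illustration for dichotomous items only, not the generalized real-number-score algorithm of the article. The function name and the input format (a list of correct-response probabilities already evaluated at a fixed proficiency) are assumptions.

```python
def lord_wingersky(probs):
    """Return [P(X = 0), ..., P(X = n)] for the number-correct score X.

    probs[i] is the probability of a correct response to item i at a fixed
    proficiency level; items are assumed conditionally independent.
    """
    dist = [1.0]  # with zero items, the score is 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            new[score] += mass * (1.0 - p)   # item answered incorrectly
            new[score + 1] += mass * p       # item answered correctly
        dist = new
    return dist
```

    Items are added one at a time, so the cost grows quadratically in the number of items rather than exponentially, as it would if every response pattern were enumerated.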

  8. IRT-LR-DIF with Estimation of the Focal-Group Density as an Empirical Histogram

    Science.gov (United States)

    Woods, Carol M.

    2008-01-01

    Item response theory-likelihood ratio-differential item functioning (IRT-LR-DIF) is used to evaluate the degree to which items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Usually, the latent distribution is presumed normal for both…

  9. Optimal and Most Exact Confidence Intervals for Person Parameters in Item Response Theory Models

    Science.gov (United States)

    Doebler, Anna; Doebler, Philipp; Holling, Heinz

    2013-01-01

    The common way to calculate confidence intervals for item response theory models is to assume that the standardized maximum likelihood estimator for the person parameter [theta] is normally distributed. However, this approximation is often inadequate for short and medium test lengths. As a result, the coverage probabilities fall below the given…
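    The normal-approximation interval that this abstract describes as the common approach can be sketched as below. The sketch assumes a 2PL model with known item parameters and uses test information to obtain the standard error; the function names and the (a, b) tuple format are hypothetical. It does not implement the optimal intervals proposed in the article.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def wald_interval(theta_hat, items, z=1.96):
    """Normal-approximation interval theta_hat +/- z / sqrt(I(theta_hat)).

    items is a list of (a, b) pairs; I is the 2PL test information,
    the sum over items of a^2 * P * (1 - P).
    """
    info = 0.0
    for a, b in items:
        p = p_2pl(theta_hat, a, b)
        info += a * a * p * (1.0 - p)
    se = 1.0 / math.sqrt(info)
    return theta_hat - z * se, theta_hat + z * se
```

    With few items the information is small and the interval is wide, which is the short-test setting where the abstract reports the normal approximation to be inadequate.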

  10. An Introduction to Item Response Theory for Health Behavior Researchers

    Science.gov (United States)

    Warne, Russell T.; McKyer, E. J. Lisako; Smith, Matthew L.

    2012-01-01

    Objective: To introduce item response theory (IRT) to health behavior researchers by contrasting it with classical test theory and providing an example of IRT in health behavior. Method: Demonstrate IRT by fitting the 2PL model to substance-use survey data from the Adolescent Health Risk Behavior questionnaire (n = 1343 adolescents). Results: An…

  11. Item Response Theory: Overview, Applications, and Promise for Institutional Research

    Science.gov (United States)

    Bowman, Nicholas A.; Herzog, Serge; Sharkness, Jessica

    2014-01-01

    Item Response Theory (IRT) is a measurement theory that is ideal for scale and test development in institutional research, but it is not without its drawbacks. This chapter provides an overview of IRT, describes an example of its use, and highlights the pros and cons of using IRT in applied settings.

  12. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  13. The Impact of Varied Discrimination Parameters on Mixed-Format Item Response Theory Model Selection

    Science.gov (United States)

    Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.

    2013-01-01

    Whittaker, Chang, and Dodd compared the performance of model selection criteria when selecting among mixed-format IRT models and found that the criteria did not perform adequately when selecting the more parameterized models. It was suggested by M. S. Johnson that the problems when selecting the more parameterized models may be because of the low…

  14. Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

    Science.gov (United States)

    Marcoulides, Katerina M.

    2018-01-01

    This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…

  15. IRTPRO 2.1 for Windows (Item Response Theory for Patient-Reported Outcomes)

    Science.gov (United States)

    Paek, Insu; Han, Kyung T.

    2013-01-01

    This article reviews a new item response theory (IRT) model estimation program, IRTPRO 2.1, for Windows that is capable of unidimensional and multidimensional IRT model estimation for existing and user-specified constrained IRT models for dichotomously and polytomously scored item response data. (Contains 1 figure and 2 notes.)

  16. Relationships among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models

    Science.gov (United States)

    Kohli, Nidhi; Koran, Jennifer; Henn, Lisa

    2015-01-01

    There are well-defined theoretical differences between the classical test theory (CTT) and item response theory (IRT) frameworks. It is understood that in the CTT framework, person and item statistics are test- and sample-dependent. This is not the perception with IRT. For this reason, the IRT framework is considered to be theoretically superior…

  17. Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation.

    Science.gov (United States)

    Liegl, Gregor; Wahl, Inka; Berghöfer, Anne; Nolte, Sandra; Pieh, Christoph; Rose, Matthias; Fischer, Felix

    2016-03-01

    To investigate the validity of a common depression metric in independent samples. We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple, for example, using a Web application (http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Comparing Two Versions of the MEOCS Using Differential Item Functioning

    National Research Council Canada - National Science Library

    Truhon, Stephen

    2003-01-01

    ...) from item response theory (IRT). DIF was found for the majority of the 40 items examined, although in many cases the DIF indicated improvements in the revised items. Implications for these scales and for the use of IRT with the MEOCS are discussed.

  19. Does guessing matter? Differences between ability estimates from 2PL and 3PL IRT models in case of guessing

    Directory of Open Access Journals (Sweden)

    Tomasz Żółtak

    2015-09-01

    Full Text Available Modern approaches to measuring cognitive ability and testing knowledge frequently use multiple-choice items. These can be simply and rapidly scored without problems associated with rater subjectivity. Nevertheless, multiple-choice tests are often criticized owing to their vulnerability to guessing. In this paper the impact of guessing was examined using simulation. Ability estimates were obtained from the two IRT models commonly used for binary-scored items: the two-parameter logistic model and the three-parameter logistic model. The latter approach explicitly models guessing, whilst the former does not. Rather counter-intuitively, little difference was identified for point estimates of ability from the 2PLM and 3PLM. Nevertheless, it should be noted that difficulty and discrimination parameters are severely downwardly biased if a 2PLM is used to calibrate data generated by processes involving guessing. Estimated standard errors for ability estimates also differ considerably between these models.

  20. Effects of Initial Values and Convergence Criterion in the Two-Parameter Logistic Model When Estimating the Latent Distribution in BILOG-MG 3.

    Directory of Open Access Journals (Sweden)

    Ingo W Nader

    Full Text Available Parameters of the two-parameter logistic model are generally estimated via the expectation-maximization algorithm, which improves initial values for all parameters iteratively until convergence is reached. Effects of initial values are rarely discussed in item response theory (IRT), but initial values were recently found to affect item parameters when estimating the latent distribution with full non-parametric maximum likelihood. However, this method is rarely used in practice. Hence, the present study investigated effects of initial values on item parameter bias and on recovery of item characteristic curves in BILOG-MG 3, a widely used IRT software package. Results showed notable effects of initial values on item parameters. For tighter convergence criteria, effects of initial values decreased, but item parameter bias increased, and the recovery of the latent distribution worsened. For practical application, it is advised to use the BILOG default convergence criterion with appropriate initial values when estimating the latent distribution from data.

  1. Better assessment of physical function: item improvement is neglected but essential.

    Science.gov (United States)

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models

  2. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    Science.gov (United States)

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  3. IRT analyses of the Swedish Dark Triad Dirty Dozen

    Directory of Open Access Journals (Sweden)

    Danilo Garcia

    2018-03-01

    Full Text Available Background: The Dark Triad (i.e., Machiavellianism, narcissism, and psychopathy) can be captured quickly with 12 items using the Dark Triad Dirty Dozen (Jonason and Webster, 2010). Previous Item Response Theory (IRT) analyses of the original English Dark Triad Dirty Dozen have shown that all three subscales adequately tap into the dark domains of personality. The aim of the present study was to analyze the Swedish version of the Dark Triad Dirty Dozen using IRT. Method: 570 individuals (n males = 326, n females = 242, and 2 unreported), including university students and white-collar workers with an age range between 19 and 65 years, responded to the Swedish version of the Dark Triad Dirty Dozen (Garcia et al., 2017a,b). Results: Contrary to previous research, we found that the narcissism scale provided most information, followed by psychopathy, and finally Machiavellianism. Moreover, the psychopathy scale required a higher level of the latent trait for endorsement of its items than the narcissism and Machiavellianism scales. Overall, all items provided reasonable amounts of information and are thus effective for discriminating between individuals. The mean item discriminations (alphas) were 1.92 for Machiavellianism, 2.31 for narcissism, and 1.99 for psychopathy. Conclusion: This is the first study to provide IRT analyses of the Swedish version of the Dark Triad Dirty Dozen. Our findings add to a growing literature on the Dark Triad Dirty Dozen scale in different cultures and highlight psychometric characteristics, which can be used for comparative studies. Items tapping into psychopathy showed higher thresholds for endorsement than the other two scales. Importantly, the narcissism scale seems to provide more information about a lack of narcissism, perhaps mirroring cultural conditions. Keywords: Psychology, Psychiatry, Clinical psychology

  4. Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

    Science.gov (United States)

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

    2018-03-12

    Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.

  5. Measuring organizational effectiveness in information and communication technology companies using item response theory.

    Science.gov (United States)

    Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

    2012-01-01

    The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the point of view of the manager, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to guide the construction of the questionnaire items, which were submitted to specialists for evaluation. The approach proved viable for measuring the organizational effectiveness of ICT companies from the point of view of a manager using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and places items and respondents on a single scale, which is not possible when using other similar tools.

  6. Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

    Science.gov (United States)

    Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

    2015-07-01

    The Geriatric Anxiety Scale (GAS; Segal et al. (Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002)) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.

  7. Psychometric properties of the Global Operative Assessment of Laparoscopic Skills (GOALS) using item response theory.

    Science.gov (United States)

    Watanabe, Yusuke; Madani, Amin; Ito, Yoichi M; Bilgic, Elif; McKendy, Katherine M; Feldman, Liane S; Fried, Gerald M; Vassiliou, Melina C

    2017-02-01

    The extent to which each item assessed using the Global Operative Assessment of Laparoscopic Skills (GOALS) contributes to the total score remains unknown. The purpose of this study was to evaluate the level of difficulty and discriminative ability of each of the 5 GOALS items using item response theory (IRT). A total of 396 GOALS assessments for a variety of laparoscopic procedures over a 12-year time period were included. Threshold parameters of item difficulty and discrimination power were estimated for each item using IRT. The higher slope parameters seen with "bimanual dexterity" and "efficiency" are indicative of greater discriminative ability than "depth perception", "tissue handling", and "autonomy". IRT psychometric analysis indicates that the 5 GOALS items do not demonstrate uniform difficulty and discriminative power, suggesting that they should not be scored equally. "Bimanual dexterity" and "efficiency" seem to have stronger discrimination. Weighted scores based on these findings could improve the accuracy of assessing individual laparoscopic skills. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Combination of classical test theory (CTT) and item response theory (IRT) analysis to study the psychometric properties of the French version of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF).

    Science.gov (United States)

    Bourion-Bédès, Stéphanie; Schwan, Raymund; Epstein, Jonathan; Laprevote, Vincent; Bédès, Alex; Bonnet, Jean-Louis; Baumann, Cédric

    2015-02-01

    The study aimed to examine the construct validity and reliability of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF) according to both classical test and item response theories. The psychometric properties of the French version of this instrument were investigated in a cross-sectional, multicenter study. A total of 124 outpatients with a substance dependence diagnosis participated in the study. Psychometric evaluation included descriptive analysis, internal consistency, test-retest reliability, and validity. The dimensionality of the instrument was explored using a combination of the classical test, confirmatory factor analysis (CFA), and an item response theory analysis, the Person Separation Index (PSI), in a complementary manner. The results of the Q-LES-Q-SF revealed that the questionnaire was easy to administer and the acceptability was good. The internal consistency and the test-retest reliability were 0.9 and 0.88, respectively. All items were significantly correlated with the total score and the SF-12 used in the study. The CFA with one factor model was good, and for the unidimensional construct, the PSI was found to be 0.902. The French version of the Q-LES-Q-SF yielded valid and reliable clinical assessments of the quality of life for future research and clinical practice involving French substance abusers. In response to recent questioning regarding the unidimensionality or bidimensionality of the instrument and according to the underlying theoretical unidimensional construct used for its development, this study suggests the Q-LES-Q-SF as a one-dimension questionnaire in French QoL studies.

  9. A Comparison of Four Linear Equating Methods for the Common-Item Nonequivalent Groups Design Using Simulation Methods. ACT Research Report Series, 2013 (2)

    Science.gov (United States)

    Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu

    2013-01-01

    This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…

  10. An Item Response Theory–Based, Computerized Adaptive Testing Version of the MacArthur–Bates Communicative Development Inventory: Words & Sentences (CDI:WS)

    DEFF Research Database (Denmark)

    Makransky, Guido; Dale, Philip S.; Havmose, Philip

    2016-01-01

    precision. Method: Parent-reported vocabulary for the American CDI:WS norming sample consisting of 1461 children between the ages of 16 and 30 months was used to investigate the fit of the items to the 2-parameter logistic (2-PL) IRT model, and to simulate CDI-CAT versions with 400, 200, 100, 50, 25, 10 and 5...

  11. Practical Guide to Conducting an Item Response Theory Analysis

    Science.gov (United States)

    Toland, Michael D.

    2014-01-01

    Item response theory (IRT) is a psychometric technique used in the development, evaluation, improvement, and scoring of multi-item scales. This pedagogical article provides the necessary information needed to understand how to conduct, interpret, and report results from two commonly used ordered polytomous IRT models (Samejima's graded…

  12. Item parameters dissociate between expectation formats: A regression analysis of time-frequency decomposed EEG data

    Directory of Open Access Journals (Sweden)

    Irene Fernández Monsalve

    2014-08-01

    During language comprehension, semantic contextual information is used to generate expectations about upcoming items. This has been commonly studied through the N400 event-related potential (ERP), as a measure of facilitated lexical retrieval. However, the associative relationships in multi-word expressions (MWE) may enable the generation of a categorical expectation, leading to lexical retrieval before target word onset. Processing of the target word would thus reflect a target-identification mechanism, possibly indexed by a P3 ERP component. However, given their time overlap (200-500 ms post-stimulus onset), differentiating between N400/P3 ERP responses (averaged over multiple linguistically variable trials) is problematic. In the present study, we analyzed EEG data from a previous experiment, which compared ERP responses to highly expected words that were placed either in a MWE or a regular non-fixed compositional context, and to low predictability controls. We focused on oscillatory dynamics and regression analyses, in order to dissociate between the two contexts by modeling the electrophysiological response as a function of item-level parameters. A significant interaction between word position and condition was found in the regression model for power in a theta range (~7-9 Hz), providing evidence for the presence of qualitative differences between conditions. Power levels within this band were lower for MWE than compositional contexts when the target word appeared later on in the sentence, confirming that in the former lexical retrieval would have taken place before word onset. On the other hand, gamma-power (~50-70 Hz) was also modulated by predictability of the item in all conditions, which is interpreted as an index of a similar 'matching' sub-step for both types of contexts, binding an expected representation and the external input.

  13. An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

    Science.gov (United States)

    Ames, Allison J.; Penfield, Randall D.

    2015-01-01

    Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…

  14. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  15. A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

    Science.gov (United States)

    Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

    2018-02-23

    The validity of the CAGE has not yet been examined using item response theory (IRT) in the older adult population. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA), dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed that the overall scalability H index was 0.459, indicating a medium performing instrument. All items were found to be homogeneous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and the ones with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for two items across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and on the precision of the cut-off scores in the older adult population.

  16. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Science.gov (United States)

    Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.
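
    To make the graded response model mentioned in this abstract more concrete, the following sketch computes response-category probabilities for a single item as differences of adjacent cumulative 2PL curves, which is the usual Samejima formulation. The function name and all parameter values are hypothetical; nothing here reproduces the study's calibration.

```python
import math


def grm_category_probs(theta: float, a: float, thresholds: list[float]) -> list[float]:
    """Category probabilities for one item under the graded response model.

    theta:      respondent's latent trait level
    a:          item discrimination
    thresholds: increasing threshold parameters b_1 < ... < b_{K-1}
                for an item with K ordered response categories
    """
    # Cumulative probabilities P(X >= k), padded with P(X >= 0) = 1
    # and P(X >= K) = 0 so that adjacent differences cover all categories.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # P(X = k) = P(X >= k) - P(X >= k + 1)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
```

    With hypothetical thresholds `[-1.0, 1.0]` and `theta = 0.0`, the middle category is the most likely and the two outer categories are equally likely, which matches the symmetry of the thresholds around the respondent.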

  17. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    Directory of Open Access Journals (Sweden)

    Martine H P Crins

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.

  18. Psychometric properties for the Balanced Inventory of Desirable Responding: dichotomous versus polytomous conventional and IRT scoring.

    Science.gov (United States)

    Vispoel, Walter P; Kim, Han Yi

    2014-09-01

    [Correction Notice: An Erratum for this article was reported in Vol 26(3) of Psychological Assessment (see record 2014-16017-001). The mean, standard deviation and alpha coefficient originally reported in Table 1 should be 74.317, 10.214 and .802, respectively. The validity coefficients in the last column of Table 4 are affected as well. Correcting this error did not change the substantive interpretations of the results, but did increase the mean, standard deviation, alpha coefficient, and validity coefficients reported for the Honesty subscale in the text and in Tables 1 and 4. The corrected versions of Tables 1 and Table 4 are shown in the erratum.] Item response theory (IRT) models were applied to dichotomous and polytomous scoring of the Self-Deceptive Enhancement and Impression Management subscales of the Balanced Inventory of Desirable Responding (Paulhus, 1991, 1999). Two dichotomous scoring methods reflecting exaggerated endorsement and exaggerated denial of socially desirable behaviors were examined. The 1- and 2-parameter logistic models (1PLM, 2PLM, respectively) were applied to dichotomous responses, and the partial credit model (PCM) and graded response model (GRM) were applied to polytomous responses. For both subscales, the 2PLM fit dichotomous responses better than did the 1PLM, and the GRM fit polytomous responses better than did the PCM. Polytomous GRM and raw scores for both subscales yielded higher test-retest and convergent validity coefficients than did PCM, 1PLM, 2PLM, and dichotomous raw scores. Information plots showed that the GRM provided consistently high measurement precision that was superior to that of all other IRT models over the full range of both construct continuums. Dichotomous scores reflecting exaggerated endorsement of socially desirable behaviors provided noticeably weak precision at low levels of the construct continuums, calling into question the use of such scores for detecting instances of "faking bad." Dichotomous

  19. Parameter Recovery for the 1-P HGLLM with Non-Normally Distributed Level-3 Residuals

    Science.gov (United States)

    Kara, Yusuf; Kamata, Akihito

    2017-01-01

    A multilevel Rasch model using a hierarchical generalized linear model is one approach to multilevel item response theory (IRT) modeling and is referred to as a one-parameter hierarchical generalized linear logistic model (1-P HGLLM). Although it has the flexibility to model nested structure of data with covariates, the model assumes the normality…

  20. Conscientiousness at the workplace: Applying mixture IRT to investigate scalability and predictive validity

    NARCIS (Netherlands)

    Egberink, I.J.L.; Meijer, R.R.; Veldkamp, Bernard P.

    2010-01-01

    Mixture item response theory (IRT) models have been used to assess multidimensionality of the construct being measured and to detect different response styles for different groups. In this study a mixture version of the graded response model was applied to investigate scalability and predictive

  1. Conscientiousness in the workplace : Applying mixture IRT to investigate scalability and predictive validity

    NARCIS (Netherlands)

    Egberink, I.J.L.; Meijer, R.R.; Veldkamp, B.P.

    Mixture item response theory (IRT) models have been used to assess multidimensionality of the construct being measured and to detect different response styles for different groups. In this study a mixture version of the graded response model was applied to investigate scalability and predictive

  2. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  3. Applicability of Item Response Theory to the Korean Nurses' Licensing Examination

    Directory of Open Access Journals (Sweden)

    Geum-Hee Jeong

    2005-06-01

    To test the applicability of item response theory (IRT) to the Korean Nurses' Licensing Examination (KNLE), item analysis was performed after testing the unidimensionality and goodness-of-fit. The results were compared with those based on classical test theory. The results of the 330-item KNLE administered to 12,024 examinees in January 2004 were analyzed. Unidimensionality was tested using DETECT and the goodness-of-fit was tested using WINSTEPS for the Rasch model and Bilog-MG for the two-parameter logistic model. Item analysis and ability estimation were done using WINSTEPS. Using DETECT, Dmax ranged from 0.1 to 0.23 for each subject. The mean square value of the infit and outfit values of all items using WINSTEPS ranged from 0.1 to 1.5, except for one item in pediatric nursing, which scored 1.53. Of the 330 items, 218 (42.7%) were misfit using the two-parameter logistic model of Bilog-MG. The correlation coefficients between the difficulty parameter using the Rasch model and the difficulty index from classical test theory ranged from 0.9039 to 0.9699. The correlation between the ability parameter using the Rasch model and the total score from classical test theory ranged from 0.9776 to 0.9984. Therefore, the results of the KNLE fit unidimensionality and goodness-of-fit for the Rasch model. The KNLE should be a good sample for analysis according to the IRT Rasch model, so further research using IRT is possible.
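
    The infit and outfit mean squares reported in this abstract can be illustrated with the textbook definitions for a dichotomous Rasch item. The sketch below is not the WINSTEPS implementation; it assumes person measures and the item difficulty have already been estimated, and all names are hypothetical.

```python
import math


def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses: list[int], thetas: list[float], b: float) -> tuple[float, float]:
    """Return (infit, outfit) mean squares for one item.

    responses: 0/1 scores of each person on the item
    thetas:    person measures, in the same order as responses
    b:         item difficulty
    """
    z2_sum = 0.0      # sum of squared standardized residuals
    resid2_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0     # sum of model variances
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        var = p * (1.0 - p)
        resid2 = (x - p) ** 2
        z2_sum += resid2 / var
        resid2_sum += resid2
        var_sum += var
    outfit = z2_sum / len(responses)  # unweighted mean square
    infit = resid2_sum / var_sum      # information-weighted mean square
    return infit, outfit
```

    Both statistics have an expected value near 1 when the data fit the model. Outfit is more sensitive to unexpected responses from persons located far from the item, because each squared residual is divided by a small model variance.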

  4. Psychometric aspects of item mapping for criterion-referenced interpretation and bookmark standard setting.

    Science.gov (United States)

    Huynh, Huynh

    2010-01-01

    Locating an item on an achievement continuum (item mapping) is well-established in technical work for educational/psychological assessment. Applications of item mapping may be found in criterion-referenced (CR) testing (or scale anchoring, Beaton and Allen, 1992; Huynh, 1994, 1998a, 2000a, 2000b, 2006), computer-assisted testing, test form assembly, and in standard setting methods based on ordered test booklets. These methods include the bookmark standard setting originally used for the CTB/TerraNova tests (Lewis, Mitzel, Green, and Patz, 1999), the item descriptor process (Ferrara, Perie, and Johnson, 2002) and a similar process described by Wang (2003) for multiple-choice licensure and certification examinations. While item response theory (IRT) models such as the Rasch and two-parameter logistic (2PL) models traditionally place a binary item at its location, Huynh has argued in the cited papers that such mapping may not be appropriate in selecting items for CR interpretation and scale anchoring.

  5. An NCME Instructional Module on Polytomous Item Response Theory Models

    Science.gov (United States)

    Penfield, Randall David

    2014-01-01

    A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

  6. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  7. DEVELOPING CRITICAL THINKING TEST UTILISING ITEM RESPONSE THEORY (PENGEMBANGAN TES BERPIKIR KRITIS DENGAN PENDEKATAN ITEM RESPONSE THEORY)

    Directory of Open Access Journals (Sweden)

    Fajrianthi Fajrianthi

    2016-06-01

    This study aims to produce a measurement instrument (test) of critical thinking that is valid and reliable for use in both educational and work settings in Indonesia. The research stages followed the test development steps of Hambleton and Jones (1993). The test blueprint and item writing were based on the concepts in the Watson-Glaser Critical Thinking Appraisal (WGCTA). In the WGCTA, critical thinking comprises five dimensions: Inference, Recognition Assumption, Deduction, Interpretation, and Evaluation of arguments. The test was tried out on 1,453 participants in employee selection tests in Surabaya, Gresik, Tuban, Bojonegoro, and Rembang. The dichotomous data were analyzed using an IRT model with two parameters, namely item discrimination and item difficulty. The analysis was carried out with the statistical program Mplus version 6.11. Before the IRT analysis, the assumptions were tested: unidimensionality, local independence, and the Item Characteristic Curve (ICC). Analysis of the 68 items yielded 15 items with reasonably good discrimination and item difficulties ranging from -4 to 2.448. The small number of good-quality items was attributed to weaknesses in selecting subject matter experts in critical thinking and in the choice of scoring method. Keywords: test development, critical thinking, item response theory. DEVELOPING CRITICAL THINKING TEST UTILISING ITEM RESPONSE THEORY. Abstract: The present study aimed to develop a valid and reliable instrument for assessing critical thinking which can be implemented in both educational and work settings in Indonesia. Following Hambleton and Jones's (1993) procedures on test development, the study developed the instrument by employing the concept of critical thinking from the Watson-Glaser Critical Thinking Appraisal (WGCTA). The study included five dimensions of critical thinking as adopted from the WGCTA: Inference, Recognition

  8. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey

    Directory of Open Access Journals (Sweden)

    Tsair-Wei Chien

    2017-01-01

    Background: Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses item response theory's (IRT) probabilistic modeling to present graphical presentations for interpreting CIR results. A computer module that is programmed to deal with CIRs is required. The aims were to present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data in order to show how the CIR can be applied to healthcare settings, with an example regarding a safety attitude survey. Methods: Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model's expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. Results: The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by professional Rasch Winsteps software. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and show the CIR advantage over classic test theory. Conclusions: Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study CIR module to deal with continuous variables to benefit comparisons of data with a logistic distribution and model fit statistics.

  9. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey.

    Science.gov (United States)

    Chien, Tsair-Wei; Shao, Yang; Kuo, Shu-Chun

    2017-01-10

    Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses item response theory's (IRT) probabilistic modeling to present graphical presentations for interpreting CIR results. A computer module that is programmed to deal with CIRs is required. The aims were to present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data in order to show how the CIR can be applied to healthcare settings, with an example regarding a safety attitude survey. Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model's expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by professional Rasch Winsteps software. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and show the CIR advantage over classic test theory. Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study CIR module to deal with continuous variables to benefit comparisons of data with a logistic distribution and model fit statistics.

  10. Translation Fidelity of Psychological Scales: An Item Response Theory Analysis of an Individualism-Collectivism Scale.

    Science.gov (United States)

    Bontempo, Robert

    1993-01-01

    Describes a method for assessing the quality of translations based on item response theory (IRT). Results from the IRT technique with French and Chinese versions of a scale measuring individualism-collectivism for samples of 250 U.S., 357 French, and 290 Chinese undergraduates show how several biased items are detected. (SLD)

  11. A signal detection-item response theory model for evaluating neuropsychological measures.

    Science.gov (United States)

    Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

    2018-02-05

    Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory, which permits the modeling of item difficulty and examinee ability, and from signal detection theory, which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the

  12. Marginal Maximum Likelihood Estimation of Item Response Models in R

    Directory of Open Access Journals (Sweden)

    Matthew S. Johnson

    2007-02-01

    Full Text Available Item response theory (IRT) models are a class of statistical models used by researchers to describe the response behaviors of individuals to a set of categorically scored items. The most common IRT models can be classified as generalized linear fixed- and/or mixed-effect models. Although IRT models appear most often in the psychological testing literature, researchers in other fields have successfully utilized IRT-like models in a wide variety of applications. This paper discusses the three major methods of estimation in IRT and develops R functions utilizing the built-in capabilities of the R environment to find the marginal maximum likelihood estimates of the generalized partial credit model. The currently available R package ltm is also discussed.
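
The idea behind marginal maximum likelihood can be sketched in a few lines. The Python example below is not the paper's R code and does not implement the generalized partial credit model; as a simplifying assumption it uses a dichotomous two-parameter logistic (2PL) model with a standard normal ability distribution, and the function name `marginal_loglik` is hypothetical. It integrates ability out of the likelihood with Gauss-Hermite quadrature, which yields the quantity an MML routine would maximize over the item parameters.

```python
import numpy as np

def marginal_loglik(responses, a, b, n_nodes=21):
    """Marginal log-likelihood of 0/1 responses under a 2PL model.

    responses: persons x items array of 0/1 scores.
    a, b: item discrimination and difficulty arrays (illustrative 2PL,
    not the generalized partial credit model treated in the paper).
    Ability is assumed to follow a standard normal distribution.
    """
    x = np.asarray(responses, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Probabilists' Gauss-Hermite rule: weight function exp(-t^2 / 2).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / weights.sum()  # rescale so weights act as N(0, 1) probabilities
    # Probability of a correct response at each node: nodes x items.
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (nodes[:, None] - b[None, :])))
    # Log-likelihood of each person's pattern at each node: persons x nodes.
    loglik = x @ np.log(p).T + (1.0 - x) @ np.log(1.0 - p).T
    # Average over the ability distribution, then sum log-probabilities over persons.
    return float(np.log(np.exp(loglik) @ weights).sum())
```

An optimizer would then search over `a` and `b` to maximize this value; that step is omitted here.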

  13. Using Item Response Theory to Evaluate LSCI Learning Gains

    Science.gov (United States)

    Schlingman, Wayne M.; Prather, E. E.; Collaboration of Astronomy Teaching Scholars CATS

    2012-01-01

    Analyzing the data from the recent national study using the Light and Spectroscopy Concept Inventory (LSCI), this project uses Item Response Theory (IRT) to investigate the learning gains of students as measured by the LSCI. IRT provides a theoretical model to generate parameters accounting for students’ abilities. We use IRT to measure changes in students’ abilities to reason about light from pre- to post-instruction. Changes in students’ abilities are compared by classroom to better understand the learning that is taking place in classrooms across the country. We compare the average change in ability for each classroom to the Interactivity Assessment Score (IAS) to provide further insight into the prior results presented from this data set. This material is based upon work supported by the National Science Foundation under Grant No. 0715517, a CCLI Phase III Grant for the Collaboration of Astronomy Teaching Scholars (CATS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

  14. Sampling Variances and Covariances of Parameter Estimates in Item Response Theory.

    Science.gov (United States)

    1982-08-01

    substituting (15) into (16) and solving for k and K k = b b1 - o K , (17)k where b and b are means for m and r items, respectively. To find the variance...C5, and C12 were treated as known. We find that the standard errors of B1 to B5 are increased drastically by ignorance of C1 to C5; all...

  15. A deterministic inventory model for deteriorating items with selling price dependent demand and three-parameter Weibull distributed deterioration

    Directory of Open Access Journals (Sweden)

    Asoke Kumar Bhunia

    2014-06-01

    Full Text Available In this paper, an attempt is made to develop two inventory models for deteriorating items with variable demand dependent on the selling price and frequency of advertisement of items. In the first model, shortages are not allowed, whereas in the second they are allowed and partially backlogged at a variable rate dependent on the duration of waiting time up to the arrival of the next lot. In both models, the deterioration rate follows a three-parameter Weibull distribution and the transportation cost is considered explicitly for replenishing the order quantity. This cost is dependent on the lot size as well as the distance from the source to the destination. The corresponding models have been formulated and solved. Two numerical examples have been considered to illustrate the results, and the significant features of the results are discussed. Finally, based on these examples, the effects of different parameters on the initial stock level, shortage level (in the case of the second model only), and cycle length, along with the optimal profit, have been studied by sensitivity analyses taking one parameter at a time and keeping the other parameters the same.

  16. Evaluation of adding item-response theory analysis for evaluation of the European Board of Ophthalmology Diploma examination.

    Science.gov (United States)

    Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José

    2013-11-01

    To investigate whether introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) to be kept as examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually, but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter to be evaluated allowing comparison. While individual item performance analysis is worthwhile to undertake as secondary analysis, drawing final conclusions seems to be more difficult. Performance parameters need to be related, as shown by IRT analysis. Therefore, IRT analysis has
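
For readers unfamiliar with the KR-20 reliability coefficient cited above, the following Python sketch shows the usual textbook computation for items scored 0/1. It is an illustration only: the function name `kr20` is hypothetical, the population variance of total scores is an assumed convention (some texts use the sample variance), and the sketch does not reproduce the EBOD scoring with negative marking.

```python
import numpy as np

def kr20(scores):
    """Kuder-Richardson formula 20 for a persons x items matrix of 0/1 scores."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]                       # number of items
    p = x.mean(axis=0)                   # proportion of correct answers per item
    sum_item_var = (p * (1.0 - p)).sum() # sum of item variances p * q
    total_var = x.sum(axis=1).var()      # population variance of total scores (assumed convention)
    return k / (k - 1) * (1.0 - sum_item_var / total_var)

# Toy data: four examinees, three items (hypothetical values).
toy = [[1, 1, 1],
       [1, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
reliability = kr20(toy)
```

The coefficient rises when the item variances are small relative to the variance of the total scores, that is, when items tend to rank examinees consistently.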

  17. Future development of the research nuclear reactor IRT-2000 in Sofia

    International Nuclear Information System (INIS)

    Apostolov, T.G.

    1999-01-01

    The present paper presents a short description of the research reactor IRT-2000 Sofia, started in 1961 and operated for 28 years. Some items are considered, connected to the improvements made in the contemporary safety requirements and the unrealized project for modernization to 5 MW. Proposals are considered for reconstruction of reactor site to a 'reactor of low power' for education purposes and as a basis for the country's nuclear technology development. (author)

  18. Future development of the research nuclear reactor IRT-2000 in Sofia

    Energy Technology Data Exchange (ETDEWEB)

    Apostolov, T.G. [Institute for Nuclear Research and Nuclear Energy, BAS, Sofia (Bulgaria)]

    1999-07-01

    The present paper presents a short description of the research reactor IRT-2000 Sofia, started in 1961 and operated for 28 years. Some items are considered, connected to the improvements made in the contemporary safety requirements and the unrealized project for modernization to 5 MW. Proposals are considered for reconstruction of reactor site to a 'reactor of low power' for education purposes and as a basis for the country's nuclear technology development. (author)

  19. Measuring Anxiety in Visually-Impaired People: A Comparison between the Linear and the Nonlinear IRT Approaches

    Science.gov (United States)

    Ferrando, Pere J.; Pallero, Rafael; Anguiano-Carrasco, Cristina

    2013-01-01

    The present study has two main interests. First, some pending issues about the psychometric properties of the CTAC (an anxiety questionnaire for blind and visually-impaired people) are assessed using item response theory (IRT). Second, the linear model is compared to the graded response model (GRM) in terms of measurement precision, sensitivity…

  20. Item response theory analysis of the Pain Self-Efficacy Questionnaire.

    Science.gov (United States)

    Costa, Daniel S J; Asghari, Ali; Nicholas, Michael K

    2017-01-01

    The Pain Self-Efficacy Questionnaire (PSEQ) is a 10-item instrument designed to assess the extent to which a person in pain believes s/he is able to accomplish various activities despite their pain. There is strong evidence for the validity and reliability of both the full-length PSEQ and a 2-item version. The purpose of this study is to further examine the properties of the PSEQ using an item response theory (IRT) approach. We used the two-parameter graded response model to examine the category probability curves, and location and discrimination parameters of the 10 PSEQ items. In item response theory, responses to a set of items are assumed to be probabilistically determined by a latent (unobserved) variable. In the graded-response model specifically, item response threshold (the value of the latent variable for which adjacent response categories are equally likely) and discrimination parameters are estimated for each item. Participants were 1511 mixed, chronic pain patients attending for initial assessment at a tertiary pain management centre. All items except item 7 ('I can cope with my pain without medication') performed well in IRT analysis, and the category probability curves suggested that participants used the 7-point response scale consistently. Items 6 ('I can still do many of the things I enjoy doing, such as hobbies or leisure activity, despite pain'), 8 ('I can still accomplish most of my goals in life, despite the pain') and 9 ('I can live a normal lifestyle, despite the pain') captured higher levels of the latent variable with greater precision. The results from this IRT analysis add to the body of evidence based on classical test theory illustrating the strong psychometric properties of the PSEQ. Despite the relatively poor performance of Item 7, its clinical utility warrants its retention in the questionnaire. The strong psychometric properties of the PSEQ support its use as an effective tool for assessing self-efficacy in people with pain
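
As a rough illustration of how category probability curves arise in a graded response model, the Python sketch below uses the standard Samejima formulation, in which each threshold is the latent value at which the probability of responding in that category or higher equals 0.5. The function name `grm_category_probs` and all parameter values are hypothetical; they are not estimates from the PSEQ study.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta: latent trait value; a: discrimination;
    thresholds: increasing threshold values b_1 < ... < b_{K-1}
    for an item with K ordered response categories.
    """
    b = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(X >= k) for k = 1..K-1.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then difference adjacent values.
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]

# Hypothetical item with a 7-point response scale (six thresholds).
probs = grm_category_probs(theta=0.0, a=1.5,
                           thresholds=[-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
```

Evaluating the function over a grid of `theta` values and plotting each column gives the category probability curves for the item.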

  1. Nuclear Research Center IRT reactor dynamics calculation

    International Nuclear Information System (INIS)

    Aleman Fernandez, J.R.

    1990-01-01

    The main features of the code DIRT for dynamic calculations are described in the paper. With the results obtained by the program, an analysis of the dynamic behaviour of the Research Reactor IRT of the Nuclear Research Center (CIN) is performed. Different transients were considered, such as variation of the system reactivity, variation of the coolant inlet temperature, and variations of the coolant velocity through the reactor core. 3 refs

  2. Radiation monitoring program at nuclear scientific experimental and educational center - IRT-Sofia

    International Nuclear Information System (INIS)

    Mladenov, A.; Stankov, D.; Marinov, K.; Nonova, T.; Krezhov, K.

    2012-01-01

    Ensuring minimal risk of personnel exposure without exceeding the dose limits is the main task of the General Program for Radiation Monitoring of Nuclear Scientific Experimental and Education Centre (NSEEC) with research reactor IRT. Since 2006 the IRT-Sofia is equipped with a new and modern Radiation Monitoring System (RMS). All RMS detectors are connected to the server RAMSYS. They have online (real-time) visualization in two workstations with RAMVISION software. The RMS allows the implementation of technological and environmental monitoring at the nuclear facility site. Environmental monitoring with the RMS external system includes monitoring of dose rate; alpha and beta activity; radon activity; Po-218, Po-214, Po-212 activity; gamma control of vehicles. Technological control of reactor gases includes: Alpha beta particulate monitor; Iodine monitor; Noble gases monitor; Stack flow monitor. The General Program based on the radiation monitoring system allows real-time monitoring and control of radiation parameters in the controlled area and provides for a high level of radiation protection of IRT staff and users of its facilities. This paper presents the technical and functional parameters of the radiation monitoring system and radiation protection activities within the restricted zone in IRT facilities. (authors)

  3. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  4. A scale purification procedure for evaluation of differential item functioning

    NARCIS (Netherlands)

    Khalid, Muhammad Naveed; Glas, Cornelis A.W.

    2014-01-01

    Item bias or differential item functioning (DIF) has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test

  5. Item Response Theory Models for Performance Decline during Testing

    Science.gov (United States)

    Jin, Kuan-Yu; Wang, Wen-Chung

    2014-01-01

    Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

  6. IRT - Sofia conversion feasibility study experience 2002-2009

    Energy Technology Data Exchange (ETDEWEB)

    Belousov, S.I.; Apostolov, T.G. [Institute for Nuclear Research and Nuclear Energy of Bulgarian Academy of Science, Tsarigradsko 72, 1784 Sofia (Bulgaria)]

    2010-07-01

    A joint conversion feasibility study concerning the IRT-Sofia research reactor between INRNE and the RERTR Program at ANL was initiated in 2002. The initial studies (up to 2006) were mainly focused on neutronics properties significant for reactor application and safety analyses. Thermal hydraulic and accident analyses, as well as the additional neutronics studies required, were performed after that (up to 2010). The obtained results show that the IRT-4M LEU fuel assemblies (19.75% ²³⁵U enrichment) are appropriate for IRT-Sofia conversion (IRT-Sofia was initially designed for the IRT-2M HEU fuel assemblies with 36% ²³⁵U enrichment). The results obtained in the framework of the joint study show that IRT-Sofia operation, even with only one pump in use in the primary circuit, meets all safety requirements at power levels up to 1000 kW and that safety is maintained for accident transients. The presented results of the analyses (neutronics, thermal hydraulic, and accident) and the accumulated experience for the IRT-Sofia will be useful for other research reactors where conversion from IRT-2M (HEU) to IRT-4M (LEU) fuel is underway and/or foreseen. (authors)

  7. Building an Evaluation Scale using Item Response Theory.

    Science.gov (United States)

    Lalor, John P; Wu, Hao; Yu, Hong

    2016-11-01

    Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
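
The closing claim, that equal accuracy need not mean equal IRT score, can be illustrated with a small Python sketch. It assumes a two-parameter logistic (2PL) model and a grid-search maximum likelihood ability estimate, which may differ from the model and estimator the authors used; the function names and item parameters are hypothetical.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def ml_ability(responses, a, b, grid=np.linspace(-4.0, 4.0, 801)):
    """Grid-search maximum likelihood ability estimate for one response pattern."""
    r = np.asarray(responses, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p = p_correct(grid[:, None], a[None, :], b[None, :])  # grid points x items
    loglik = (r * np.log(p) + (1.0 - r) * np.log(1.0 - p)).sum(axis=1)
    return float(grid[np.argmax(loglik)])

# Hypothetical test: two weakly and two strongly discriminating items.
a = np.array([0.5, 0.5, 2.0, 2.0])
b = np.array([-1.0, 0.0, 0.0, 1.0])
pattern_weak = [1, 1, 0, 0]    # correct only on the weakly discriminating items
pattern_strong = [0, 0, 1, 1]  # correct only on the strongly discriminating items

theta_weak = ml_ability(pattern_weak, a, b)
theta_strong = ml_ability(pattern_strong, a, b)
```

Both patterns have 50% accuracy, yet the second receives the higher ability estimate because its correct answers fall on the more discriminating items.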

  8. Screening for word reading and spelling problems in elementary school: An item response theory perspective

    NARCIS (Netherlands)

    Keuning, J.; Verhoeven, L.T.W.

    2008-01-01

    The purpose of the present study was to explore whether the Item Response Theory (IRT) provides a suitable framework to screen for word reading and spelling problems during the elementary school period. The following issues were addressed from an IRT perspective: (a) the dimensionality of word

  9. Extended Mixed-Effects Item Response Models with the MH-RM Algorithm

    Science.gov (United States)

    Chalmers, R. Philip

    2015-01-01

    A mixed-effects item response theory (IRT) model is presented as a logical extension of the generalized linear mixed-effects modeling approach to formulating explanatory IRT models. Fixed and random coefficients in the extended model are estimated using a Metropolis-Hastings Robbins-Monro (MH-RM) stochastic imputation algorithm to accommodate for…

  10. Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

    Science.gov (United States)

    Gordon, Rachel A.

    2015-01-01

    This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its…

  11. How Often Is the Misfit of Item Response Theory Models Practically Significant?

    Science.gov (United States)

    Sinharay, Sandip; Haberman, Shelby J.

    2014-01-01

    Standard 3.9 of the Standards for Educational and Psychological Testing (1999) demands evidence of model fit when item response theory (IRT) models are employed to data from tests. Hambleton and Han (2005) and Sinharay (2005) recommended the assessment of practical significance of misfit of IRT models, but…

  12. Validation of self-directed learning instrument and establishment of normative data for nursing students in Taiwan: using polytomous item response theory.

    Science.gov (United States)

    Cheng, Su-Fen; Lee-Hsieh, Jane; Turton, Michael A; Lin, Kuan-Chia

    2014-06-01

    Little research has investigated the establishment of norms for nursing students' self-directed learning (SDL) ability, recognized as an important capability for professional nurses. An item response theory (IRT) approach was used to establish norms for SDL abilities valid for the different nursing programs in Taiwan. The purposes of this study were (a) to use IRT with a graded response model to reexamine the SDL instrument, or the SDLI, originally developed by this research team using confirmatory factor analysis and (b) to establish SDL ability norms for the four different nursing education programs in Taiwan. Stratified random sampling with probability proportional to size was used. A minimum of 15% of students from the four different nursing education degree programs across Taiwan was selected. A total of 7,879 nursing students from 13 schools were recruited. The research instrument was the 20-item SDLI developed by Cheng, Kuo, Lin, and Lee-Hsieh (2010). IRT with the graded response model was used with a two-parameter logistic model (discrimination and difficulty) for the data analysis, calculated using MULTILOG. Norms were established using percentile rank. Analysis of item information and test information functions revealed that 18 items exhibited very high discrimination and two items had high discrimination. The test information function was higher in this range of scores, indicating greater precision in the estimate of nursing student SDL. Reliability fell between .80 and .94 for each domain and the SDLI as a whole. The total information function shows that the SDLI is appropriate for all nursing students, except for the top 2.5%. SDL ability norms were established for each nursing education program and for the nation as a whole. IRT is shown to be a potent and useful methodology for scale evaluation. The norms for SDL established in this research will provide practical standards for nursing educators and students in Taiwan.

  13. Extending Item Response Theory to Online Homework

    Science.gov (United States)

    Kortemeyer, Gerd

    2014-01-01

    Item response theory (IRT) becomes an increasingly important tool when analyzing "big data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for…

  14. Improving ability measurement in surveys by following the principles of IRT: The Wordsum vocabulary test in the General Social Survey.

    Science.gov (United States)

    Cor, M Ken; Haertel, Edward; Krosnick, Jon A; Malhotra, Neil

    2012-09-01

    Survey researchers often administer batteries of questions to measure respondents' abilities, but these batteries are not always designed in keeping with the principles of optimal test construction. This paper illustrates one instance in which following these principles can improve a measurement tool used widely in the social and behavioral sciences: the GSS's vocabulary test called "Wordsum". This ten-item test is composed of very difficult items and very easy items, and item response theory (IRT) suggests that the omission of moderately difficult items is likely to have handicapped Wordsum's effectiveness. Analyses of data from national samples of thousands of American adults show that after adding four moderately difficult items to create a 14-item battery, "Wordsumplus" (1) outperformed the original battery in terms of quality indicators suggested by classical test theory; (2) reduced the standard error of IRT ability estimates in the middle of the latent ability dimension; and (3) exhibited higher concurrent validity. These findings show how to improve Wordsum and suggest that analysts should use a score based on all 14 items instead of using the summary score provided by the GSS, which is based on only the original 10 items. These results also show more generally how surveys measuring abilities (and other constructs) can benefit from careful application of insights from the contemporary educational testing literature. Copyright © 2012 Elsevier Inc. All rights reserved.
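
The argument that moderately difficult items sharpen measurement in the middle of the ability range can be sketched with item information functions. The Python example below assumes a two-parameter logistic (2PL) model with made-up item parameters; it does not use the actual Wordsum items or the authors' estimates.

```python
import numpy as np

def item_information(theta, a, b):
    """Fisher information of 2PL items at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def standard_error(theta, a, b):
    """Standard error of the ability estimate: 1 / sqrt(test information)."""
    info = item_information(theta, np.asarray(a, float), np.asarray(b, float)).sum()
    return float(1.0 / np.sqrt(info))

# Hypothetical 10-item test made of very easy and very hard items only.
b10 = np.array([-2.5] * 5 + [2.5] * 5)
a10 = np.ones(10)
# Hypothetical 14-item test: the same items plus four moderately difficult ones.
b14 = np.concatenate([b10, [-0.5, 0.0, 0.0, 0.5]])
a14 = np.ones(14)

gain_middle = standard_error(0.0, a10, b10) - standard_error(0.0, a14, b14)
gain_high = standard_error(3.0, a10, b10) - standard_error(3.0, a14, b14)
```

Under these assumed parameters the standard error drops far more at the middle of the scale (`gain_middle`) than at a high ability level (`gain_high`), because a 2PL item is most informative near its own difficulty.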

  15. Applying Item Response Theory to the Development of a Screening Adaptation of the Goldman-Fristoe Test of Articulation-Second Edition

    Science.gov (United States)

    Brackenbury, Tim; Zickar, Michael J.; Munson, Benjamin; Storkel, Holly L.

    2017-01-01

    Purpose: Item response theory (IRT) is a psychometric approach to measurement that uses latent trait abilities (e.g., speech sound production skills) to model performance on individual items that vary by difficulty and discrimination. An IRT analysis was applied to preschoolers' productions of the words on the Goldman-Fristoe Test of…

  16. Item Response Theory Models for Wording Effects in Mixed-Format Scales

    Science.gov (United States)

    Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu

    2015-01-01

    Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…

  17. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...... that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  18. Comparison of Classical Test Theory and Item Response Theory in Individual Change Assessment

    NARCIS (Netherlands)

    Jabrayilov, Ruslan; Emons, Wilco H. M.; Sijtsma, Klaas

    2016-01-01

    Clinical psychologists are advised to assess clinical and statistical significance when assessing change in individual patients. Individual change assessment can be conducted using either the methodologies of classical test theory (CTT) or item response theory (IRT). Researchers have been optimistic

  19. Improved utilization of ADAS-cog assessment data through item response theory based pharmacometric modeling.

    Science.gov (United States)

    Ueckert, Sebastian; Plan, Elodie L; Ito, Kaori; Karlsson, Mats O; Corrigan, Brian; Hooker, Andrew C

    2014-08-01

    This work investigates improved utilization of ADAS-cog data (the primary outcome in Alzheimer's disease (AD) trials of mild and moderate AD) by combining pharmacometric modeling and item response theory (IRT). A baseline IRT model characterizing the ADAS-cog was built based on data from 2,744 individuals. Pharmacometric methods were used to extend the baseline IRT model to describe longitudinal ADAS-cog scores from an 18-month clinical study with 322 patients. Sensitivity of the ADAS-cog items in different patient populations as well as the power to detect a drug effect in relation to total score based methods were assessed with the IRT based model. IRT analysis was able to describe both total and item level baseline ADAS-cog data. Longitudinal data were also well described. Differences in the information content of the item level components could be quantitatively characterized and ranked for mild cognitive impairment and mild AD populations. Based on clinical trial simulations with a theoretical drug effect, the IRT method demonstrated a significantly higher power to detect drug effect compared to the traditional method of analysis. A combined framework of IRT and pharmacometric modeling permits a more effective and precise analysis than total score based methods and therefore increases the value of ADAS-cog data.

  20. A Comparison of Multidimensional Item Selection Methods in Simple and Complex Test Designs

    Directory of Open Access Journals (Sweden)

    Eren Halil ÖZBERK

    2017-03-01

    Full Text Available In contrast with previous studies, this study employed various test designs (simple and complex), which allow the evaluation of overall ability score estimates across multiple real test conditions. Four factors were manipulated, namely the test design, the number of items per dimension, the correlation between dimensions, and the item selection method. Using the generated item and ability parameters, dichotomous item responses were generated using the M3PL compensatory multidimensional IRT model with specified correlations. MCAT composite ability score accuracy was evaluated using the absolute bias (ABSBIAS), the correlation, and the root mean square error (RMSE) between true and estimated ability scores. The results suggest that the multidimensional test structure, the number of items per dimension, and the correlation between dimensions had significant effects on item selection methods for the overall score estimations. For the simple structure test design, V1 item selection had the lowest absolute bias for both long and short tests when estimating overall scores. As the model became more complex, the KL item selection method performed better than the other two item selection methods.
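
A simulation of this kind can be sketched in Python as follows. The sketch assumes the common compensatory M3PL form, P = c + (1 - c) / (1 + exp(-(a·θ + d))), and takes absolute bias to mean the average absolute difference between estimated and true scores; both are assumptions, since the abstract gives no formulas. All names and parameter values are hypothetical.

```python
import numpy as np

def m3pl_prob(theta, a, d, c):
    """Compensatory M3PL: persons x items matrix of correct-response probabilities.

    theta: persons x dims abilities; a: items x dims discriminations;
    d: item intercepts; c: item guessing parameters.
    """
    z = theta @ a.T + d
    return c + (1.0 - c) / (1.0 + np.exp(-z))

def simulate_responses(theta, a, d, c, rng):
    """Draw dichotomous responses from the model probabilities."""
    p = m3pl_prob(theta, a, d, c)
    return (rng.random(p.shape) < p).astype(int)

def rmse(true, est):
    """Root mean square error between true and estimated scores."""
    diff = np.asarray(est, float) - np.asarray(true, float)
    return float(np.sqrt(np.mean(diff**2)))

def abs_bias(true, est):
    """Mean absolute difference (assumed definition of ABSBIAS)."""
    diff = np.asarray(est, float) - np.asarray(true, float)
    return float(np.mean(np.abs(diff)))

rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.5], [0.5, 1.0]])            # assumed correlation between dimensions
theta = rng.multivariate_normal(np.zeros(2), corr, size=500)
a = np.array([[1.2, 0.0], [0.0, 1.0], [0.8, 0.6]])   # last item loads on both dimensions
d = np.array([0.0, -0.5, 0.5])
c = np.array([0.2, 0.2, 0.2])
responses = simulate_responses(theta, a, d, c, rng)
```

In a full study, ability estimates obtained from `responses` under each item selection method would be passed to `rmse` and `abs_bias` together with the true composite scores.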

  1. Increasing the Number of Replications in Item Response Theory Simulations: Automation through SAS and Disk Operating System

    Science.gov (United States)

    Gagne, Phill; Furlow, Carolyn; Ross, Terris

    2009-01-01

    In item response theory (IRT) simulation research, it is often necessary to use one software package for data generation and a second software package to conduct the IRT analysis. Because this can substantially slow down the simulation process, it is sometimes offered as a justification for using very few replications. This article provides…
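
    The article itself automates replications with SAS and DOS; the sketch below is not that method. It only illustrates, in Python and under a simple Rasch data-generating model chosen for illustration, how generation and analysis can sit in one loop so that the number of replications is cheap to raise. The "analysis" step here is a placeholder (proportion correct per item), and all names are hypothetical.

```python
import math
import random


def simulate_responses(thetas, difficulties, rng):
    """Generate 0/1 responses under a Rasch model (illustrative choice)."""
    return [
        [1 if rng.random() < 1.0 / (1.0 + math.exp(-(t - b))) else 0
         for b in difficulties]
        for t in thetas
    ]


def run_replications(n_reps, n_persons, difficulties, seed=0):
    """Run generation and a placeholder analysis n_reps times."""
    rng = random.Random(seed)  # seeded so the whole study is reproducible
    results = []
    for _ in range(n_reps):
        thetas = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
        data = simulate_responses(thetas, difficulties, rng)
        # Placeholder analysis: proportion correct for each item.
        results.append([
            sum(row[j] for row in data) / n_persons
            for j in range(len(difficulties))
        ])
    return results
```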

  2. A Comparison between Linear IRT Observed-Score Equating and Levine Observed-Score Equating under the Generalized Kernel Equating Framework

    Science.gov (United States)

    Chen, Haiwen

    2012-01-01

    In this article, linear item response theory (IRT) observed-score equating is compared under a generalized kernel equating framework with Levine observed-score equating for nonequivalent groups with anchor test design. Interestingly, these two equating methods are closely related despite being based on different methodologies. Specifically, when…

  3. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009--2010. The written assessment was developed in several formats such as the Constructed Response (CR) items, Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items.
Second, items in multiple formats can be selected to achieve the information criterion at all
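
    The first step described in this abstract, defining level boundaries as means of item thresholds, can be sketched as follows. The data layout (one list of thresholds per item, all of equal length) and the zero-based level numbering are assumptions made for illustration; the function names are hypothetical.

```python
def level_boundaries(item_thresholds):
    """Average the k-th threshold across items to get the k-th boundary.

    item_thresholds: list of per-item threshold lists on the IRT scale,
    each with one entry per boundary between adjacent levels.
    """
    n_items = len(item_thresholds)
    n_boundaries = len(item_thresholds[0])
    return [
        sum(item[k] for item in item_thresholds) / n_items
        for k in range(n_boundaries)
    ]


def classify(theta, boundaries):
    """Return the zero-based level: the number of boundaries at or below theta."""
    return sum(1 for b in boundaries if theta >= b)
```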

  4. Using Classical Test Theory and Item Response Theory to Evaluate the LSCI

    Science.gov (United States)

    Schlingman, Wayne M.; Prather, E. E.; Collaboration of Astronomy Teaching Scholars CATS

    2011-01-01

    Analyzing the data from the recent national study using the Light and Spectroscopy Concept Inventory (LSCI), this project uses both Classical Test Theory (CTT) and Item Response Theory (IRT) to investigate the LSCI itself in order to better understand what it is actually measuring. We use Classical Test Theory to form a framework of results that can be used to evaluate the effectiveness of individual questions at measuring differences in student understanding and provide further insight into the prior results presented from this data set. In the second phase of this research, we use Item Response Theory to form a theoretical model that estimates parameters for a student's ability, a question's difficulty, and the level of guessing. The combined results from our investigations using both CTT and IRT are used to better understand the learning that is taking place in classrooms across the country. The analysis will also allow us to evaluate the effectiveness of individual questions and determine whether the item difficulties are appropriately matched to the abilities of the students in our data set. These results may require that some questions be revised, motivating the need for further development of the LSCI. This material is based upon work supported by the National Science Foundation under Grant No. 0715517, a CCLI Phase III Grant for the Collaboration of Astronomy Teaching Scholars (CATS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
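
    The abstract does not say which CTT statistics were computed. As an illustration only, the sketch below shows two standard ones: item difficulty as the proportion of correct answers, and item discrimination as the point-biserial correlation between the item score and the total score. Function names are hypothetical.

```python
import math


def item_difficulty(item_scores):
    """CTT difficulty: proportion of examinees answering correctly (0/1 scores)."""
    return sum(item_scores) / len(item_scores)


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total score.

    Undefined (division by zero) if either variable has no variance.
    """
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores)) / n
    var_x = sum((x - mean_x) ** 2 for x in item_scores) / n
    var_y = sum((y - mean_y) ** 2 for y in total_scores) / n
    return cov / math.sqrt(var_x * var_y)
```

    In practice the total score is often computed with the item itself removed, so that the item does not correlate with itself; the sketch leaves that choice to the caller.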

  5. Using item response theory to investigate the structure of anticipated affect: do self-reports about future affective reactions conform to typical or maximal models?

    OpenAIRE

    Zampetakis, Leonidas A.; Lerakis, Manolis; Kafetsios, Konstantinos; Moustakis, Vassilis

    2015-01-01

    In the present research we used item response theory (IRT) to examine whether affective predictions (anticipated affect) conform to a typical (i.e., what people usually do) or a maximal behavior process (i.e., what people can do). The former corresponds to non-monotonic ideal point IRT models whereas the latter corresponds to monotonic dominance IRT models. A convenience, cross-sectional student sample (N=1624) was used. Participants were asked to report on anticipated positive and negative a...

  6. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.
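
    The precision threshold quoted in this abstract can be read through the standard relation between the standard error of a theta estimate and test information. The worked arithmetic below assumes the latent trait is scaled to unit variance, which the abstract does not state.

```latex
\[
\mathrm{SE}(\theta) = \frac{1}{\sqrt{I(\theta)}}
\quad\Longrightarrow\quad
I(\theta) = \frac{1}{0.32^{2}} \approx 9.77,
\qquad
\text{reliability} \approx 1 - \mathrm{SE}(\theta)^{2} = 1 - 0.1024 \approx 0.90 .
\]
```

    Under that assumption, SE ≤ .32 corresponds to a reliability of roughly .90 or better across the stated theta range.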

  7. Differential Item Functioning Analysis Using a Mixture 3-Parameter Logistic Model with a Covariate on the TIMSS 2007 Mathematics Test

    Science.gov (United States)

    Choi, Youn-Jeng; Alexeev, Natalia; Cohen, Allan S.

    2015-01-01

    The purpose of this study was to explore what may be contributing to differences in performance in mathematics on the Trends in International Mathematics and Science Study 2007. This was done by using a mixture item response theory modeling approach to first detect latent classes in the data and then to examine differences in performance on items…

  8. Item Banks for Substance Use from the Patient-Reported Outcomes Measurement Information System (PROMIS®): Severity of Use and Positive Appeal of Use*

    Science.gov (United States)

    Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis

    2015-01-01

    Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
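
    For readers unfamiliar with the graded response model mentioned in this abstract, a minimal sketch of its category probabilities follows. It uses the standard logistic form with one discrimination parameter and ordered thresholds per item; the parameter values in any call are illustrative, not the PROMIS calibrations, and the function name is hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Graded response model category probabilities.

    P(X >= k) = logistic(a * (theta - b_k)) for each threshold b_k, and
    category probabilities are differences of adjacent cumulative curves.
    thresholds must be in increasing order; an item with 5 response
    options has 4 thresholds.
    """
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]
```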

  9. Looking Closer at the Effects of Framing on Risky Choice: An Item Response Theory Analysis.

    Science.gov (United States)

    Sickar; Highhouse

    1998-07-01

    Item response theory (IRT) methodology allowed an in-depth examination of several issues that would be difficult to explore using traditional methodology. IRT models were estimated for 4 risky-choice items, answered by students under either a gain or loss frame. Results supported the typical framing finding of risk-aversion for gains and risk-seeking for losses but also suggested that a latent construct we label preference for risk was influential in predicting risky choice. Also, the Asian Disease item, most often used in framing research, was found to have anomalous statistical properties when compared to other framing items. Copyright 1998 Academic Press.

  10. Bayesian inference in an item response theory model with a generalized student t link function

    Science.gov (United States)

    Azevedo, Caio L. N.; Migon, Helio S.

    2012-10-01

    In this paper we introduce a new item response theory (IRT) model with a generalized Student t-link function with unknown degrees of freedom (df), named generalized t-link (GtL) IRT model. In this model we consider only the difficulty parameter in the item response function. GtL is an alternative to the two parameter logit and probit models, since the degrees of freedom (df) play a similar role to the discrimination parameter. However, the behavior of the curves of the GtL is different from those of the two parameter models and the usual Student t link, since in GtL the curve obtained from different df's can cross the probit curves in more than one latent trait level. The GtL model has similar properties to the generalized linear mixed models, such as the existence of sufficient statistics and easy parameter interpretation. Also, many techniques of parameter estimation, model fit assessment and residual analysis developed for those models can be used for the GtL model. We develop fully Bayesian estimation and model fit assessment tools through a Metropolis-Hastings step within a Gibbs sampling algorithm. We consider a prior sensitivity choice concerning the degrees of freedom. The simulation study indicates that the algorithm recovers all parameters properly. In addition, some Bayesian model fit assessment tools are considered. Finally, a real data set is analyzed using our approach and other usual models. The results indicate that our model fits the data better than the two parameter models.

  11. Time evolution of the energy confinement time, internal inductance and effective edge safety factor on IR-T1 tokamak

    International Nuclear Information System (INIS)

    Salar Elahi, A; Ghoranneviss, M

    2010-01-01

    An attempt is made to investigate the time evolution of the energy confinement time, internal inductance and effective edge safety factor on IR-T1 tokamak. For this purpose, four magnetic pickup coils were designed, constructed and installed on the outer surface of the IR-T1 and then the Shafranov parameter (asymmetry factor) was obtained from them. On the other hand, also a diamagnetic loop was designed and installed on IR-T1 and poloidal beta was determined from it. Therefore, the internal inductance and effective edge safety factor were measured. Also, the time evolution of the energy confinement time was measured using the diamagnetic loop. Experimental results on IR-T1 show that the maximum energy confinement time (which corresponds to minimum collisions, minimum microinstabilities and minimum transport) is at low values of the effective edge safety factor (2.5 eff (a) i <0.72). The results obtained are in agreement with those obtained with the theoretical approach [1-5].

  12. An item response theory analysis of Harter’s self-perception profile for children or why strong clinical scales should be distrusted

    NARCIS (Netherlands)

    Egberink, I.J.L.; Meijer, R.R.

    2011-01-01

    The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343)

  13. An item response theory analysis of Harter's Self-Perception Profile for Children or why strong clinical scales should be distrusted

    NARCIS (Netherlands)

    Egberink, Iris J. L.; Meijer, Rob R.

    The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343)

  14. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Martijn A H Oude Voshaar

    Full Text Available OBJECTIVE: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. METHODS: Data were collected from RA patients enrolled in the Dutch DREAM registry. An incomplete longitudinal anchored design was used where patients completed all 121 items of the item bank over the course of three waves of data collection. Item responses were fit to a generalized partial credit model adapted for longitudinal data and the item parameters were examined for differential item functioning (DIF) across country, age, and sex. RESULTS: In total, 690 patients participated in the study at time point 1 (T2, N = 489; T3, N = 311). The item bank could be successfully fitted to a generalized partial credit model, with the number of misfitting items falling within acceptable limits. Seven items demonstrated DIF for sex, while 5 items showed DIF for age in the Dutch RA sample. Twenty-five (20%) items were flagged for cross-cultural DIF compared to the US general population. However, the impact of observed DIF on total physical function estimates was negligible. DISCUSSION: The results of this study showed that the PROMIS PF item bank adequately fit a unidimensional IRT model which provides support for applications that require invariant estimates of physical function, such as computer adaptive testing and targeted short forms. More studies are needed to further investigate the cross-cultural applicability of the US-based PROMIS calibration and standardized metric.
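
    As background for the model named in this abstract, the sketch below computes category probabilities under the standard, cross-sectional generalized partial credit model. It does not include the longitudinal adaptation the authors describe, and the function name and parameter values are hypothetical.

```python
import math


def gpcm_category_probs(theta, a, steps):
    """Generalized partial credit model category probabilities.

    P(X = k) is proportional to exp(sum over v <= k of a * (theta - b_v)),
    with the empty sum for k = 0 equal to zero. steps holds the step
    parameters b_1..b_m for an item with m + 1 response categories.
    """
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

    With a single step parameter the model reduces to the two-parameter logistic model, which is a convenient sanity check.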

  15. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

    Science.gov (United States)

    Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

    2016-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity

  16. Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

    2017-01-01

    Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods: DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results: The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and

  17. A neutron radiography facility on the IRT-2000 reactor

    International Nuclear Information System (INIS)

    Khadduri, I.Y.

    1976-01-01

    A neutron radiography facility has been constructed on the thermal neutron channel of the IRT-2000 reactor. A collimated thermal neutron beam exposure area of 10 cm diameter is obtained with an L/D ratio of 48.8. The film used is cellulose nitrate coated with lithium tetraborate which is insensitive to gamma and beta radiation. Some pictures with good contrast and resolution have been obtained. Pictures of parts of an IRT-2000 reactor fuel pin have also been recorded.

  18. An item response theory analysis of the Psychological Inventory of Criminal Thinking Styles: comparing male and female probationers and prisoners.

    Science.gov (United States)

    Walters, Glenn D

    2014-09-01

    An item response theory (IRT) analysis of the Psychological Inventory of Criminal Thinking Styles (PICTS) was performed on 26,831 (19,067 male and 7,764 female) federal probationers and compared with results obtained on 3,266 (3,039 male and 227 female) prisoners from previous research. Despite the fact that male and female federal probationers scored significantly lower on the PICTS thinking style scales than male and female prisoners, discrimination and location parameter estimates for the individual PICTS items were comparable across sex and setting. Consistent with the results of a previous IRT analysis conducted on the PICTS, the current results did not support sentimentality as a component of general criminal thinking. Findings from this study indicate that the discriminative power of the individual PICTS items is relatively stable across sex (male, female) and correctional setting (probation, prison) and that the PICTS may be measuring the same criminal thinking construct in male and female probationers and prisoners. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  19. Response pattern of depressive symptoms among college students: What lies behind items of the Beck Depression Inventory-II?

    Science.gov (United States)

    de Sá Junior, Antonio Reis; de Andrade, Arthur Guerra; Andrade, Laura Helena; Gorenstein, Clarice; Wang, Yuan-Pang

    2018-07-01

    This study examines the response pattern of depressive symptoms in a nationwide student sample, through item analyses of a rating scale by both classical test theory (CTT) and item response theory (IRT). The 21-item Beck Depression Inventory-II (BDI-II) was administered to 12,711 college students. First, the psychometric properties of the scale were described. Thereafter, the endorsement probability of depressive symptom in each scale item was analyzed through CTT and IRT. Graphical plots depicted the endorsement probability of scale items and intensity of depression. Three items of different difficulty level were compared through CTT and IRT approach. Four in five students reported the presence of depressive symptoms. The BDI-II items presented good reliability and were distributed along the symptomatic continuum of depression. Similarly, in both CTT and IRT approaches, the item 'changes in sleep' was easily endorsed, 'loss of interest' moderately and 'suicidal thoughts' hardly. Graphical representation of BDI-II of both methods showed much equivalence in terms of item discrimination and item difficulty. The item characteristic curve of the IRT method provided informative evaluation of item performance. The inventory was applied only in college students. Depressive symptoms were frequent psychopathological manifestations among college students. The performance of the BDI-II items indicated convergent results from both methods of analysis. While the CTT was easy to understand and to apply, the IRT was more complex to understand and to implement. Comprehensive assessment of the functioning of each BDI-II item might be helpful in efficient detection of depressive conditions in college students. Copyright © 2018 Elsevier B.V. All rights reserved.

  20. Strategy for Sustainable Utilization of IRT-Sofia Research Reactor

    International Nuclear Information System (INIS)

    Mitev, M.; Apostolov, T.; Ilieva, K.; Belousov, S.; Nonova, T.

    2013-01-01

    The Research Reactor IRT-2000 in Sofia is in the process of reconstruction into a low-power reactor of 200 kW under the decision of the Council of Ministers of Republic of Bulgaria from 2001. The reactor will be utilized for development and preservation of nuclear science, skills, and knowledge; implementation of applied methods and research; education of students and training of graduated physicists and engineers in the field of nuclear science and nuclear energy; development of radiation therapy facility. Nuclear energy has a strategic place within the structure of the country’s energy system. In that aspect, the research reactor as a material base, and its scientific and technical personnel, represent a solid basis for the development of nuclear energy in our country. The acquired scientific experience and qualification in reactor operation are a precondition for the country's participation on equal terms in international cooperation, for closer alignment with European structures, and for safeguarding national interests. Therefore, the operation and use of the research reactor brings significant economic benefits for the country. For education of students in nuclear energy, reactor physics experiments for measurements of static and kinetic reactor parameters will be carried out on the research reactor. The research reactor as a national base will support training and applied research, keep up the good practice and the preparation of specialists who are able to monitor radioactivity sources, to develop new methods for detection of low quantities of radioactive isotopes which are hard to find, for deactivation and personal protection. The reactor will be used for production of isotopes needed for medical therapy and diagnostics; it will be the neutron source in element activation analysis having a number of applications in industrial production, medicine, chemistry, criminology, etc. The reactor operation will increase the public understanding, confidence

  1. MCMC estimation of multidimensional IRT models

    NARCIS (Netherlands)

    Beguin, Anton; Glas, Cornelis A.W.

    1998-01-01

    A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization to a model with multidimensional ability parameters are discussed. The procedure is a generalization of a procedure by J. Albert (1992) for estimating the two-parameter normal ogive model. The procedure will

  2. Bad Questions: An Essay Involving Item Response Theory

    Science.gov (United States)

    Thissen, David

    2016-01-01

    David Thissen, a professor in the Department of Psychology and Neuroscience, Quantitative Program at the University of North Carolina, has consulted and served on technical advisory committees for assessment programs that use item response theory (IRT) over the past couple decades. He has come to the conclusion that there are usually two purposes…

  3. Goodness-of-Fit Assessment of Item Response Theory Models

    Science.gov (United States)

    Maydeu-Olivares, Alberto

    2013-01-01

    The article provides an overview of goodness-of-fit assessment methods for item response theory (IRT) models. It is now possible to obtain accurate "p"-values of the overall fit of the model if bivariate information statistics are used. Several alternative approaches are described. As the validity of inferences drawn on the fitted model…

  4. Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

    Science.gov (United States)

    Andersson, Björn; Xin, Tao

    2018-01-01

    In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…

  5. Item Response Theory as an Efficient Tool to Describe a Heterogeneous Clinical Rating Scale in De Novo Idiopathic Parkinson's Disease Patients.

    Science.gov (United States)

    Buatois, Simon; Retout, Sylvie; Frey, Nicolas; Ueckert, Sebastian

    2017-10-01

    This manuscript aims to precisely describe the natural disease progression of Parkinson's disease (PD) patients and evaluate approaches to increase the drug effect detection power. An item response theory (IRT) longitudinal model was built to describe the natural disease progression of 423 de novo PD patients followed for 48 months while taking into account the heterogeneous nature of the MDS-UPDRS. Clinical trial simulations were then used to compare drug effect detection power from IRT and sum of item scores based analysis under different analysis endpoints and drug effects. The IRT longitudinal model accurately describes the evolution of patients with and without PD medications while estimating different progression rates for the subscales. When comparing analysis methods, the IRT-based one consistently provided the highest power. IRT is a powerful tool which makes it possible to capture the heterogeneous nature of the MDS-UPDRS.

  6. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior

    Science.gov (United States)

    Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia

    2016-01-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…

  7. A Mixture IRT Analysis of Risky Youth Behavior

    Directory of Open Access Journals (Sweden)

    Holmes Finch

    2011-05-01

    Full Text Available The study reported in this manuscript used a mixture item response model with data from the Youth Risk Behavior Survey 2009 (N = 16,410) to identify subtypes of adolescents at-risk for engaging in unhealthy behaviors, and to find individual survey items that were most effective at identifying such students within each subtype. The goal of the manuscript is twofold: (1) to demonstrate the utility of the mixture item response theory model for identifying subgroups in the population and for highlighting the use of group specific item response parameters and (2) to identify typologies of adolescents based on their propensity for engaging in risky sexual and substance use behaviors. Results indicate that 4 classes of youth exist in the population, with differences in risky sexual behaviors and substance use. The first group had a greater propensity to engage in risky sexual behavior, while group 2 was more likely to smoke tobacco and drink alcohol. Group 3 was the most likely to use other substances, such as marijuana, methamphetamine, and other mind altering drugs, and group 4 had the lowest propensity for engaging in any of the sexual or substance use behaviors included in the survey. Finally, individual items were identified for each group that can be most effective at identifying individuals at greatest risk. Further proposed directions of research and the contribution of this analysis to the existing literature are discussed.

  8. Item response analysis on an examination in anesthesiology for medical students in Taiwan: A comparison of one- and two-parameter logistic models

    Directory of Open Access Journals (Sweden)

    Yu-Feng Huang

    2013-06-01

    Conclusion: Item response models are useful for medical test analyses and provide valuable information about model comparisons and identification of differential items other than test reliability, item difficulty, and examinee's ability.
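
    The one- and two-parameter logistic models compared in this record differ only in whether each item has its own discrimination parameter. The following is a minimal sketch of the two item response functions, not the authors' analysis code; the function names and the symbols theta (ability), b (difficulty), and a (discrimination) are assumptions chosen for illustration.

```python
import math


def irf_2pl(theta, b, a=1.0):
    """Two-parameter logistic model: probability of a correct response.

    theta: examinee ability, b: item difficulty, a: item discrimination.
    (Symbol names are illustrative assumptions, not taken from the record.)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """One-parameter logistic model: the 2PL with a common discrimination of 1."""
    return irf_2pl(theta, b, a=1.0)


if __name__ == "__main__":
    # An examinee whose ability equals the item difficulty succeeds half the time.
    print(irf_1pl(0.0, 0.0))
    # A more discriminating item separates examinees above the difficulty more sharply.
    print(irf_1pl(1.0, 0.0), irf_2pl(1.0, 0.0, a=2.0))
```

    In practice the two models are compared on fit to the response data; this sketch only shows what the extra parameter changes in the response curve.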

  9. Parent Ratings of ADHD Symptoms: Generalized Partial Credit Model Analysis of Differential Item Functioning across Gender

    Science.gov (United States)

    Gomez, Rapson

    2012-01-01

    Objective: Generalized partial credit model, which is based on item response theory (IRT), was used to test differential item functioning (DIF) for the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.), inattention (IA), and hyperactivity/impulsivity (HI) symptoms across boys and girls. Method: To accomplish this, parents completed…

  10. An Explanatory Item Response Theory Approach for a Computer-Based Case Simulation Test

    Science.gov (United States)

    Kahraman, Nilüfer

    2014-01-01

    Problem: Practitioners working with multiple-choice tests have long utilized Item Response Theory (IRT) models to evaluate the performance of test items for quality assurance. The use of similar applications for performance tests, however, is often encumbered due to the challenges encountered in working with complicated data sets in which local…

  11. Stochastic order in dichotomous item response models for fixed tests, research adaptive tests, or multiple abilities

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1995-01-01

    Dichotomous item response theory (IRT) models can be viewed as families of stochastically ordered distributions of responses to test items. This paper explores several properties of such distributions. The focus is on the conditions under which stochastic order in families of conditional

  12. A Comparison of Item Exposure Control Procedures with the Generalized Partial Credit Model

    Science.gov (United States)

    Sanchez, Edgar Isaac

    2008-01-01

    To enhance test security of high stakes tests, it is vital to understand the way various exposure control strategies function under various IRT models. To that end the present dissertation focused on the performance of several exposure control strategies under the generalized partial credit model with an item pool of 100 and 200 items. These…

  13. Stepwise Analysis of Differential Item Functioning Based on Multiple-Group Partial Credit Model.

    Science.gov (United States)

    Muraki, Eiji

    1999-01-01

    Extended an Item Response Theory (IRT) method for detection of differential item functioning to the partial credit model and applied the method to simulated data using a stepwise procedure. Then applied the stepwise DIF analysis based on the multiple-group partial credit model to writing trend data from the National Assessment of Educational…

  14. The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.

    Science.gov (United States)

    Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D

    2016-12-01

    The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87% of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5, p …). Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12 .

  15. Mathematical Ability and Socio-Economic Background: IRT Modeling to Estimate Genotype by Environment Interaction.

    Science.gov (United States)

    Schwabe, Inga; Boomsma, Dorret I; van den Berg, Stéphanie M

    2017-12-01

    Genotype by environment interaction in behavioral traits may be assessed by estimating the proportion of variance that is explained by genetic and environmental influences conditional on a measured moderating variable, such as a known environmental exposure. Behavioral traits of interest are often measured by questionnaires and analyzed as sum scores on the items. However, statistical results on genotype by environment interaction based on sum scores can be biased due to the properties of a scale. This article presents a method that makes it possible to analyze the actually observed (phenotypic) item data rather than a sum score by simultaneously estimating the genetic model and an item response theory (IRT) model. In the proposed model, the estimation of genotype by environment interaction is based on an alternative parametrization that is uniquely identified and therefore to be preferred over standard parametrizations. A simulation study shows good performance of our method compared to analyzing sum scores in terms of bias. Next, we analyzed data of 2,110 12-year-old Dutch twin pairs on mathematical ability. Genetic models were evaluated and genetic and environmental variance components estimated as a function of a family's socio-economic status (SES). Results suggested that common environmental influences are less important in creating individual differences in mathematical ability in families with a high SES than in creating individual differences in mathematical ability in twin pairs with a low or average SES.

  16. A multidimensional assessment of the validity and utility of alcohol use disorder severity as determined by item response theory models.

    Science.gov (United States)

    Dawson, Deborah A; Saha, Tulshi D; Grant, Bridget F

    2010-02-01

    The relative severity of the 11 DSM-IV alcohol use disorder (AUD) criteria is represented by their severity threshold scores, an item response theory (IRT) model parameter inversely proportional to their prevalence. These scores can be used to create a continuous severity measure comprising the total number of criteria endorsed, each weighted by its relative severity. This paper assesses the validity of the severity ranking of the 11 criteria and the overall severity score with respect to known AUD correlates, including alcohol consumption, psychological functioning, family history, antisociality, and early initiation of drinking, in a representative population sample of U.S. past-year drinkers (n=26,946). The unadjusted mean values for all validating measures increased steadily with the severity threshold score, except that legal problems, the criterion with the highest score, was associated with lower values than expected. After adjusting for the total number of criteria endorsed, this direct relationship was no longer evident. The overall severity score was no more highly correlated with the validating measures than a simple count of criteria endorsed, nor did the two measures yield different risk curves. This reflects both within-criterion variation in severity and the fact that the number of criteria endorsed and their severity are so highly correlated that severity is essentially redundant. Attempts to formulate a scalar measure of AUD will do as well by relying on simple counts of criteria or symptom items as by using scales weighted by IRT measures of severity.

  17. Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment

    Directory of Open Access Journals (Sweden)

    KLAUS D. KUBINGER

    2007-12-01

    Multiple choice response formats are problematic because an item is often scored as solved simply because the test-taker is a lucky guesser. Instead of applying pertinent IRT models which take guessing effects into account, a pragmatic approach of re-conceptualizing multiple choice response formats to reduce the chance of lucky guessing is considered. This paper compares the free response format with two different multiple choice formats. A common multiple choice format with a single correct response option and five distractors (“1 of 6”) is used, as well as a multiple choice format with five response options, of which any number of the five is correct and the item is only scored as mastered if all the correct response options and none of the wrong ones are marked (“x of 5”). An experiment was designed, using pairs of items with exactly the same content but different response formats. 173 test-takers were randomly assigned to two test booklets of 150 items altogether. Rasch model analyses adduced a fitting item pool, after the deletion of 39 items. The resulting item difficulty parameters were used for the comparison of the different formats. The multiple choice format “1 of 6” differs significantly from “x of 5”, with a relative effect of 1.63, while the multiple choice format “x of 5” does not significantly differ from the free response format. Therefore, the lower degree of difficulty of items with the “1 of 6” multiple choice format is an indicator of relevant guessing effects. In contrast, the “x of 5” multiple choice format can be seen as an appropriate substitute for the free response format.

  18. Data Visualization of Item-Total Correlation by Median Smoothing

    Directory of Open Access Journals (Sweden)

    Chong Ho Yu

    2016-02-01

    This paper aims to illustrate how data visualization could be utilized to identify errors prior to modeling, using an example with multi-dimensional item response theory (MIRT). MIRT combines item response theory and factor analysis to identify a psychometric model that investigates two or more latent traits. While it may seem convenient to accomplish two tasks by employing one procedure, users should be cautious of problematic items that affect both factor analysis and IRT. When sample sizes are extremely large, reliability analyses can misidentify even random numbers as meaningful patterns. Data visualization, such as median smoothing, can be used to identify problematic items in preliminary data cleaning.

  19. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
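
    The unimodality condition stated in this record can be checked numerically. The sketch below assumes a partial credit model with unit discrimination, in which the item information at a given ability equals the conditional variance of the item score; the function names, the ability grid, and the example step parameters are illustrative assumptions rather than anything taken from the paper.

```python
import math


def pcm_probs(theta, deltas):
    """Category probabilities for a partial credit item with step parameters deltas."""
    # Exponent for category k is the sum over j <= k of (theta - delta_j); category 0 gets 0.
    exponents = [0.0]
    for d in deltas:
        exponents.append(exponents[-1] + (theta - d))
    top = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def pcm_information(theta, deltas):
    """Item information, taken here as the variance of the item score given theta."""
    probs = pcm_probs(theta, deltas)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))


def count_modes(deltas, lo=-6.0, hi=6.0, n=1201):
    """Count local maxima of the information function on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    info = [pcm_information(t, deltas) for t in grid]
    return sum(
        1
        for i in range(1, n - 1)
        if info[i] > info[i - 1] and info[i] >= info[i + 1]
    )


if __name__ == "__main__":
    threshold = 4 * math.log(2)  # about 2.77
    for deltas in ([-1.0, 1.0], [-2.0, 2.0]):
        gap = deltas[1] - deltas[0]
        print(gap, gap < threshold, count_modes(deltas))
```

    With a step gap of 2 (below 4 ln 2) the grid search finds one peak, and with a gap of 4 (above it) it finds two, which agrees with the stated condition.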

  20. The Infrared-Optical Telescope (IRT) of the Exist Observatory

    Science.gov (United States)

    Kutyrev, Alexander; Bloom, Joshua; Gehrels, Neil; Golisano, Craig; Gong, Quan; Grindlay, Jonathan; Moseley, Samuel; Woodgate, Bruce

    2010-01-01

    The IRT is a 1.1 m visible and infrared passively cooled telescope, which can locate, identify and obtain spectra of GRB afterglows at redshifts up to z ~ 20. It will also acquire optical-IR imaging and spectroscopy of AGN and transients discovered by EXIST (the Energetic X-ray Imaging Survey Telescope). The IRT imaging and spectroscopic capabilities cover a broad spectral range from 0.3 to 2.2 μm in four bands. The identical fields of view in the four instrument bands are each split in three subfields: imaging, objective prism slitless for the field and objective prism single object slit low resolution spectroscopy, and high resolution long slit on single object. This allows the instrument to do simultaneous broadband photometry or spectroscopy of the same object over the full spectral range, thus greatly improving the efficiency of the observatory and its detection limits. A prompt follow-up (within three minutes) of the transients discovered by EXIST makes the IRT a unique tool for detection and study of these events, which is particularly valuable at wavelengths unavailable to ground-based observatories.

  1. Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

    Science.gov (United States)

    Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

    2018-02-02

    In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.

  2. Construction of a memory battery for computerized administration, using item response theory.

    Science.gov (United States)

    Ferreira, Aristides I; Almeida, Leandro S; Prieto, Gerardo

    2012-10-01

    In accordance with Item Response Theory, a computer memory battery with six tests was constructed for use in the Portuguese adult population. A factor analysis was conducted to assess the internal structure of the tests (N = 547 undergraduate students). According to the literature, several confirmatory factor models were evaluated. Results showed better fit of a model with two independent latent variables corresponding to verbal and non-verbal factors, reproducing the initial battery organization. Internal consistency reliability for the six tests ranged from alpha = .72 to .89. IRT analyses (Rasch and partial credit models) yielded good Infit and Outfit measures and high precision for parameter estimation. The potential utility of these memory tasks for psychological research and practice will be discussed.

  3. Determination of a Differential Item Functioning Procedure Using the Hierarchical Generalized Linear Model

    Directory of Open Access Journals (Sweden)

    Tülin Acar

    2012-01-01

    The aim of this research is to compare the results of differential item functioning (DIF) detection using the hierarchical generalized linear model (HGLM) technique with the results of DIF detection using the logistic regression (LR) and item response theory–likelihood ratio (IRT-LR) techniques on test items. To this end, the study first determined, using the HGLM, LR, and IRT-LR techniques, whether items in the Turkish, Social Sciences, and Science subtests of the Secondary School Institutions Examination exhibit DIF according to students' socioeconomic status (SES). When the correlations among the techniques in terms of the items flagged as having DIF were inspected, a significant correlation was found between the results of the IRT-LR and LR techniques in all subtests; only in the Science subtest was the correlation between the HGLM and IRT-LR results significant. DIF analyses can also be carried out on test items with other DIF techniques that were outside the scope of this research, and the results obtained with DIF techniques in different sample sizes can be compared.

  4. Using item response theory to investigate the structure of anticipated affect: do self-reports about future affective reactions conform to typical or maximal models?

    Science.gov (United States)

    Zampetakis, Leonidas A; Lerakis, Manolis; Kafetsios, Konstantinos; Moustakis, Vassilis

    2015-01-01

    In the present research, we used item response theory (IRT) to examine whether affective predictions (anticipated affect) conform to a typical (i.e., what people usually do) or a maximal behavior process (i.e., what people can do). The former corresponds to non-monotonic ideal point IRT models, whereas the latter corresponds to monotonic dominance IRT models. A cross-sectional convenience sample of students (N = 1624) was used. Participants were asked to report on anticipated positive and negative affect around a hypothetical event (emotions surrounding the start of a new business). We compared the graded response model (GRM), a dominance IRT model, against the generalized graded unfolding model, an unfolding IRT model. We found that the GRM provided a better fit to the data. Findings suggest that the self-report responses to anticipated affect conform to a dominance response process (i.e., maximal behavior). The paper also discusses implications for a growing literature on anticipated affect.
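
    The monotonic behavior that distinguishes a dominance model such as the graded response model can be seen in a short sketch. It assumes the usual logistic form of the GRM with a single discrimination parameter and ordered category thresholds; the function names and parameter values are illustrative assumptions and are not drawn from the study.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a graded response model.

    thresholds must be increasing; P(X >= k) is logistic(a * (theta - b_k)),
    and each category probability is the difference of adjacent cumulative curves.
    """
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


def expected_score(theta, a, thresholds):
    """Expected item score at a given trait level."""
    probs = grm_category_probs(theta, a, thresholds)
    return sum(k * p for k, p in enumerate(probs))


if __name__ == "__main__":
    # Hypothetical 4-category item; the expected score rises steadily with theta.
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, round(expected_score(theta, 1.5, [-1.0, 0.0, 1.0]), 3))
```

    Under an ideal point (unfolding) model the expected score would instead peak where the person's trait level is closest to the item location and fall off on either side.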

  5. Teoria da Resposta ao Item Teoria de la respuesta al item Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is an old one, and many studies and proposed methods have been developed toward this goal. Prominent among them is Item Response Theory (IRT), which initially emerged to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it takes each item into account individually rather than emphasizing total scores; the conclusions therefore depend not only on the test or questionnaire but on each item that composes it. This article presents this theory, which revolutionized measurement theory.

  6. Designing a Sine-Coil for Measurement of Plasma Displacements in IR-T1 Tokamak

    International Nuclear Information System (INIS)

    Khorshid, Pejman; Razavi, M.; Molaii, M.; Ghoranneviss, M.; TalebiTaher, A.; Arvin, R.; Mohammadi, S.; NikMohammadi, A.

    2008-01-01

    A method for the measurement of the plasma position in the IR-T1 tokamak in toroidal coordinates is developed. A sine-coil, which is a Rogowski coil with a variable wiring density, is designed and fabricated for this purpose. An analytic solution of the Biot-Savart law, which is used to calculate magnetic fields created by the toroidal plasma current, is presented. Results of calculations are compared with the experimental data obtained in no-plasma shots with a toroidal current-carrying coil positioned inside the vessel to simulate the plasma movements. The results show good linear behavior of the plasma position measurements. The error is less than 2.5%, and it is compared with other methods of measuring the plasma position. This method will be used in the feedback position control system, and tests of feedback controller parameters are ongoing.

  7. Assessing the Equivalence of Paper, Mobile Phone, and Tablet Survey Responses at a Community Mental Health Center Using Equivalent Halves of a 'Gold-Standard' Depression Item Bank.

    Science.gov (United States)

    Brodey, Benjamin B; Gonzalez, Nicole L; Elkin, Kathryn Ann; Sasiela, W Jordan; Brodey, Inger S

    2017-09-06

    The computerized administration of self-report psychiatric diagnostic and outcomes assessments has risen in popularity. If results are similar enough across different administration modalities, then new administration technologies can be used interchangeably and the choice of technology can be based on other factors, such as convenience in the study design. An assessment based on item response theory (IRT), such as the Patient-Reported Outcomes Measurement Information System (PROMIS) depression item bank, offers new possibilities for assessing the effect of technology choice upon results. The objective was to create equivalent halves of the PROMIS depression item bank and to use these halves to compare survey responses and user satisfaction among administration modalities (paper, mobile phone, or tablet) in a community mental health care population. The 28 PROMIS depression items were divided into 2 halves based on content and simulations with an established PROMIS response data set. A total of 129 participants were recruited from an outpatient public sector mental health clinic based in Memphis. All participants took both nonoverlapping halves of the PROMIS IRT-based depression items (Part A and Part B): once using paper and pencil, and once using either a mobile phone or tablet. An 8-cell randomization was done on technology used, order of technologies used, and order of PROMIS Parts A and B. Both Parts A and B were administered as fixed-length assessments and both were scored using published PROMIS IRT parameters and algorithms. All 129 participants received either Part A or B via paper assessment. Participants were also administered the opposite assessment, 63 using a mobile phone and 66 using a tablet. There was no significant difference in item response scores for Part A versus B. All 3 of the technologies yielded essentially identical assessment results and equivalent satisfaction levels.
Our findings show that the PROMIS depression assessment can be divided into 2 equivalent

  8. Experiments in power distribution control on the IRT-2000 reactor

    International Nuclear Information System (INIS)

    Filipchuk, E.V.; Potapenko, P.T.; Trofimov, A.P.; Kosilov, A.N.; Neboyan, V.T.; Timokhin, E.S.

    1975-01-01

    The results of experimental investigations of a system for regulating the neutron field in the IRT-2000 research reactor are presented. The validity of conducting such experiments on a reactor with a small active zone is substantiated. A successful attempt was made in this work to apply direct-charge primary elements in the neutron field regulating system. A system with independent instrumentally local regulators, a system with hard cross connections, and a structure with a ''floating'' installation are studied. Series-produced general industrial BRT-2 regulators were used.

  9. Evaluating the validity of the Work Role Functioning Questionnaire (Canadian French version) using classical test theory and item response theory.

    Science.gov (United States)

    Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal

    2017-01-01

    The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Still, few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. A four-factor model and a three-factor model were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were in common with CTT. This study tested different models, with fewer problematic items found in the three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.

  10. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    The four-factor structure of the two 8-item short forms of the Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implies a unidimensional structure. This study first assessed the unidimensionality of the 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach to validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and the local dependency (LD) statistic. A graded response model was fitted to the data. The contribution of each item to the scale was assessed by the item information function (IIF). Reliability of the scale was assessed by the test information function (TIF). Differential item functioning (DIF) across gender was identified by the Wald test and expected score functions. Neither CPQ11-14 RSF:8 nor ISF:8 deviated much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30% of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic … items suggesting little contribution of information to the scale, and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p …). Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  11. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    Science.gov (United States)

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items.

  12. Comparison of four IRT models when analyzing two tests for inductive reasoning

    NARCIS (Netherlands)

    de Koning, E.; Sijtsma, K.; Hamers, J.H.M.

    2002-01-01

    This article discusses the use of the nonparametric IRT Mokken models of monotone homogeneity and double monotonicity and the parametric Rasch and Verhelst models for the analysis of binary test data. First, the four IRT models are discussed and compared at the theoretical level, and for each model,

  13. Modeling Composite Assessment Data Using Item Response Theory

    Science.gov (United States)

    Ueckert, Sebastian

    2018-01-01

    Composite assessments aim to combine different aspects of a disease in a single score and are utilized in a variety of therapeutic areas. The data arising from these evaluations are inherently discrete with distinct statistical properties. This tutorial presents the framework of the item response theory (IRT) for the analysis of this data type in a pharmacometric context. The article considers both conceptual (terms and assumptions) and practical questions (modeling software, data requirements, and model building). PMID:29493119

  14. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    Science.gov (United States)

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
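
    The two-step decision rule described in this record can be sketched as a small function that takes the outputs of the already fitted nested models. This is only an illustration of the logic as stated in the abstract, not the DIFdetect or difwithpar code: the function names, the significance level, the 10% cutoff for a "marked effect", and the choice of the model without the group term as the baseline for the relative change are all assumptions.

```python
def relative_change(beta_without_group, beta_with_group):
    """Proportional change in the ability coefficient when the group term is added."""
    return abs(beta_with_group - beta_without_group) / abs(beta_without_group)


def classify_dif(interaction_p, beta_without_group, beta_with_group,
                 alpha=0.05, change_cutoff=0.10):
    """Classify an item as showing nonuniform DIF, uniform DIF, or no DIF.

    interaction_p: p-value of the ability-by-group interaction term.
    beta_without_group, beta_with_group: ability coefficients from the models
    fitted without and with the group indicator.
    alpha and change_cutoff are hypothetical values chosen for this sketch.
    """
    # Step 1: a significant ability-by-group interaction is consistent with nonuniform DIF.
    if interaction_p < alpha:
        return "nonuniform"
    # Step 2: a marked shift in the ability coefficient is taken as uniform DIF.
    if relative_change(beta_without_group, beta_with_group) > change_cutoff:
        return "uniform"
    return "none"


if __name__ == "__main__":
    print(classify_dif(0.01, 1.00, 1.02))
    print(classify_dif(0.40, 1.00, 1.20))
    print(classify_dif(0.40, 1.00, 1.05))
```

    As the abstract notes, the specific criteria used to declare DIF matter as much as the technique, so the cutoffs here should be treated as placeholders to be set by the analyst.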

  15. Comparison of the parameters of the IR-8 reactor with different fuel assembly designs with LEU fuel

    International Nuclear Information System (INIS)

    Vatulin, A.; Stetsky, Y.; Dobrikova, I.

    1999-01-01

    The estimation of neutron-physical, heat and hydraulic parameters of the IR-8 research reactor with low enriched uranium (LEU) fuel was performed. Two fuel assembly (FA) designs were reviewed: IRT-4M with the tubular type fuel elements and IRT-MR with the rod type fuel elements. UO2-Al dispersion fuel with 19.75% enrichment is used in both cases. The results of the calculations were compared with the main parameters of the reactor using the current IRT-3M FA with 90% high enriched uranium (HEU) fuel. The results of these comparisons showed that during the LEU conversion of the reactor the cycle length, excess reactivity and peak power of the IRT-MR type FA are higher than for the IRT-3M type FA and IRT-4M type FA. (author)

  16. Caffeine use disorder: An item-response theory analysis of proposed DSM-5 criteria.

    Science.gov (United States)

    Ágoston, Csilla; Urbán, Róbert; Richman, Mara J; Demetrovics, Zsolt

    2018-06-01

    Caffeine is a common psychoactive substance with a documented addictive potential. Caffeine withdrawal has been included in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), but caffeine use disorder (CUD) is considered to be a condition for further study. The aim of the current study is (1) to test the psychometric properties of the Caffeine Use Disorder Questionnaire (CUDQ) by using a confirmatory factor analysis and an item response theory (IRT) approach, (2) to compare IRT models with varying numbers of parameters and models with or without caffeine consumption criteria, and (3) to examine if the total daily caffeine consumption and the use of different caffeinated products can predict the magnitude of CUD symptomatology. A cross-sectional study was conducted on an adult sample (N = 2259). Participants answered several questions regarding their caffeine consumption habits and completed the CUDQ, which incorporates the nine proposed criteria of the DSM-5 as well as one additional item regarding the suffering caused by the symptoms. Factor analyses demonstrated the unidimensionality of the CUDQ. The suffering criterion had the highest discriminative value at a higher degree of latent trait. The criteria of failure to fulfill obligations and social/interpersonal problems discriminated only at higher values of the CUD latent factor, while the criteria of consuming more caffeine or for longer than intended and of craving were discriminative at a lower level of CUD. Total daily caffeine intake was related to a higher level of CUD. Daily coffee, energy drink, and cola intake as dummy variables were associated with the presence of more CUD symptoms, while daily tea consumption as a dummy variable was related to fewer CUD symptoms. Regular smoking was associated with more CUD symptoms, which was explained by a larger caffeine consumption. The IRT approach helped to determine which CUD symptoms indicate more severity and have a greater

  17. Negative affectivity in cardiovascular disease: Evaluating Type D personality assessment using item response theory

    NARCIS (Netherlands)

    Emons, Wilco H.M.; Meijer, R.R.; Denollet, Johan

    2007-01-01

    Objective: Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI)—referred to as type-D personality—are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The

  18. Comparison of examination grades using item response theory : a case study

    NARCIS (Netherlands)

    Korobko, O.B.

    2007-01-01

    In item response theory (IRT), mathematical models are applied to analyze data from tests and questionnaires used to measure abilities, proficiency, personality traits and attitudes. This thesis is concerned with comparison of subjects, students and schools based on average examination grades using

  19. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Gerardus J.A.; Glas, Cornelis A.W.

    2000-01-01

    This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in the assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved

  20. A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.

    Science.gov (United States)

    Glas, Cees A. W.; Meijer, Rob R.

    A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…

  1. Application of Item Response Theory to Modeling of Expanded Disability Status Scale in Multiple Sclerosis.

    NARCIS (Netherlands)

    Novakovic, A.M.; Krekels, E.H.; Munafo, A.; Ueckert, S.; Karlsson, M.O.

    2016-01-01

    In this study, we report the development of the first item response theory (IRT) model within a pharmacometrics framework to characterize the disease progression in multiple sclerosis (MS), as measured by Expanded Disability Status Score (EDSS). Data were collected quarterly from a 96-week phase III

  2. Mokken scale analysis : Between the Guttman scale and parametric item response theory

    NARCIS (Netherlands)

    van Schuur, Wijbrandt H.

    2003-01-01

    This article introduces a model of ordinal unidimensional measurement known as Mokken scale analysis. Mokken scaling is based on principles of Item Response Theory (IRT) that originated in the Guttman scale. I compare the Mokken model with both Classical Test Theory (reliability or factor analysis)

  3. Application of item response theory to achieve cross-cultural comparability of occupational stress measurement

    NARCIS (Netherlands)

    Tsutsumi, A.; Iwata, N.; Watanabe, N.; Jonge, de J.; Pikhart, H.; Férnandez-López, J.A.; Xu, Liying; Peter, R.; Knutsson, A.; Niedhammer, I.; Kawakami, N.; Siegrist, J.

    2009-01-01

    Our objective was to examine cross-cultural comparability of standard scales of the Effort-Reward Imbalance occupational stress scales by item response theory (IRT) analyses. Data were from 20,256 Japanese employees, 1464 Dutch nurses and nurses' aides, 2128 representative employees from

  4. General mixture item response models with different item response structures: Exposition with an application to Likert scales.

    Science.gov (United States)

    Tijmstra, Jesper; Bolsinova, Maria; Jeon, Minjeong

    2018-01-10

    This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person's responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category ("neither agree nor disagree") is taken to be qualitatively similar to the other categories, and is taken to provide information about the person's endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person's willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.

  5. Lawton IADL scale in dementia: can item response theory make it more informative?

    Science.gov (United States)

    McGrory, Sarah; Shenkin, Susan D; Austin, Elizabeth J; Starr, John M

    2014-07-01

    Impairment of functional abilities represents a crucial component of dementia diagnosis. Current functional measures rely on the traditional aggregate method of summing raw scores. While this summary score provides a quick representation of a person's ability, it disregards useful information on the item level. The objective was to use item response theory (IRT) methods to increase the interpretive power of the Lawton Instrumental Activities of Daily Living (IADL) scale by establishing a hierarchy of item 'difficulty' and 'discrimination'. This cross-sectional study applied IRT methods to the analysis of IADL outcomes. Participants were 202 members of the Scottish Dementia Research Interest Register (mean age = 76.39, range = 56-93, SD = 7.89 years) with complete itemised data available. A Mokken scale with good reliability (Molenaar Sijtsma statistic 0.79) was obtained, satisfying the IRT assumption that the items comprise a single unidimensional scale. The eight items in the scale could be placed on a hierarchy of 'difficulty' (H coefficient = 0.55), with 'Shopping' being the most 'difficult' item and 'Telephone use' being the least 'difficult' item. 'Shopping' was the most discriminatory item, differentiating well between patients of different levels of ability. IRT methods are capable of providing more information about functional impairment than a summed score. 'Shopping' and 'Telephone use' were identified as items that reveal key information about a patient's level of ability, and could be useful screening questions for clinicians.

  6. Item response theory in the production of indicators of socioeconomic metropolitan region of Maringá, Paraná State, Brazil - doi: 10.4025/actascitechnol.v34i4.10478

    Directory of Open Access Journals (Sweden)

    Vanessa Rufino da Silva

    2012-10-01

    This study aimed to identify and produce, through models of Item Response Theory (IRT), a socioeconomic indicator based on the items observed in the 2000 Census, following the methodology of Soares (2005). In the IRT methodology, this indicator, as a latent variable, is obtained through the construction of specific models and scales that make it possible to measure the variable; according to Andrade et al. (2000), IRT analyzes each item that composes the measuring instrument. This case consists of binary or dichotomous items, which assess the possession of certain assets of domestic comfort. The characteristics of each item were analyzed, such as its discrimination ability and the income necessary for the possession of a given asset. It was concluded that, with 13 items, a trustworthy questionnaire can be built for the construction of a socioeconomic index of Maringá's metropolitan region.

  7. Magnetic evaluation of hydrogen pressures changes on MHD fluctuations in IR-T1 tokamak plasma

    Science.gov (United States)

    Alipour, Ramin; Ghanbari, Mohamad R.

    2018-04-01

    Identifying tokamak plasma parameters and investigating the effect of each parameter on the plasma characteristics is important for a better understanding of magnetohydrodynamic (MHD) activities in the tokamak plasma. The effect of different hydrogen pressures of 1.9, 2.5 and 2.9 Torr on MHD fluctuations of the IR-T1 tokamak plasma was investigated using 12 Mirnov coils, singular value decomposition and wavelet analysis. Parameters such as plasma current, loop voltage, power spectrum density, energy percent of poloidal modes, dominant spatial structures and temporal structures of poloidal modes at different plasma pressures are plotted. The results indicate that the MHD activities at the pressure of 2.5 Torr are lower than at the other pressures. It has also been shown that in the stable area of plasma and at the pressure of 2.5 Torr, the magnetic force and the force of plasma pressure are in balance with each other and the MHD activities are at their lowest level.

  8. Sequential Objective Structured Clinical Examination based on item response theory in Iran

    Directory of Open Access Journals (Sweden)

    Sara Mortaz Hejri

    2017-09-01

    Purpose: In a sequential objective structured clinical examination (OSCE), all students initially take a short screening OSCE. Examinees who pass are excused from further testing, but an additional OSCE is administered to the remaining examinees. Previous investigations of sequential OSCE were based on classical test theory. We aimed to design and evaluate screening OSCEs based on item response theory (IRT). Methods: We carried out a retrospective observational study. At each station of a 10-station OSCE, the students' performance was graded on a Likert-type scale. Since the data were polytomous, the difficulty parameters, discrimination parameters, and students' ability were calculated using a graded response model. To design several screening OSCEs, we identified the 5 most difficult stations and the 5 most discriminative ones. For each test, 5, 4, or 3 stations were selected. Normal and stringent cut-scores were defined for each test. We compared the results of each of the 12 screening OSCEs to the main OSCE and calculated the positive and negative predictive values (PPV and NPV), as well as the exam cost. Results: A total of 253 students (95.1%) passed the main OSCE, while 72.6% to 94.4% of examinees passed the screening tests. The PPV values ranged from 0.98 to 1.00, and the NPV values ranged from 0.18 to 0.59. Two tests effectively predicted the results of the main exam, resulting in financial savings of 34% to 40%. Conclusion: If stations with the highest IRT-based discrimination values and stringent cut-scores are utilized in the screening test, sequential OSCE can be an efficient and convenient way to conduct an OSCE.

  9. Sequential Objective Structured Clinical Examination based on item response theory in Iran.

    Science.gov (United States)

    Hejri, Sara Mortaz; Jalili, Mohammad

    2017-01-01

    In a sequential objective structured clinical examination (OSCE), all students initially take a short screening OSCE. Examinees who pass are excused from further testing, but an additional OSCE is administered to the remaining examinees. Previous investigations of sequential OSCE were based on classical test theory. We aimed to design and evaluate screening OSCEs based on item response theory (IRT). We carried out a retrospective observational study. At each station of a 10-station OSCE, the students' performance was graded on a Likert-type scale. Since the data were polytomous, the difficulty parameters, discrimination parameters, and students' ability were calculated using a graded response model. To design several screening OSCEs, we identified the 5 most difficult stations and the 5 most discriminative ones. For each test, 5, 4, or 3 stations were selected. Normal and stringent cut-scores were defined for each test. We compared the results of each of the 12 screening OSCEs to the main OSCE and calculated the positive and negative predictive values (PPV and NPV), as well as the exam cost. A total of 253 students (95.1%) passed the main OSCE, while 72.6% to 94.4% of examinees passed the screening tests. The PPV values ranged from 0.98 to 1.00, and the NPV values ranged from 0.18 to 0.59. Two tests effectively predicted the results of the main exam, resulting in financial savings of 34% to 40%. If stations with the highest IRT-based discrimination values and stringent cut-scores are utilized in the screening test, sequential OSCE can be an efficient and convenient way to conduct an OSCE.
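    The positive and negative predictive values reported above can be computed from a 2x2 cross-classification of screening and main-exam outcomes. The sketch below assumes that "positive" means passing the screening OSCE and that the main OSCE is the reference; the function name and the counts are hypothetical and are not taken from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) for a screening test judged against a reference test.

    tp -- passed screening and passed the main exam
    fp -- passed screening but failed the main exam
    tn -- failed screening and failed the main exam
    fn -- failed screening but passed the main exam
    """
    # PPV: share of screening passes that are confirmed by the main exam.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: share of screening failures that are confirmed by the main exam.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical counts, for illustration only
ppv, npv = predictive_values(tp=200, fp=2, tn=10, fn=40)
```

    When nearly all examinees pass the reference exam, as here, most screening failures still pass the main exam, which is why a low NPV can coexist with a PPV close to 1.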

  10. Item response theory scoring and the detection of curvilinear relationships.

    Science.gov (United States)

    Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

    2017-03-01

    Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only to items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity.

  11. A new Integrated Negative Symptom structure of the Positive and Negative Syndrome Scale (PANSS) in schizophrenia using item response analysis.

    Science.gov (United States)

    Khan, Anzalee; Lindenmayer, Jean-Pierre; Opler, Mark; Yavorsky, Christian; Rothman, Brian; Lucic, Luka

    2013-10-01

    Debate persists with regard to how best to categorize the syndromal dimension of negative symptoms in schizophrenia. The aim was first to review published Principal Components Analyses (PCA) of the PANSS and extract the items most frequently included in the negative domain, and second to examine the quality of items using Item Response Theory (IRT) to select items that best represent a measurable dimension (or dimensions) of negative symptoms. First, 22 factor analyses and PCAs were included. Second, using a large dataset (n=7187) of participants in clinical trials with chronic schizophrenia, we extracted items loading on one or more PCA. Third, items not loading with a value of ≥ 0.5, or loading on more than one component with values of ≥ 0.5, were discarded. Fourth, resulting items were included in a non-parametric IRT and retained based on Option Characteristic Curves (OCCs) and Item Characteristic Curves (ICCs). Fifteen items loaded on a negative domain in at least one study, with Emotional Withdrawal loading in all studies. Non-parametric IRT retained nine items as an Integrated Negative Factor: Emotional Withdrawal, Blunted Affect, Passive/Apathetic Social Withdrawal, Poor Rapport, Lack of Spontaneity/Conversation Flow, Active Social Avoidance, Disturbance of Volition, Stereotyped Thinking and Difficulty in Abstract Thinking. This is the first study to use a psychometric IRT process to arrive at a set of negative symptom items. Future steps will include further examination of these nine items in terms of their stability, sensitivity to change, and correlations with functional and cognitive outcomes.

  12. Alzheimer's Disease Assessment: A Review and Illustrations Focusing on Item Response Theory Techniques.

    Science.gov (United States)

    Balsis, Steve; Choudhury, Tabina K; Geraci, Lisa; Benge, Jared F; Patrick, Christopher J

    2018-04-01

    Alzheimer's disease (AD) affects neurological, cognitive, and behavioral processes. Thus, to accurately assess this disease, researchers and clinicians need to combine and incorporate data across these domains. This presents not only distinct methodological and statistical challenges but also unique opportunities for the development and advancement of psychometric techniques. In this article, we describe relatively recent research using item response theory (IRT) that has been used to make progress in assessing the disease across its various symptomatic and pathological manifestations. We focus on applications of IRT to improve scoring, test development (including cross-validation and adaptation), and linking and calibration. We conclude by describing potential future multidimensional applications of IRT techniques that may improve the precision with which AD is measured.

  13. Decommissioning of the research nuclear reactor IRT-M and problems connected with radioactive waste

    International Nuclear Information System (INIS)

    Abramidze, S.P.; Katamadze, N.M.; Kiknadze, G.G.; Saralidze, Z.K.

    2000-01-01

    The nuclear research reactor IRT-2000 is described, along with modifications and upgrades made over the past three decades. Considerations are outlined which followed a decision to shut down the reactor and to dismantle it. (author)

  14. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    Science.gov (United States)

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures.

  15. A new network of faint calibration stars from the near infrared spectrometer (NIRS) on the IRTS

    Science.gov (United States)

    Freund, Minoru M.; Matsuura, Mikako; Murakami, Hiroshi; Cohen, Martin; Noda, Manabu; Matsuura, Shuji; Matsumoto, Toshio

    1997-01-01

    The point source extraction and calibration of the near infrared spectrometer (NIRS) onboard the Infrared Telescope in Space (IRTS) is described. About 7 percent of the sky was observed during a one month mission in the range of 1.4 micrometers to 4 micrometers. The accuracy of the spectral shape and absolute values of calibration stars provided by the NIRS/IRTS were validated.

  16. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    Science.gov (United States)

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  17. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    Science.gov (United States)

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  18. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

    Science.gov (United States)

    Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

    2014-01-01

    Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
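    The idea of selecting "the best items" given a provisional ability estimate can be illustrated with a maximum-information rule. This is a simplified sketch and not the PROMIS algorithm: it assumes dichotomous items under a two-parameter logistic (2PL) model, whereas real item banks may use polytomous models, and the item names and parameter values are invented for the example.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model
    (a = discrimination, b = difficulty/location)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta_hat, bank, administered):
    """Pick the not-yet-administered item that is most informative
    at the current ability estimate theta_hat."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta_hat, *bank[name]))


# Hypothetical item bank: name -> (discrimination, difficulty)
bank = {
    "walk_block": (1.8, -1.0),
    "climb_stairs": (1.2, 0.0),
    "run_mile": (2.0, 1.5),
}

first_pick = next_item(theta_hat=-1.0, bank=bank, administered=set())
second_pick = next_item(theta_hat=-1.0, bank=bank, administered={"walk_block"})
```

    In a full CAT loop the ability estimate would be updated after each response before the next selection; that step is omitted here to keep the sketch short.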

  19. Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

    Science.gov (United States)

    Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank. A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.
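    For readers unfamiliar with the "slopes and thresholds" of a Graded Response Model, the sketch below shows how one item's slope and ordered thresholds translate into response-category probabilities under the standard logistic form of the model. The function name and the parameter values are hypothetical and are not estimates from the SCI-QOL item bank.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    thresholds must be strictly increasing; an item with K ordered response
    categories has K - 1 thresholds.
    """
    # Cumulative probabilities P(X >= k), bracketed by 1 (lowest category
    # or higher) and 0 (beyond the highest category).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 4-category item evaluated at theta = 0
probs = grm_category_probs(theta=0.0, slope=1.5, thresholds=[-1.0, 0.0, 1.0])
```

    A larger slope makes the category probabilities change more sharply with the latent trait, and the thresholds mark where each cumulative curve crosses 0.5.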

  20. Item response theory analysis of the Lichtenberg Financial Decision Screening Scale.

    Science.gov (United States)

    Teresi, Jeanne A; Ocepek-Welikson, Katja; Lichtenberg, Peter A

    2017-01-01

    The focus of these analyses was to examine the psychometric properties of the Lichtenberg Financial Decision Screening Scale (LFDSS). The purpose of the screen was to evaluate the decisional abilities and vulnerability to exploitation of older adults. Adults aged 60 and over were interviewed by social, legal, financial, or health services professionals who underwent in-person training on the administration and scoring of the scale. Professionals provided a rating of the decision-making abilities of the older adult. The analytic sample included 213 individuals with an average age of 76.9 (SD = 10.1). The majority (57%) were female. Data were analyzed using item response theory (IRT) methodology. The results supported the unidimensionality of the item set. Several IRT models were tested. Ten ordinal and binary items evidenced a slightly higher reliability estimate (0.85) than other versions and better coverage in terms of the range of reliable measurement across the continuum of financial incapacity.

  1. Recommendations to improve the positive and negative syndrome scale (PANSS) based on item response theory.

    Science.gov (United States)

    Levine, Stephen Z; Rabinowitz, Jonathan; Rizopoulos, Dimitris

    2011-08-15

    The adequacy of the Positive and Negative Syndrome Scale (PANSS) items in measuring symptom severity in schizophrenia was examined using Item Response Theory (IRT). Baseline PANSS assessments were analyzed from two multi-center clinical trials of antipsychotic medication in chronic schizophrenia (n=1872). Generally, the results showed that the PANSS (a) item ratings discriminated symptom severity best for the negative symptoms; (b) has an excess of "Severe" and "Extremely severe" rating options; and (c) assessments are more reliable at medium than very low or high levels of symptom severity. Analysis also showed that the detection of statistically and non-statistically significant differences in treatment were highly similar for the original and IRT-modified PANSS. In clinical trials of chronic schizophrenia, the PANSS appears to require the following modifications: fewer rating options, adjustment of 'Lack of judgment and insight', and improved severe symptom assessment. 2011 Elsevier Ltd. All rights reserved.

  2. Use of NON-PARAMETRIC Item Response Theory to develop a shortened version of the Positive and Negative Syndrome Scale (PANSS)

    Science.gov (United States)

    2011-01-01

    Background: Nonparametric item response theory (IRT) was used to examine (a) the performance of the 30 Positive and Negative Syndrome Scale (PANSS) items and their options (levels of severity), (b) the effectiveness of various subscales to discriminate among differences in symptom severity, and (c) the development of an abbreviated PANSS (Mini-PANSS) based on IRT and a method to link scores to the original PANSS. Methods: Baseline PANSS scores from 7,187 patients with schizophrenia or schizoaffective disorder who were enrolled between 1995 and 2005 in psychopharmacology trials were obtained. Option characteristic curves (OCCs) and item characteristic curves (ICCs) were constructed to examine the probability of rating each of seven options within each of 30 PANSS items as a function of subscale severity, and summed-score linking was applied to items selected for the Mini-PANSS. Results: The majority of items forming the Positive and Negative subscales (i.e. 19 items) performed very well and discriminated better along symptom severity compared to the General Psychopathology subscale. Six of the seven Positive Symptom items, six of the seven Negative Symptom items, and seven out of the 16 General Psychopathology items were retained for inclusion in the Mini-PANSS. Summed-score linking and linear interpolation were able to produce a translation table for comparing total subscale scores of the Mini-PANSS to total subscale scores on the original PANSS. Results show scores on the subscales of the Mini-PANSS can be linked to scores on the original PANSS subscales, with very little bias. Conclusions: The study demonstrated the utility of non-parametric IRT in examining the item properties of the PANSS and in allowing selection of items for an abbreviated PANSS scale. The comparisons between the 30-item PANSS and the Mini-PANSS revealed that the shorter version is comparable to the 30-item PANSS, but when applying IRT, the Mini-PANSS is also a good indicator of illness severity.

  3. Use of non-parametric item response theory to develop a shortened version of the Positive and Negative Syndrome Scale (PANSS).

    Science.gov (United States)

    Khan, Anzalee; Lewis, Charles; Lindenmayer, Jean-Pierre

    2011-11-16

    Nonparametric item response theory (IRT) was used to examine (a) the performance of the 30 Positive and Negative Syndrome Scale (PANSS) items and their options (levels of severity), (b) the effectiveness of various subscales to discriminate among differences in symptom severity, and (c) the development of an abbreviated PANSS (Mini-PANSS) based on IRT and a method to link scores to the original PANSS. Baseline PANSS scores from 7,187 patients with schizophrenia or schizoaffective disorder who were enrolled between 1995 and 2005 in psychopharmacology trials were obtained. Option characteristic curves (OCCs) and item characteristic curves (ICCs) were constructed to examine the probability of rating each of seven options within each of 30 PANSS items as a function of subscale severity, and summed-score linking was applied to items selected for the Mini-PANSS. The majority of items forming the Positive and Negative subscales (i.e. 19 items) performed very well and discriminated better along symptom severity compared to the General Psychopathology subscale. Six of the seven Positive Symptom items, six of the seven Negative Symptom items, and seven out of the 16 General Psychopathology items were retained for inclusion in the Mini-PANSS. Summed-score linking and linear interpolation were able to produce a translation table for comparing total subscale scores of the Mini-PANSS to total subscale scores on the original PANSS. Results show scores on the subscales of the Mini-PANSS can be linked to scores on the original PANSS subscales, with very little bias. The study demonstrated the utility of non-parametric IRT in examining the item properties of the PANSS and in allowing selection of items for an abbreviated PANSS scale. The comparisons between the 30-item PANSS and the Mini-PANSS revealed that the shorter version is comparable to the 30-item PANSS, but when applying IRT, the Mini-PANSS is also a good indicator of illness severity.

  4. Bayesian Analysis of Multidimensional Item Response Theory Models: A Discussion and Illustration of Three Response Style Models

    Science.gov (United States)

    Leventhal, Brian C.; Stone, Clement A.

    2018-01-01

    Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…

  5. Item Response Theory Analyses of the Parent and Teacher Ratings of the DSM-IV ADHD Rating Scale

    Science.gov (United States)

    Gomez, Rapson

    2008-01-01

    The graded response model (GRM), which is based on item response theory (IRT), was used to evaluate the psychometric properties of the inattention and hyperactivity/impulsivity symptoms in an ADHD rating scale. To accomplish this, parents and teachers completed the DSM-IV ADHD Rating Scale (DARS; Gomez et al., "Journal of Child Psychology and…

  6. Negative affectivity and social inhibition in cardiovascular disease: evaluating type-D personality and its assessment using item response theory.

    Science.gov (United States)

    Emons, Wilco H M; Meijer, Rob R; Denollet, Johan

    2007-07-01

    Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI), referred to as type-D personality, are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The objectives of this study were (a) to evaluate the relative contribution of individual items to the measurement precision at the cutoff to distinguish type-D from non-type-D personality and (b) to investigate the comparability of NA, SI, and type-D constructs across the general population and clinical populations. Data from representative samples including 1316 respondents from the general population, 427 respondents diagnosed with coronary heart disease, and 732 persons suffering from hypertension were analyzed using the graded response IRT model. In Study 1, the information functions obtained in the IRT analysis showed that (a) all items had highest measurement precision around the cutoff and (b) items are most informative at the higher end of the scale. In Study 2, the IRT analysis showed that measurements were fairly comparable across the general population and clinical populations. The DS14 adequately measures NA and SI, with highest reliability in the trait range around the cutoff. The DS14 is a valid instrument to assess and compare type-D personality across clinical groups.

  7. Adult Attachment Ratings (AAR): an item response theory analysis.

    Science.gov (United States)

    Pilkonis, Paul A; Kim, Yookyung; Yu, Lan; Morse, Jennifer Q

    2014-01-01

    The Adult Attachment Ratings (AAR) include 3 scales for anxious, ambivalent attachment (excessive dependency, interpersonal ambivalence, and compulsive care-giving), 3 for avoidant attachment (rigid self-control, defensive separation, and emotional detachment), and 1 for secure attachment. The scales include items (ranging from 6-16 in their original form) scored by raters using a 3-point format (0 = absent, 1 = present, and 2 = strongly present) and summed to produce a total score. Item response theory (IRT) analyses were conducted with data from 414 participants recruited from psychiatric outpatient, medical, and community settings to identify the most informative items from each scale. The IRT results allowed us to shorten the scales to 5-item versions that are more precise and easier to rate because of their brevity. In general, the effective range of measurement for the scales was 0 to +2 SDs for each of the attachment constructs; that is, from average to high levels of attachment problems. Evidence for convergent and discriminant validity of the scales was investigated by comparing them with the Experiences of Close Relationships-Revised (ECR-R) scale and the Kobak Attachment Q-sort. The best consensus among self-reports on the ECR-R, informant ratings on the ECR-R, and expert judgments on the Q-sort and the AAR emerged for anxious, ambivalent attachment. Given the good psychometric characteristics of the scale for secure attachment, however, this measure alone might provide a simple alternative to more elaborate procedures for some measurement purposes. Conversion tables are provided for the 7 scales to facilitate transformation from raw scores to IRT-calibrated (theta) scores.

  8. Using item response theory to address vulnerabilities in FFQ.

    Science.gov (United States)

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  9. Quality of life in the Danish general population--normative data and validity of WHOQOL-BREF using Rasch and item response theory models

    DEFF Research Database (Denmark)

    Noerholm, V; Groenvold, M; Watt, T

    2004-01-01

    BACKGROUND: The main objective of this study was to investigate the construct validity of the WHOQOL-BREF by use of Rasch and Item Response Theory models and to examine the stability of the model across high/low scoring individuals, gender, education, and depressive illness. Furthermore......, the objective of the study was to estimate the reference data for the quality of life questionnaire WHOQOL-BREF in the general Danish population and in subgroups defined by age, gender, and education. METHODS: Mail-out-mail-back questionnaires were sent to a randomly selected sample of the Danish general...... population. The response rate was 68.5%, and the sample reported here contained 1101 respondents: 578 women and 519 men (four respondents did not indicate their genders). RESULTS: Each of the four domains of the WHOQOL-BREF scale fitted a two-parameter IRT model, but did not fit the Rasch model. Due...

  10. Clinical Effect of IRT-5 Probiotics on Immune Modulation of Autoimmunity or Alloimmunity in the Eye

    Directory of Open Access Journals (Sweden)

    Jaeyoung Kim

    2017-10-01

    Full Text Available Background: Although the relation of the gut microbiota to the development of autoimmune and inflammatory diseases has been investigated in various animal models, there are limited studies that evaluate the effect of probiotics in autoimmune eye disease. Therefore, we aimed to investigate the effect of IRT-5 probiotics consisting of Lactobacillus casei, Lactobacillus acidophilus, Lactobacillus reuteri, Bifidobacterium bifidum, and Streptococcus thermophilus on the autoimmunity of uveitis and dry eye and alloimmunity of corneal transplantation. Methods: Experimental autoimmune uveitis was induced by subcutaneous immunization with interphotoreceptor-binding protein and intraperitoneal injection of pertussis toxin in C57BL/6 (B6) mice. For an autoimmune dry eye model, 12-week-old NOD.B10.H2b mice were used. Donor cornea of B6 mice was transplanted into BALB/C mice. IRT-5 probiotics or phosphate buffered saline (PBS) were administered for three weeks immediately after induction of uveitis or transplantation. The inflammation score of the retinal tissues, dry eye manifestations (corneal staining and tear secretion), and graft survival were measured in each model. The changes of T cells were evaluated in drainage lymph nodes using fluorescence-activated cell sorting. Results: Retinal histology score in the IRT-5 group of uveitis was lower than that in the PBS group (p = 0.045). Ocular staining score was lower (p < 0.0001) and tear secretion was higher (p < 0.0001) in the IRT-5 group of NOD.B10.H2b mice than that in the PBS group. However, the graft survival in the IRT-5 group was not different from that of the PBS group. The percentage of regulatory T cells was increased in the IRT-5-treated dry eye models (p = 0.032). The percentage of CD8+IL-17hi (p = 0.027) and CD8+ interferon gamma (IFNγ)hi cells (p = 0.022) were significantly decreased in the IRT-5-treated uveitis models and the percentage of CD8+IFNγhi cells was markedly reduced (p = 0.036) in IRT-5-treated dry

  11. An item response theory analysis of the Olweus Bullying scale.

    Science.gov (United States)

    Breivik, Kyrre; Olweus, Dan

    2014-12-02

    In the present article, we used IRT (graded response) modeling as a useful technology for a detailed and refined study of the psychometric properties of the various items of the Olweus Bullying scale and the scale itself. The sample consisted of a very large number of Norwegian 4th-10th grade students (n = 48 926). The IRT analyses revealed that the scale was essentially unidimensional and had excellent reliability in the upper ranges of the latent bullying tendency trait, as intended and desired. Gender DIF effects were identified with regard to girls' use of indirect bullying by social exclusion and boys' use of physical bullying by hitting and kicking but these effects were small and worked in opposite directions, having negligible effects at the scale level. Also scale scores adjusted for DIF effects differed very little from non-adjusted scores. In conclusion, the empirical data were well characterized by the chosen IRT model and the Olweus Bullying scale was considered well suited for the conduct of fair and reliable comparisons involving different gender-age groups. Information Aggr. Behav. 9999:XX-XX, 2014. © 2014 Wiley Periodicals, Inc.

  12. Preliminary Analysis on the Management Options of IRT-DPRK Research Reactor

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jung-Hyun; Kim, Minsoo; Hwang, Yongsoo [Korea Institute of Nuclear Nonproliferation and Control, Daejeon (Korea, Republic of)

    2015-05-15

    Although IRT-DPRK has been upgraded several times, its operational lifetime is already exhausted, and a management policy is therefore needed to deal with its aging. For example, IRT-2000 type nuclear reactors in Georgia and Bulgaria have been shut down for refurbishment or decommissioned to establish new low-power facilities. However, the existing negotiations and agreements on North Korean nuclear issues have focused on 'denuclearization', and thus the issues concerning IRT-DPRK have not been addressed. Recently, a group of US scientists suggested that IRT-DPRK be refurbished to establish a 'Scientific center for excellence', like the Cooperative Threat Reduction program applied in Russia and the former Soviet Union (FSU). In this paper, we examine several options for managing IRT-DPRK through the study of similar foreign cases. Due to the lack of detailed and standardized information, it is impossible to suggest the best option at this moment. To do so, further research on the detailed procedures, radioactive wastes, and standards of safety and security is needed.

  13. Software Note: Using BILOG for Fixed-Anchor Item Calibration

    Science.gov (United States)

    DeMars, Christine E.; Jurich, Daniel P.

    2012-01-01

    The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…

  14. Item response theory analysis to evaluate reliability and minimal clinically important change of the Roland-Morris Disability Questionnaire in patients with severe disability due to back pain from vertebral compression fractures.

    Science.gov (United States)

    Lee, Minji K; Yost, Kathleen J; McDonald, Jennifer S; Dougherty, Ryne W; Vine, Roanna L; Kallmes, David F

    2017-06-01

    The majority of validation done on the Roland-Morris Disability Questionnaire (RMDQ) has been in patients with mild or moderate disability. There is a paucity of research focusing on the psychometric quality of the RMDQ in patients with severe disability. To evaluate the psychometric quality of the RMDQ in patients with severe disability. Observational clinical study. The sample consisted of 214 patients with painful vertebral compression fractures who underwent vertebroplasty or kyphoplasty. The 23-item version of the RMDQ was completed at two time points: baseline and 30-day postintervention follow-up. With the two-parameter logistic unidimensional item response theory (IRT) analyses, we derived the range of scores that produced reliable measurement and investigated the minimal clinically important difference (MCID). Scores for 214 (100%) patients at baseline and 108 (50%) patients at follow-up did not meet the reliability criterion of 0.90 or higher, with the majority of patients having disability due to back pain that was too severe to be reliably measured by the RMDQ. Depending on methodology, MCID estimates ranged from 2 to 8 points and the proportion of patients classified as having experienced meaningful improvement ranged from 26% to 68%. A greater change in score was needed at the extreme ends of the score scale to be classified as having achieved MCID using IRT methods. Replacing items measuring moderate disability with items measuring severe disability could yield a version of the RMDQ that better targets patients with severe disability due to back pain. Improved precision in measuring disability would be valuable to clinicians who treat patients with greater functional impairments. Caution is needed when choosing criteria for interpreting meaningful change using the RMDQ. Copyright © 2017 Elsevier Inc. All rights reserved.
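
    One common way to turn a two-parameter logistic calibration into a "range of reliable measurement" is to convert test information into a conditional reliability, 1 - 1/I(theta), on a trait scale with unit variance. The abstract does not say which formula the authors used, so the convention, the function names, and the item parameters below are all assumptions made for illustration.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def conditional_reliability(theta, items):
    """1 - SE^2 with SE^2 = 1 / test information (assumes theta has unit variance)."""
    info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 - 1.0 / info

def reliable_range(items, threshold=0.90, grid=None):
    """Grid points of theta at which conditional reliability meets the threshold."""
    if grid is None:
        grid = [x / 10.0 for x in range(-40, 41)]
    return [t for t in grid if conditional_reliability(t, items) >= threshold]

# Hypothetical 23-item scale: equal discriminations, difficulties spread over [-1, 1].
items = [(2.0, -1.0 + 2.0 * i / 22.0) for i in range(23)]
covered = reliable_range(items)
```

    With these made-up parameters the scale is reliable only near the middle of the trait range, which mirrors the abstract's point that respondents far from the item difficulties are measured imprecisely.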

  15. Development and validation of an item response theory-based Social Responsiveness Scale short form.

    Science.gov (United States)

    Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T

    2017-09-01

    Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.

  16. The effect of differential motivation on IRT linking

    NARCIS (Netherlands)

    Mittelhaëuser, M.A.; Béguin, A.A.; Sijtsma, K.

    2015-01-01

    The purpose of this study was to investigate whether simulated differential motivation between the stakes for operational tests and anchor items produces an invalid linking result if the Rasch model is used to link the operational tests. This was done for an external anchor design and a variation of

  17. An Aggregate IRT Procedure for Exploratory Factor Analysis

    NARCIS (Netherlands)

    Camilli, Gregory; Fox, Gerardus J.A.

    2015-01-01

    An aggregation strategy is proposed to potentially address practical limitation related to computing resources for two-level multidimensional item response theory (MIRT) models with large data sets. The aggregate model is derived by integration of the normal ogive model, and an adaptation of the

  18. An Aggregate IRT Procedure for Exploratory Factor Analysis

    Science.gov (United States)

    Camilli, Gregory; Fox, Jean-Paul

    2015-01-01

    An aggregation strategy is proposed to potentially address practical limitation related to computing resources for two-level multidimensional item response theory (MIRT) models with large data sets. The aggregate model is derived by integration of the normal ogive model, and an adaptation of the stochastic approximation expectation maximization…

  19. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Directory of Open Access Journals (Sweden)

    James A Wiley

    Full Text Available We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.
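
    For orientation, the classical binomial error model that this record extends can be written down directly: if each respondent's success probability follows a Beta(alpha, beta) distribution and items are exchangeable, the number-correct score is beta-binomial. The sketch below implements only that classical baseline, not the authors' extension to items of varying difficulty; the parameter values are hypothetical.

```python
import math

def log_beta(x, y):
    """Natural log of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(score = k) on n exchangeable dichotomous items with a Beta(alpha, beta) latent probability."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_comb + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

# Hypothetical example: 10 items, latent probability distributed as Beta(2, 3).
n_items = 10
pmf = [beta_binomial_pmf(k, n_items, 2.0, 3.0) for k in range(n_items + 1)]
mean_score = sum(k * p for k, p in enumerate(pmf))
```

    The mean score equals n * alpha / (alpha + beta), which gives a quick sanity check on the implementation.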

  20. Developing a short version of the Toronto Structured Interview for Alexithymia using item response theory.

    Science.gov (United States)

    Sekely, Angela; Taylor, Graeme J; Bagby, R Michael

    2018-03-17

    The Toronto Structured Interview for Alexithymia (TSIA) was developed to provide a structured interview method for assessing alexithymia. One drawback of this instrument is the amount of time it takes to administer and score. The current study used item response theory (IRT) methods to analyze data from a large heterogeneous multi-language sample (N = 842) to investigate whether a subset of items could be selected to create a short version of the instrument. Samejima's (1969) graded response model was used to fit the item responses. Items providing maximum information were retained in the short model, resulting in the elimination of 12 items from the original 24 items. Despite the 50% reduction in the number of items, 65.22% of the information was retained. Further studies are needed to validate the short version. A short version of the TSIA is potentially of practical value to clinicians and researchers with time constraints. Copyright © 2018. Published by Elsevier B.V.
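
    The item-selection logic in this record (fit a graded response model, keep the most informative items, report the share of information retained) can be illustrated as follows. This is a minimal sketch with hypothetical slopes and thresholds; summing information over an evenly spaced trait grid is an assumed way of ranking items, not necessarily the procedure the authors followed.

```python
import math

def _cumulative(theta, a, thresholds):
    """Boundary curves P(X >= k), padded with 1 below the lowest and 0 above the highest category."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model (thresholds in increasing order)."""
    cum = _cumulative(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def grm_item_information(theta, a, thresholds):
    """Item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = _cumulative(theta, a, thresholds)
    slope = [a * c * (1.0 - c) for c in cum]  # derivative of each boundary curve
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        if p > 0.0:
            info += (slope[k] - slope[k + 1]) ** 2 / p
    return info

def retained_share(items, n_keep, grid):
    """Keep the n_keep items with the most total information; return their indices and share."""
    totals = [sum(grm_item_information(t, a, bs) for t in grid) for a, bs in items]
    order = sorted(range(len(items)), key=lambda i: totals[i], reverse=True)
    kept = sorted(order[:n_keep])
    return kept, sum(totals[i] for i in kept) / sum(totals)

# Hypothetical (slope, thresholds) pairs; these are not TSIA estimates.
items = [(0.8, [-1.0, 0.0, 1.0]), (1.0, [-1.0, 0.0, 1.0]),
         (1.6, [-1.0, 0.0, 1.0]), (2.2, [-1.0, 0.0, 1.0])]
grid = [x / 4.0 for x in range(-16, 17)]
kept, share = retained_share(items, 2, grid)
```

    Because information is concentrated in the high-slope items, keeping half of the items retains more than half of the information, which is the pattern the abstract reports.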

  1. Developing an African youth psychosocial assessment: an application of item response theory.

    Science.gov (United States)

    Betancourt, Theresa S; Yang, Frances; Bolton, Paul; Normand, Sharon-Lise

    2014-06-01

    This study aimed to refine a dimensional scale for measuring psychosocial adjustment in African youth using item response theory (IRT). A 60-item scale derived from qualitative data was administered to 667 war-affected adolescents (55% female). Exploratory factor analysis (EFA) determined the dimensionality of items based on goodness-of-fit indices. Items with loadings less than 0.4 were dropped. Confirmatory factor analysis (CFA) was used to confirm the scale's dimensionality found under the EFA. Item discrimination and difficulty were estimated using a graded response model for each subscale using weighted least squares means and variances. Predictive validity was examined through correlations between IRT scores (θ) for each subscale and ratings of functional impairment. All models were assessed using goodness-of-fit and comparative fit indices. Fisher's Information curves examined item precision at different underlying ranges of each trait. Original scale items were optimized and reconfigured into an empirically-robust 41-item scale, the African Youth Psychosocial Assessment (AYPA). Refined subscales assess internalizing and externalizing problems, prosocial attitudes/behaviors and somatic complaints without medical cause. The AYPA is a refined dimensional assessment of emotional and behavioral problems in African youth with good psychometric properties. Validation studies in other cultures are recommended. Copyright © 2014 John Wiley & Sons, Ltd.

  2. General plan for the partial dismantling of the IRT-Sofia research reactor

    Directory of Open Access Journals (Sweden)

    Apostolov Tihomir G.

    2006-01-01

    Full Text Available After the decision of the Bulgarian Government to reconstruct it, the strategy concerning the IRT-Sofia Research Reactor is to partially dismantle the old systems and equipment. The removal of the reactor core and replacement of old equipment will not pose any significant problems. For a more efficient use of existing resources, there is a need for an engineering project which has been already prepared under the title "General Plan for the Partial Dismantling of Equipment at the IRT-Sofia as a Part of the Reconstruction into a Low Power RR".

  3. BayesTwin: An R Package for Bayesian Inference of Item-Level Twin Data

    Directory of Open Access Journals (Sweden)

    Inga Schwabe

    2017-11-01

    Full Text Available BayesTwin is an open-source R package that serves as a pipeline to the MCMC program JAGS to perform Bayesian inference on genetically-informative hierarchical twin data. Simultaneously to the biometric model, an item response theory (IRT) measurement model is estimated, allowing analysis of the raw phenotypic (item-level) data. The integration of such a measurement model is important since earlier research has shown that an analysis based on an aggregated measure (e.g., a sum-score based analysis) can lead to an underestimation of heritability and the spurious finding of genotype-environment interactions. The package includes all common biometric and IRT models as well as functions that help plot relevant information or determine whether the analysis was performed well. Funding statement: Partly funded by the PROO grant 411-12-623 from the Netherlands Organisation for Scientific Research (NWO).

  4. Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: a comparison of worked examples.

    Science.gov (United States)

    Petrillo, Jennifer; Cano, Stefan J; McLeod, Lori D; Coon, Cheryl D

    2015-01-01

    To provide comparisons and a worked example of item- and scale-level evaluations based on three psychometric methods used in patient-reported outcome development, namely classical test theory (CTT), item response theory (IRT), and Rasch measurement theory (RMT), in an analysis of the National Eye Institute Visual Functioning Questionnaire (VFQ-25). Baseline VFQ-25 data from 240 participants with diabetic macular edema from a randomized, double-masked, multicenter clinical trial were used to evaluate the VFQ at the total score level. CTT, RMT, and IRT evaluations were conducted, and results were assessed in a head-to-head comparison. Results were similar across the three methods, with IRT and RMT providing more detailed diagnostic information on how to improve the scale. CTT led to the identification of two problematic items that threaten the validity of the overall scale score, sets of redundant items, and skewed response categories. IRT and RMT additionally identified poor fit for one item, many locally dependent items, poor targeting, and disordering of over half the response categories. Selection of a psychometric approach depends on many factors. Researchers should justify their evaluation method and consider the intended audience. If the instrument is being developed for descriptive purposes and on a restricted budget, a cursory examination of the CTT-based psychometric properties may be all that is possible. In a high-stakes situation, such as the development of a patient-reported outcome instrument for consideration in pharmaceutical labeling, however, a thorough psychometric evaluation including IRT or RMT should be considered, with final item-level decisions made on the basis of both quantitative and qualitative results. Copyright © 2015. Published by Elsevier Inc.

  5. An Introduction to Item Response Theory for Patient-Reported Outcome Measurement

    Science.gov (United States)

    Nguyen, Tam H.; Han, Hae-Ra; Kim, Miyong T.

    2015-01-01

    The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient-reported outcome (PRO) measures. Traditionally, the development and validation of these measures has been guided by classical test theory. However, item response theory (IRT), an alternate measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through classical methods. This paper introduces foundational concepts in IRT, as well as commonly used models and their assumptions. Existing data on a combined sample (n = 636) of Korean American and Vietnamese American adults who responded to the High Blood Pressure Health Literacy Scale and the Patient Health Questionnaire-9 are used to exemplify typical applications of IRT. These examples illustrate how IRT can be used to improve the development, refinement, and evaluation of PRO measures. Greater use of methods based on this framework can increase the accuracy and efficiency with which PROs are measured. PMID:24403095

  6. Distinguishing Continuous and Discrete Approaches to Multilevel Mixture IRT Models: A Model Comparison Perspective

    Science.gov (United States)

    Zhu, Xiaoshu

    2013-01-01

    The current study introduced a general modeling framework, multilevel mixture IRT (MMIRT) which detects and describes characteristics of population heterogeneity, while accommodating the hierarchical data structure. In addition to introducing both continuous and discrete approaches to MMIRT, the main focus of the current study was to distinguish…

  7. Disparity between General Symptom Relief and Remission Criteria in the Positive and Negative Syndrome Scale (PANSS): A Post-treatment Bifactor Item Response Theory Model.

    Science.gov (United States)

    Anderson, Ariana E; Reise, Steven P; Marder, Stephen R; Mansolf, Maxwell; Han, Carol; Bilder, Robert M

    2017-12-01

    Objective: Total scale scores derived by summing ratings from the 30-item PANSS are commonly used in clinical trial research to measure overall symptom severity, and percentage reductions in the total scores are sometimes used to document the efficacy of treatment. Acknowledging that some patients may have substantial changes in PANSS total scores but still be sufficiently symptomatic to warrant diagnosis, ratings on a subset of 8 items, referred to here as the "Remission set," are sometimes used to determine if patients' symptoms no longer satisfy diagnostic criteria. An unanswered question remains: is the goal of treatment better conceptualized as reduction in overall symptom severity, or reduction in symptoms below the threshold for diagnosis? We evaluated the psychometric properties of PANSS total scores, to assess whether having low symptom severity post-treatment is equivalent to attaining Remission. Design: We applied a bifactor item response theory (IRT) model to post-treatment PANSS ratings of 3,647 subjects diagnosed with schizophrenia assessed at the termination of 11 clinical trials. The bifactor model specified one general dimension to reflect overall symptom severity, and five domain-specific dimensions. We assessed how PANSS item discrimination and information parameters varied across the range of overall symptom severity (θ), with a special focus on low levels of symptoms (i.e., θexpected PANSS item score of 1.83, a rating between "Absent" and "Minimal" for a PANSS symptom. Results: The application of the bifactor IRT model revealed: (1) 88% of total score variation was attributable to variation in general symptom severity, and only 8% reflected secondary domain factors. This implies that a general factor may provide a good indicator of symptom severity, and that interpretation is not overly complicated by multidimensionality; (2) Post-treatment, 534 individuals (about 15% of the whole sample) scored in the "Relief" range of general symptom

  8. Application of Item Response Theory to Modeling of Expanded Disability Status Scale in Multiple Sclerosis.

    Science.gov (United States)

    Novakovic, A M; Krekels, E H J; Munafo, A; Ueckert, S; Karlsson, M O

    2017-01-01

    In this study, we report the development of the first item response theory (IRT) model within a pharmacometrics framework to characterize the disease progression in multiple sclerosis (MS), as measured by Expanded Disability Status Score (EDSS). Data were collected quarterly from a 96-week phase III clinical study by a blinded rater, involving 104,206 item-level observations from 1319 patients with relapsing-remitting MS (RRMS), treated with placebo or cladribine. Observed scores for each EDSS item were modeled describing the probability of a given score as a function of patients' (unobserved) disability using a logistic model. Longitudinal data from placebo arms were used to describe the disease progression over time, and the model was then extended to cladribine arms to characterize the drug effect. Sensitivity with respect to patient disability was calculated as Fisher information for each EDSS item, which were ranked according to the amount of information they contained. The IRT model was able to describe baseline and longitudinal EDSS data on item and total level. The final model suggested that cladribine treatment significantly slows disease-progression rate, with a 20% decrease in disease-progression rate compared to placebo, irrespective of exposure, and effects an additional exposure-dependent reduction in disability progression. Four out of eight items contained 80% of information for the given range of disabilities. This study has illustrated that IRT modeling is specifically suitable for accurate quantification of disease status and description and prediction of disease progression in phase 3 studies on RRMS, by integrating EDSS item-level data in a meaningful manner.
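
    The idea of ranking items by Fisher information can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic (2PL) item, which is a simplification and not the ordered-categorical model of the study above; the item names and parameter values are hypothetical and are not estimates from any data set.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of a positive response at latent level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical (discrimination, location) pairs, for illustration only.
ITEMS = {"item_A": (2.0, 0.0), "item_B": (1.0, 0.0), "item_C": (0.5, 1.0)}

def rank_items(items, thetas):
    """Rank items by information summed over a grid of theta values."""
    totals = {
        name: sum(item_information(t, a, b) for t in thetas)
        for name, (a, b) in items.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

# Grid of latent-trait values from -3 to 3 in steps of 0.1.
THETA_GRID = [x / 10.0 for x in range(-30, 31)]
ranking = rank_items(ITEMS, THETA_GRID)
```

    Under these assumptions, more discriminating items contribute more information over the chosen range, so they appear first in `ranking`.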

  9. Long-Term Impact of Valid Case Criterion on Capturing Population-Level Growth under Item Response Theory Equating. Research Report. ETS RR-17-17

    Science.gov (United States)

    Deng, Weiling; Monfils, Lora

    2017-01-01

    Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…

  10. An Item Response Theory-Based, Computerized Adaptive Testing Version of the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI:WS)

    Science.gov (United States)

    Makransky, Guido; Dale, Philip S.; Havmose, Philip; Bleses, Dorthe

    2016-01-01

    Purpose: This study investigated the feasibility and potential validity of an item response theory (IRT)-based computerized adaptive testing (CAT) version of the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI:WS; Fenson et al., 2007) vocabulary checklist, with the objective of reducing length while maintaining…

  11. Item Response Theory. Research Report. ETS RR-13-28. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-05

    Science.gov (United States)

    Carlson, James E.; von Davier, Matthias

    2013-01-01

    Few would doubt that ETS researchers have contributed more to the general topic of item response theory (IRT) than individuals from any other institution. In this report, we briefly review most of those contributions, dividing them into sections by decades of publication, beginning with early work by Fred Lord and Bert Green in the 1950s and…

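  12. PENENTUAN STANDARD SETTING MATA PELAJARAN KIMIA DENGAN METODE ANGOFF, IRT (ITEM RESPONSE THEORY), DAN SPLINES CUBIC HERMIT FUNCTION [Determining the Standard Setting for the Chemistry Subject Using the Angoff Method, IRT (Item Response Theory), and the Cubic Hermite Spline Function]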

    Directory of Open Access Journals (Sweden)

    s suwahono

    2016-03-01

    In the method above, the difficulty level of each test item is determined, and the items are then ordered by difficulty, with that order becoming their page number. Carrying out this method involves experienced chemistry teachers as panelists, who determine the page at which examinees begin to be unable to answer. It also requires a standardized chemistry test or examination instrument and a simple instrument for recording each panelist's results. The implementation stages are training, round 1, and round 2. The mean of the results of rounds 1 and 2 is the passing score determined for the chemistry subject.

  13. The development of the nuclear physics in Latvia II. The building of the Research Nuclear Reactor IRT

    International Nuclear Information System (INIS)

    Ulmanis, U.

    2004-01-01

    The IRT nuclear research reactor of the Academy of Sciences was built near Riga, in Salaspils. The IRT is a pool-type water-water reactor whose fuel elements contain U-235; the core is located at a depth of ∼ 7 m under distilled water. Ten horizontal and 10-15 vertical experimental channels are used for experimental research with neutron fluxes. For research with gamma rays, a radiation loop facility was constructed that uses a liquid In-Ga-Sn solid solution as an intense gamma-ray source. The main activities of the IRT are research in nuclear spectroscopy, neutron activation analysis, neutron diffraction, and radiation physics, chemistry and biology. (authors)

  14. Item Response Theory Analyses of the Cambridge Face Memory Test (CFMT)

    Science.gov (United States)

    Cho, Sun-Joo; Wilmer, Jeremy; Herzmann, Grit; McGugin, Rankin; Fiset, Daniel; Van Gulick, Ana E.; Ryan, Katie; Gauthier, Isabel

    2014-01-01

    We evaluated the psychometric properties of the Cambridge face memory test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bi-factor exploratory factor analysis (EFA). This EFA analysis revealed a general factor and three specific factors clustered by targets of CFMT. However, the three specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and two age groups (Age ≤ 20 versus Age > 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT. PMID:25642930

  15. An item response theory analysis of Harter's Self-Perception Profile for children or why strong clinical scales should be distrusted.

    Science.gov (United States)

    Egberink, Iris J L; Meijer, Rob R

    2011-06-01

    The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343) were compared. The authors found that most scales formed weak scales and that measurement precision was relatively low and only present for latent trait values indicating low self-perception. The subscales Physical Appearance and Global Self-Worth formed one strong scale. Children seem to interpret Global Self-Worth items as if they measure Physical Appearance. Furthermore, the authors found that strong Mokken scales (such as Global Self-Worth) consisted mostly of items that repeat the same item content. They conclude that researchers should be very careful in interpreting the total scores on the different Self-Perception Profile for Children scales. Finally, implications for further research are discussed.

  16. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.
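
    For readers unfamiliar with the graded response model mentioned above, the following is a minimal sketch of how it assigns probabilities to ordered response categories. The function names, the discrimination `a`, and the threshold values are illustrative assumptions and are not SCI-QOL calibration values.

```python
import math

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    thresholds must be increasing; K thresholds give K + 1 categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 for the lowest
    # category and 0 beyond the highest category.
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical item with four ordered categories, evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

    Because the cumulative curves are ordered whenever the thresholds are increasing, the differences are non-negative and sum to one.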

  17. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

    Science.gov (United States)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

    2017-11-01

    The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim to enhance measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

  18. Proposta de um instrumento de medida para avaliar a satisfação de clientes de bancos utilizando a Teoria da Resposta ao Item [Proposal of tool to assess the satisfaction of bank customers using the Item Response Theory]

    Directory of Open Access Journals (Sweden)

    Alceu Balbim Junior

    2011-01-01

    satisfaction with the bank with which they maintain a closer relationship. Using Item Response Theory (IRT), parameters of the items and the information curve of the tool proposed were identified. The analysis of the degree of discrimination of the items indicated that they all are appropriate. The information curve obtained showed the range in which the tool proposed produces the best estimates of satisfaction levels. The study revealed the average satisfaction level of the sample and the concentration of customers in different levels of satisfaction in the scale used.

  19. New tests of the common calibration context for ISO, IRTS, and MSX

    Science.gov (United States)

    Cohen, Martin

    1997-01-01

    The work carried out in order to test, verify and validate the accuracy of the calibration spectra provided to the Infrared Space Observatory (ISO), to the Infrared Telescope in Space (IRTS) and to the Midcourse Space Experiment (MSX) for external calibration support of instruments, is reviewed. The techniques, used to vindicate the accuracy of the absolute spectra, are discussed. The work planned for comparing far infrared spectra of Mars and some of the bright stellar calibrators with long wavelength spectrometer data are summarized.

  20. Tests to control the power distribution in the IRT-2,000 reactor

    International Nuclear Information System (INIS)

    Filipcuk, E.V.; Potapenko, P.T.; Krjukov, A.P.; Trofimov, A.P.; Kosilov, A.N.; Nebojan, V.T.; Timochin, E.S.

    1976-01-01

    Results of the investigations of a few structures of such control systems carried out with the help of the IRT 2,000 MIFI reactor in the years 1973/74 are presented in the present work. Within the framework of this study, the successful test of using the transmitter of the direct loading in equipment to control the neutron field was carried out. (orig./TK)

  1. Management and inspection of integrity of spent fuel from IRT MEPhI research reactor

    International Nuclear Information System (INIS)

    Aden, V.G.; Bulkin, S.Y.; Sokolov, A.V.; Bushuev, A.V.; Redkin, A.F.; Portnov, A.A.

    2002-01-01

    Information on wet storage and dry storage of the spent nuclear fuel (SNF) of the IRT MEPhI reactor and experience from SNF shipment for reprocessing are presented. The procedure and a facility for nondestructive inspection of local power density fields and the burnup of fuel assemblies, based on studying the γ-activity of some fission products generated in U-235, and the procedure for inspection of the fuel element cladding leak tightness are described. (author)

  2. Structural versus electronic distortions of symmetry-broken IrTe$_2$

    OpenAIRE

    Kim, Hyo Sung; Kim, Tae-Hwan; Yang, Junjie; Cheong, Sang-Wook; Yeom, Han Woong

    2014-01-01

    We investigate atomic and electronic structures of the intriguing low temperature phase of IrTe2 using high-resolution scanning tunneling microscopy and spectroscopy. We confirm various stripe superstructures such as $\times$3, $\times$5, and $\times$8. The strong vertical and lateral distortions of the lattice for the stripe structures are observed in agreement with recent calculations. The spatial modulations of electronic density of states are clearly identified as separated from the struc...

  3. Licensing activities for the partial decommissioning of IRT-2000 research reactor in Sofia

    International Nuclear Information System (INIS)

    Apostolov, T.; Ilieva, Kr.; Papukchiev, A.; Kalchev, B.

    2001-01-01

    The project for refurbishment of the IRT-2000 research reactor in Sofia into a low-power reactor (200 kW) is based on the retention of some IRT-2000 buildings, facilities and equipment. The activities that determine the partial decommissioning should be carried out in accordance with preliminarily developed licensing documents: the General Plan, the Safety Analysis Report and the Environment Impact Assessment Report. The goal of these documents is to provide and guarantee safe and effective activities with radioactive materials, to define the dismantling procedures strictly, and at the same time to minimize their influence on the environment. The Technical Tasks for the General Plan, Safety Analysis Report and Environment Impact Assessment Report have been prepared and will be presented as preliminary licensing documents to the National Regulatory Body for approval before their application. A Quality Management system is currently being developed at INRNE. After its certification, some requirements of the regulatory body will be met. This certified QA system is a major part of the licensing procedure for the reconstruction of the IRT-2000 research reactor. (author)

  4. The design, construction, and operation of the Integrated Radwaste Treatment System (IRTS) Drum Cell

    International Nuclear Information System (INIS)

    Landau, B.; Russillo, A.; Frank, D.; Garland, D.

    1989-12-01

    This report describes the design, construction, and operation of the Integrated Radwaste Treatment System (IRTS) Drum Cell at the West Valley Demonstration Project (WVDP), West Valley, New York. The IRTS Drum Cell was designed to provide a shielded, secure storage area for the remote handling and placement of low-level Class C radioactive waste produced in the IRTS. The Drum Cell was designed to contain up to approximately 8,804 drums from decontaminated supernatant processing. This waste is to be poured into 0.27 m³ drums in a temperature-controlled environment to ensure the cement will not be subjected to freezing and thawing cycles. A Temporary Weather Structure (TWS), a pre-engineered building, now encloses the Drum Cell and associated equipment so that remote waste-handling and placement operations can continue without regard to weather conditions. The Drum Cell was designed so that this TWS could be removed and the low-level waste entombed in place. Final disposition of this low-level waste is currently being evaluated in an Environmental Impact Statement (EIS). 10 refs., 11 figs., 1 tab

  5. Indium-Gallium Radiation Contour of the IRT Nuclear Reactor; Circuit d'activation d'indium-gallium dans le reacteur nucleaire IRT; Indij-gallievyj radiatsionnyj kontur yadernogo reaktora IRT; Circuito de radiaciones de indio-galio del reactor IRT

    Energy Technology Data Exchange (ETDEWEB)

    Breger, A K; Ryabukin, Y S; Tulkes, S G; Volkov, E N

    1960-07-15

    Following on theoretical work already published, an indium-gallium radiation contour of the IRT nuclear reactor has been prepared, and represents a powerful new source of gamma-radiation. The first contour of this type, "RK-1", was prepared on the IRT reactor at the Physics Institute of the Academy of Sciences of the Georgian SSR. The paper gives the activation calculations for indium-gallium alloy; the structural components of RK-1 and their arrangement in the reactor tank and the hot cell; the device for feeding liquid and gaseous substances into the irradiation zone; and the conveyor for solid substances to be irradiated. When the IRT reactor is at a power of 2000 kW, the radiation strength of the contour is equivalent to that of a gamma-emitter having an activity of 20,000 g Ra equivalent. The prospects for the use of the indium-gallium radiation contour for research and semi-industrial purposes are discussed. (author)

  6. Results and Conclusions from the NASA Isokinetic Total Water Content Probe 2009 IRT Test

    Science.gov (United States)

    Reehorst, Andrew; Brinker, David

    2010-01-01

    The NASA Glenn Research Center has developed and tested a Total Water Content Isokinetic Sampling Probe. Since, by its nature, it is not sensitive to cloud water particle phase nor size, it is particularly attractive to support super-cooled large droplet and high ice water content aircraft icing studies. The instrument comprises the Sampling Probe, Sample Flow Control, and Water Vapor Measurement subsystems. Results and conclusions are presented from probe tests in the NASA Glenn Icing Research Tunnel (IRT) during January and February 2009. The use of reference probe heat and the control of air pressure in the water vapor measurement subsystem are discussed. Several run-time error sources were found to produce identifiable signatures that are presented and discussed. Some of the differences between measured Isokinetic Total Water Content Probe and IRT calibration seem to be caused by tunnel humidification and moisture/ice crystal blow around. Droplet size, airspeed, and liquid water content effects also appear to be present in the IRT calibration. Based upon test results, the authors provide recommendations for future Isokinetic Total Water Content Probe development.

  7. Solution of operational problems utilization of an EX-IRT-2000 heat exchanger

    International Nuclear Information System (INIS)

    Razak, Abdu

    1986-01-01

    The Bandung TRIGA Mark II Reactor has been operated successfully for 21 years, especially at low power or as a neutron source for various experiments. Most of the operating time, approximately 80% of routine operation, was dedicated to radioisotope production. During routine operation for radioisotope production, the reactor could not be operated at full power: it was run at 60% of the maximum power (1 MWth) because the original heat exchanger could not operate properly. The reason is that slack deposition had built up on the secondary side of the heat exchanger, which considerably reduced the heat transfer coefficient. To solve the problem, a heat exchanger set, including the pump, was installed in parallel with the original unit. The heat exchanger was an IRT-2000 reactor heat exchanger collected from the abandoned IRT-2000 Project; it has a capacity of 1.25 MW. The new heat exchanger could reduce the outlet temperature of the primary coolant to 42 deg. C, whereas the original heat exchanger, at its worst condition and at 600 kW of power, reached an outlet temperature of 49 deg. C. The IRT heat exchanger is a counter-flow heat exchanger. (author)

  8. Solution of operational problems utilization of an EX-IRT-2000 heat exchanger

    Energy Technology Data Exchange (ETDEWEB)

    Razak, Abdu [Research Centre for Nuclear Techniques, National Atomic Energy Agency (Indonesia)

    1986-07-01

    The Bandung TRIGA Mark II Reactor has been operated successfully for 21 years, especially at low power or as a neutron source for various experiments. Most of the operating time, approximately 80% of routine operation, was dedicated to radioisotope production. During routine operation for radioisotope production, the reactor could not be operated at full power: it was run at 60% of the maximum power (1 MWth) because the original heat exchanger could not operate properly. The reason is that slack deposition had built up on the secondary side of the heat exchanger, which considerably reduced the heat transfer coefficient. To solve the problem, a heat exchanger set, including the pump, was installed in parallel with the original unit. The heat exchanger was an IRT-2000 reactor heat exchanger collected from the abandoned IRT-2000 Project; it has a capacity of 1.25 MW. The new heat exchanger could reduce the outlet temperature of the primary coolant to 42 deg. C, whereas the original heat exchanger, at its worst condition and at 600 kW of power, reached an outlet temperature of 49 deg. C. The IRT heat exchanger is a counter-flow heat exchanger. (author)

  9. Using item response theory to investigate the structure of anticipated affect: Do self-reports about future affective reactions conform to typical or maximal models?

    Directory of Open Access Journals (Sweden)

    Leonidas A Zampetakis

    2015-09-01

    In the present research we used item response theory (IRT) to examine whether affective predictions (anticipated affect) conform to a typical (i.e., what people usually do) or a maximal behavior process (i.e., what people can do). The former corresponds to non-monotonic ideal point IRT models, whereas the latter corresponds to monotonic dominance IRT models. A cross-sectional convenience sample of students (N=1624) was used. Participants were asked to report on anticipated positive and negative affect around a hypothetical event (emotions surrounding the start of a new business). We carried out an analysis comparing the Graded Response Model (GRM), a dominance IRT model, against the Generalized Graded Unfolding Model (GGUM), an unfolding IRT model. We found that the GRM provided a better fit to the data. Findings suggest that the self-report responses to anticipated affect conform to a dominance response process (i.e., maximal behavior). The paper also discusses implications for a growing literature on anticipated affect.

  10. FINANCIAL LITERACY: A STUDY USING THE APPLICATION OF ITEM RESPONSE THEORY

    Directory of Open Access Journals (Sweden)

    João Carlos Hipólito Bernardes do Nascimento

    2016-04-01

    This study aimed to measure the level of financial literacy of Business Administration course students at a federal Higher Education Institution (HEI). To this end, a survey was conducted on 307 students. Item Response Theory (IRT) was employed for data analysis, and the findings support the conclusion that the students show a low level of financial literacy, as well as the existence of a conservative investment profile among students. This scenario, in line with previous empirical studies conducted in Brazil, is worrying given the potential negative externalities resulting from poor financial decisions, especially those related to home financing and retirement preparations. This study contributes to the empirical evaluation, within the national context, of the use of IRT in estimating financial literacy, and shows that it is, indeed, an important methodological option in the estimation of this latent trait. Furthermore, this enables financial knowledge to be compared through consistent and reliable means across separate studies, populations, realities and programs.

  11. Item response theory analysis of Working Alliance Inventory, revised response format, and new Brief Alliance Inventory.

    Science.gov (United States)

    Mallinckrodt, Brent; Tekie, Yacob T

    2016-11-01

    The Working Alliance Inventory (WAI) has made great contributions to psychotherapy research. However, studies suggest the 7-point response format and 3-factor structure of the client version may have psychometric problems. This study used Rasch item response theory (IRT) to (a) improve WAI response format, (b) compare two brief 12-item versions (WAI-sr; WAI-s), and (c) develop a new 16-item Brief Alliance Inventory (BAI). Archival data from 1786 counseling center and community clients were analyzed. IRT findings suggested problems with crossed category thresholds. A rescoring scheme that combines neighboring responses to create 5- and 4-point scales sharply reduced these problems. Although subscale variance was reduced by 11-26%, rescoring yielded improved reliability and generally higher correlations with therapy process (session depth and smoothness) and outcome measures (residual gain symptom improvement). The 16-item BAI was designed to maximize "bandwidth" of item difficulty and preserve a broader range of WAI sensitivity than WAI-s or WAI-sr. Comparisons suggest the BAI performed better in several respects than the WAI-s or WAI-sr and equivalent to the full WAI on several performance indicators.

  12. Cross-cultural validity of the Spanish version of PHQ-9 among pregnant Peruvian women: a Rasch item response theory analysis.

    Science.gov (United States)

    Zhong, Qiuyue; Gelaye, Bizu; Fann, Jesse R; Sanchez, Sixto E; Williams, Michelle A

    2014-04-01

    We sought to evaluate the validity of the Spanish language version of the patient health questionnaire-9 (PHQ-9) depression scale in a large sample of pregnant Peruvian women using Rasch item response theory (IRT) approaches. We further sought to examine the appropriateness of the response formats, reliability and potential differential item functioning (DIF) by maternal age, educational attainment and employment status. This cross-sectional study was conducted among 1520 pregnant women in Lima, Peru. A structured interview was used to collect information on demographic characteristics and PHQ-9 items. Data from the PHQ-9 were fitted to the Rasch IRT model and tested for appropriate category ordering, the assumptions of unidimensionality and local independence, item fit, reliability and presence of DIF. The Spanish language version of PHQ-9 demonstrated unidimensionality, local independence, and acceptable fit for the Rasch IRT model. However, we detected disordered response categories for the original four response categories. After collapsing "more than half the days" and "nearly every day", the response categories ordered properly and the PHQ-9 fit the Rasch IRT model. The PHQ-9 had moderate internal consistency (person separation index, PSI=0.72). Additionally, the items of PHQ-9 were free of DIF with regard to age, educational attainment, and employment status. The Spanish language version of the PHQ-9 was shown to have item properties of an effective screening instrument. Collapsing rating scale categories and reconstructing three-point Likert scale for all items improved the fit of the instrument. Future studies are warranted to establish new cutoff scores and criterion validity of the three-point Likert scale response options for the Spanish language version of the PHQ-9. Copyright © 2014 Elsevier B.V. All rights reserved.
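
    The category collapsing described above can be sketched as a simple recoding step. The 0-3 numeric coding of the four PHQ-9 response options and the example responses are assumptions for illustration; the abstract does not give the scoring used after collapsing.

```python
# Hypothetical recoding map: original scores 0-3 -> collapsed scores 0-2.
# "More than half the days" (2) and "nearly every day" (3) share one category.
COLLAPSE_MAP = {0: 0, 1: 1, 2: 2, 3: 2}

def collapse_responses(responses, mapping=COLLAPSE_MAP):
    """Recode one respondent's item responses onto the three-point scale."""
    return [mapping[r] for r in responses]

original = [0, 1, 2, 3, 3, 1, 0, 2, 1]  # made-up responses to nine items
collapsed = collapse_responses(original)
```

    Sum scores on the collapsed scale are not comparable with the original 0-27 range, which is consistent with the authors' call for new cutoff scores.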

  13. Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

    Science.gov (United States)

    Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

    2018-04-01

    The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory-factor analysis was completed to examine dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items on the ESS and provide differential item functioning (DIF) analyses to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and to a lesser extent Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses between German and English were very similar indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.

  14. Gibberellic acid alleviates cadmium toxicity by reducing nitric oxide accumulation and expression of IRT1 in Arabidopsis thaliana

    International Nuclear Information System (INIS)

    Zhu, Xiao Fang; Jiang, Tao; Wang, Zhi Wei; Lei, Gui Jie; Shi, Yuan Zhi; Li, Gui Xin; Zheng, Shao Jian

    2012-01-01

    Highlights: ► Cd reduces endogenous GA levels in Arabidopsis. ► Exogenously applied GA decreases Cd accumulation in plants. ► GA suppresses the Cd-induced accumulation of NO. ► Decreased NO level downregulates the expression of IRT1. ► Suppressed IRT1 expression reduces Cd transport across the plasma membrane. - Abstract: Gibberellic acid (GA) is involved not only in plant growth and development but also in plant responses to abiotic stresses. Here it was found that treating the plants with GA concentrations from 0.1 to 5 μM for 24 h had no obvious effect on root elongation in the absence of cadmium (Cd), whereas in the presence of Cd²⁺, GA at 5 μM improved root growth and reduced Cd content and lipid peroxidation in the roots, indicating that GA can partially alleviate Cd toxicity. Cd²⁺ increased nitric oxide (NO) accumulation in the roots, but GA markedly reduced it and suppressed the up-regulation of IRT1 expression. In contrast, the beneficial effect of GA on alleviating Cd toxicity was not observed in an IRT1 knock-out mutant, irt1, suggesting the involvement of IRT1 in Cd²⁺ absorption. Furthermore, the GA-induced reduction of NO and Cd content can also be partially reversed by the application of an NO donor (S-nitrosoglutathione [GSNO]). Taken together, the results show that GA-alleviated Cd toxicity is mediated through the reduction of Cd-dependent NO accumulation and of the expression of the Cd²⁺ uptake-related gene IRT1 in Arabidopsis.

  15. Gibberellic acid alleviates cadmium toxicity by reducing nitric oxide accumulation and expression of IRT1 in Arabidopsis thaliana

    Energy Technology Data Exchange (ETDEWEB)

    Zhu, Xiao Fang [State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China); Jiang, Tao [Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China); Wang, Zhi Wei [State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China); Lei, Gui Jie [Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China); Shi, Yuan Zhi [The Key Laboratory of Tea Chemical Engineering, Ministry of Agriculture, Yunqi Road 1, Hangzhou 310008 (China); Li, Gui Xin, E-mail: guixinli@zju.edu.cn [College of Agronomy and Biotechnology, Zhejiang University, Hangzhou 310058 (China); Zheng, Shao Jian [State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China); Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou 310058 (China)

    2012-11-15

    Highlights: ► Cd reduces endogenous GA levels in Arabidopsis. ► Exogenously applied GA decreases Cd accumulation in plants. ► GA suppresses the Cd-induced accumulation of NO. ► Decreased NO level downregulates the expression of IRT1. ► Suppressed IRT1 expression reduces Cd transport across the plasma membrane. - Abstract: Gibberellic acid (GA) is involved not only in plant growth and development but also in plant responses to abiotic stresses. Here it was found that treating the plants with GA concentrations from 0.1 to 5 μM for 24 h had no obvious effect on root elongation in the absence of cadmium (Cd), whereas in the presence of Cd²⁺, GA at 5 μM improved root growth and reduced Cd content and lipid peroxidation in the roots, indicating that GA can partially alleviate Cd toxicity. Cd²⁺ increased nitric oxide (NO) accumulation in the roots, but GA markedly reduced it and suppressed the up-regulation of IRT1 expression. In contrast, the beneficial effect of GA on alleviating Cd toxicity was not observed in an IRT1 knock-out mutant, irt1, suggesting the involvement of IRT1 in Cd²⁺ absorption. Furthermore, the GA-induced reduction of NO and Cd content can also be partially reversed by the application of an NO donor (S-nitrosoglutathione [GSNO]). Taken together, the results show that GA-alleviated Cd toxicity is mediated through the reduction of Cd-dependent NO accumulation and of the expression of the Cd²⁺ uptake-related gene IRT1 in Arabidopsis.

  16. Overexpression of ZmIRT1 and ZmZIP3 Enhances Iron and Zinc Accumulation in Transgenic Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Suzhen Li

    Full Text Available Iron and zinc are important micronutrients for both the growth and nutrient availability of crop plants, and their absorption is tightly controlled by a metal uptake system. Zinc-regulated transporter, iron-regulated transporter-like proteins (ZIPs) are considered essential metal transporters for the acquisition of Fe and Zn in graminaceous plants. Several ZIPs have been identified in maize, although their physiological function remains unclear. In this report, ZmIRT1 was shown to be specifically expressed in silk and embryo, whereas ZmZIP3 was a leaf-specific gene. Both ZmIRT1 and ZmZIP3 were shown to be localized to the plasma membrane and endoplasmic reticulum. In addition, transgenic Arabidopsis plants overexpressing ZmIRT1 or ZmZIP3 were generated, and the metal contents in various tissues of transgenic and wild-type plants were examined based on ICP-OES and Zinpyr-1 staining. The Fe and Zn concentrations increased in roots and seeds of ZmIRT1-overexpressing plants, while the Fe content in shoots decreased. Overexpressing ZmZIP3 enhanced Zn accumulation in the roots of transgenic plants, while that in shoots was repressed. In addition, the transgenic plants showed altered tolerance to various Fe and Zn conditions compared with wild-type plants. Furthermore, the genes associated with metal uptake were stimulated in ZmIRT1 transgenic plants, while those involved in intra- and intercellular translocation were suppressed. In conclusion, ZmIRT1 and ZmZIP3 are functional metal transporters with different ion selectivities. Ectopic overexpression of ZmIRT1 may stimulate endogenous Fe uptake mechanisms, which may facilitate metal uptake and homeostasis. Our results increase our understanding of the functions of ZIP family transporters in maize.

  17. Investigating Robustness of Item Response Theory Proficiency Estimators to Atypical Response Behaviors under Two-Stage Multistage Testing. ETS GRE® Board Research Report. ETS GRE®-16-03. ETS Research Report No. RR-16-22

    Science.gov (United States)

    Kim, Sooyeon; Moses, Tim

    2016-01-01

    The purpose of this study is to evaluate the extent to which item response theory (IRT) proficiency estimation methods are robust to the presence of aberrant responses under the "GRE"® General Test multistage adaptive testing (MST) design. To that end, a wide range of atypical response behaviors affecting as much as 10% of the test items…

  18. Use of differential item functioning (DIF) analysis for bias analysis in test construction

    Directory of Open Access Journals (Sweden)

    Marié De Beer

    2004-10-01

    Summary: Where differential item functioning (DIF) procedures for item analysis based on item response theory (IRT) are used during test construction, it is possible to plot item characteristic curves for the same item for different subgroups. These curves indicate how each item functions for the different subgroups at different ability levels. DIF is indicated by the area between the curves. DIF was used in the construction of the 'Learning Potential Computerised Adaptive Test (LPCAT)' to identify the items that showed bias with respect to gender, culture, language or level of education. Items that exceeded a predetermined level of DIF were omitted from the final item bank, regardless of which subgroup was advantaged or disadvantaged. The process and results of the DIF analysis are discussed.
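
    The idea of expressing DIF as the area between two subgroup item characteristic curves can be sketched numerically. The two-parameter logistic curve, the integration range, and the item parameters below are assumptions for illustration; the record does not state which IRT model or cutoff the LPCAT construction used.

```python
import math

def icc(theta, a, b):
    """Item characteristic curve (2PL form assumed): P(correct | theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def unsigned_area(params_ref, params_focal, lo=-10.0, hi=10.0, steps=2000):
    """Trapezoidal estimate of the unsigned area between two subgroup ICCs."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        gap = abs(icc(theta, *params_ref) - icc(theta, *params_focal))
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * gap
    return total * h

# Made-up (a, b) estimates for the same item in two subgroups.
area = unsigned_area((1.0, 0.0), (1.0, 0.5))
```

    With equal discriminations of 1, the area is close to the difference in difficulty (0.5 here). An item would be flagged when its area exceeds whatever predetermined level the test developer has chosen.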

  19. Item Information in the Rasch Model

    NARCIS (Netherlands)

    Engelen, Ron J.H.; van der Linden, Willem J.; Oosterloo, Sebe J.

    1988-01-01

    Fisher's information measure for the item difficulty parameter in the Rasch model and its marginal and conditional formulations are investigated. It is shown that expected item information in the unconditional model equals information in the marginal model, provided the assumption of sampling
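
    For orientation, the simplest quantity behind this discussion is the Fisher information that a single response carries in the Rasch model when ability is treated as known. The sketch below covers only that basic case, with made-up parameter values; it does not reproduce the marginal and conditional formulations the record investigates.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rasch_information(theta, b):
    """Fisher information of one response, given theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

# Information is largest when ability matches difficulty and falls off on either side.
peak = rasch_information(theta=0.0, b=0.0)
off_target = rasch_information(theta=2.0, b=0.0)
```

    The same expression gives the information about the difficulty parameter b for fixed theta, because the log-likelihood depends on the two only through their difference.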

  20. The Protective Behavioral Strategies for Marijuana Scale: Further examination using item response theory.

    Science.gov (United States)

    Pedersen, Eric R; Huang, Wenjing; Dvorak, Robert D; Prince, Mark A; Hummer, Justin F

    2017-08-01

    Given recent state legislation legalizing marijuana for recreational purposes and majority popular opinion favoring these laws, we developed the Protective Behavioral Strategies for Marijuana scale (PBSM) to identify strategies that may mitigate the harms related to marijuana use among those young people who choose to use the drug. In the current study, we expand on the initial exploratory study of the PBSM to further validate the measure with a large and geographically diverse sample (N = 2,117; 60% women, 30% non-White) of college students from 11 different universities across the United States. We sought to develop a psychometrically sound item bank for the PBSM and to create a short assessment form that minimizes respondent burden and time. Quantitative item analyses, including exploratory and confirmatory factor analyses with item response theory (IRT) and evaluation of differential item functioning (DIF), revealed an item bank of 36 items that was examined for unidimensionality and good content coverage, as well as a short form of 17 items that is free of bias in terms of gender (men vs. women), race (White vs. non-White), ethnicity (Hispanic vs. non-Hispanic), and recreational marijuana use legal status (state recreational marijuana was legal for 25.5% of participants). We also provide a scoring table for easy transformation from sum scores to IRT scale scores. The PBSM item bank and short form associated strongly and negatively with past month marijuana use and consequences. The measure may be useful to researchers and clinicians conducting intervention and prevention programs with young adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  1. Limited information estimation of the diffusion-based item response theory model for responses and response times.

    Science.gov (United States)

    Ranger, Jochen; Kuhn, Jörg-Tobias; Szardenings, Carsten

    2016-05-01

    Psychological tests are usually analysed with item response models. Recently, some alternative measurement models have been proposed that were derived from cognitive process models developed in experimental psychology. These models consider the responses but also the response times of the test takers. Two such models are the Q-diffusion model and the D-diffusion model. Both models can be calibrated with the diffIRT package of the R statistical environment via marginal maximum likelihood (MML) estimation. In this manuscript, an alternative approach to model calibration is proposed. The approach is based on weighted least squares estimation and parallels the standard estimation approach in structural equation modelling. Estimates are determined by minimizing the discrepancy between the observed and the implied covariance matrix. The estimator is simple to implement, consistent, and asymptotically normally distributed. Least squares estimation also provides a test of model fit by comparing the observed and implied covariance matrix. The estimator and the test of model fit are evaluated in a simulation study. Although parameter recovery is good, the estimator is less efficient than the MML estimator. © 2016 The British Psychological Society.

  2. Inoculation with Bacillus subtilis and Azospirillum brasilense produces abscisic acid that reduces IRT1-mediated cadmium uptake of roots.

    Science.gov (United States)

    Xu, Qianru; Pan, Wei; Zhang, Ranran; Lu, Qi; Xue, Wanlei; Wu, Cainan; Song, Bixiu; Du, Shaoting

    2018-05-08

    Cadmium (Cd) contamination of agricultural soils represents a serious risk to crop safety. A new strategy using abscisic acid (ABA)-generating bacteria, Bacillus subtilis or Azospirillum brasilense, was developed to reduce the Cd accumulation in plants grown in Cd-contaminated soil. Inoculation with either bacterium resulted in a pronounced increase in the ABA level in wild-type Arabidopsis Col-0 plants, accompanied by a decrease in Cd levels in plant tissues, which mitigated the Cd toxicity. As a consequence, the growth of plants exposed to Cd was improved. Nevertheless, B. subtilis and A. brasilense inoculation had little effect on Cd levels and toxicity in the ABA-insensitive mutant snrk 2.2/2.3, indicating that the action of ABA is required for these bacteria to reduce Cd accumulation in plants. Furthermore, inoculation with either B. subtilis or A. brasilense down-regulated the expression of IRT1 (IRON-REGULATED TRANSPORTER 1) in the roots of wild-type plants and had little effect on Cd levels in the IRT1-knockout mutants irt1-1 and irt1-2. In summary, we conclude that B. subtilis and A. brasilense can reduce Cd levels in plants via an IRT1-dependent ABA-mediated mechanism.

  3. Investigation of radial propagation of electrostatic fluctuations in the IR-T1 tokamak plasma edge

    Energy Technology Data Exchange (ETDEWEB)

    Shariatzadeh, R; Ghoranneviss, M; Salem, M K [Plasma Physics Research Center, Science and Research Branch, Islamic Azad University (IAU), PO Box 14665-678, Tehran (Iran, Islamic Republic of); Emami, M, E-mail: rezashariatzadeh@gmail.com [Laser and Optics Research School, NSTRI, AEOI, PO Box 14155-1339, Tehran (Iran, Islamic Republic of)

    2011-01-15

    The radial propagation of electrostatic fluctuation is considered extremely important for understanding cross-field anomalous transport. In this paper, two arrays of Langmuir probes are used to analyze electrostatic fluctuations in the edge of IR-T1 tokamak plasma in both the radial and the poloidal directions. The propagation characteristics of the floating potential fluctuations are analyzed by the two-point correlation technique. The wavenumber spectrum shows that there is a net radially outward propagation of turbulent fluctuations in the edge and scrape-off layer (SOL) regions. Hence, edge turbulence presumably originates from core fluctuations.

  4. Investigation of radial propagation of electrostatic fluctuations in the IR-T1 tokamak plasma edge

    International Nuclear Information System (INIS)

    Shariatzadeh, R; Ghoranneviss, M; Salem, M K; Emami, M

    2011-01-01

    The radial propagation of electrostatic fluctuation is considered extremely important for understanding cross-field anomalous transport. In this paper, two arrays of Langmuir probes are used to analyze electrostatic fluctuations in the edge of IR-T1 tokamak plasma in both the radial and the poloidal directions. The propagation characteristics of the floating potential fluctuations are analyzed by the two-point correlation technique. The wavenumber spectrum shows that there is a net radially outward propagation of turbulent fluctuations in the edge and scrape-off layer (SOL) regions. Hence, edge turbulence presumably originates from core fluctuations.

  5. RIBD-IRT, Isotope Buildup and Isotope Decay from Fission Source

    International Nuclear Information System (INIS)

    1990-01-01

    1 - Description of problem or function: RIBD-IRT calculates isotopic concentrations resulting from two fission sources with normal down-chain decay by beta emission and isomeric transfers and inter-chain coupling resulting from (n,gamma) reactions. Calculations can be made to follow an irradiation history through an unlimited number of step changes of unrestricted duration and variability including shutdown periods, restarts at different power levels and/or any other level changes. In addition, the program permits tracking and modifying the concentration of individual elements as they decay with time following reactor shutdown. Tracking individual elements enables one to estimate time-dependent source terms for a hypothetical LOCA based on known or postulated fission product release mechanisms. 2 - Method of solution: RIBD-IRT is a grid processor. It organizes the various members described by the fission product library data into a grid with the various linkages established from chain branching data, yield data, and neutron capture cross sections with their branching ratios. Radioactive decay includes not only the simple member-to-member cascade but also the more complex forms where branching may partially or completely skip one or two intervening members
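
    The simple member-to-member cascade mentioned above can be illustrated with the textbook two-member decay solution. This is only a sketch under strong assumptions (no fission production, no neutron capture, no branching, no daughter atoms at the start, distinct decay constants). It is not RIBD-IRT's algorithm, and the decay constants are made up.

```python
import math

def two_member_chain(n1_0, lam1, lam2, t):
    """Parent and daughter atom counts at time t for a chain 1 -> 2 -> (removed).

    Assumes the daughter population starts at zero and lam1 != lam2.
    """
    n1 = n1_0 * math.exp(-lam1 * t)
    n2 = n1_0 * lam1 / (lam2 - lam1) * (math.exp(-lam1 * t) - math.exp(-lam2 * t))
    return n1, n2

# Hypothetical parent with a 10-hour half-life feeding a 2-hour daughter.
lam_parent = math.log(2) / 10.0
lam_daughter = math.log(2) / 2.0
parent, daughter = two_member_chain(1000.0, lam_parent, lam_daughter, t=10.0)
```

    A code that follows an irradiation history would repeat a step like this for every chain member and every power level, adding the production and capture terms that this sketch leaves out.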

  6. A new look at the psychometrics of the parenting scale through the lens of item response theory.

    Science.gov (United States)

    Lorber, Michael F; Xu, Shu; Slep, Amy M Smith; Bulling, Lisanne; O'Leary, Susan G

    2014-01-01

    The psychometrics of the Parenting Scale's Overreactivity and Laxness subscales were evaluated using item response theory (IRT) techniques. The IRT analyses were based on 2 community samples of cohabiting parents of 3- to 8-year-old children, combined to yield a total sample size of 852 families. The results supported the utility of the Overreactivity and Laxness subscales, particularly in discriminating among parents in the mid to upper reaches of each construct. The original versions of the Overreactivity and Laxness subscales were more reliable than alternative, shorter versions identified in replicated factor analyses from previously published research and in IRT analyses in the present research. Moreover, in several cases, the original versions of these subscales, in comparison with the shortened versions, exhibited greater 6-month stabilities and correlations with child externalizing behavior and couple relationship satisfaction. Reliability was greater for the Laxness than for the Overreactivity subscale. Item performance on each subscale was highly variable. Together, the present findings are generally supportive of the psychometrics of the Parenting Scale, particularly for clinical research and practice. They also suggest areas for further development.

  7. Validation of Sustainable Development Practices Scale Using the Bayesian Approach to Item Response Theory

    Directory of Open Access Journals (Sweden)

    Martin Hernani Merino

    2014-12-01

    Full Text Available There has been growing recognition of the importance of creating performance measurement tools for the economic, social and environmental management of micro and small enterprises (MSEs). In this context, this study aims to validate an instrument to assess perceptions of sustainable development practices by MSEs by means of a Graded Response Model (GRM) with a Bayesian approach to Item Response Theory (IRT). The results, based on a sample of 506 university students in Peru, suggest that a valid measurement instrument was achieved. At the end of the paper, methodological and managerial contributions are presented.

  8. An introduction to Item Response Theory and Rasch Analysis of the Eating Assessment Tool (EAT-10).

    Science.gov (United States)

    Kean, Jacob; Brodke, Darrel S; Biber, Joshua; Gross, Paul

    2018-03-01

    Item response theory has its origins in educational measurement and is now commonly applied in health-related measurement of latent traits, such as function and symptoms. This application is due in large part to gains in the precision of measurement attributable to item response theory and corresponding decreases in response burden, study costs, and study duration. The purpose of this paper is twofold: introduce basic concepts of item response theory and demonstrate this analytic approach in a worked example, a Rasch model (1PL) analysis of the Eating Assessment Tool (EAT-10), a commonly used measure for oropharyngeal dysphagia. The results of the analysis were largely concordant with previous studies of the EAT-10 and illustrate for brain impairment clinicians and researchers how IRT analysis can yield greater precision of measurement.

  9. Assessing the Straightforwardly-Worded Brief Fear of Negative Evaluation Scale for Differential Item Functioning Across Gender and Ethnicity.

    Science.gov (United States)

    Harpole, Jared K; Levinson, Cheri A; Woods, Carol M; Rodebaugh, Thomas L; Weeks, Justin W; Brown, Patrick J; Heimberg, Richard G; Menatti, Andrew R; Blanco, Carlos; Schneier, Franklin; Liebowitz, Michael

    2015-06-01

    The Brief Fear of Negative Evaluation Scale (BFNE; Leary, Personality and Social Psychology Bulletin, 9, 371-375, 1983) assesses fear and worry about receiving negative evaluation from others. Rodebaugh et al. (Psychological Assessment, 16, 169-181, 2004) found that the BFNE is composed of a reverse-worded factor (BFNE-R) and a straightforwardly-worded factor (BFNE-S). Further, they found the BFNE-S to have better psychometric properties and provide more information than the BFNE-R. Currently there is a lack of research regarding the measurement invariance of the BFNE-S across gender and ethnicity with respect to item thresholds. The present study uses item response theory (IRT) to test the BFNE-S for differential item functioning (DIF) related to gender and ethnicity (White, Asian, and Black). Six data sets consisting of clinical, community, and undergraduate participants were utilized (N = 2,109). The factor structure of the BFNE-S was confirmed using categorical confirmatory factor analysis, IRT model assumptions were tested, and the BFNE-S was evaluated for DIF. Item nine demonstrated significant non-uniform DIF between White and Black participants. No other items showed significant uniform or non-uniform DIF across gender or ethnicity. Results suggest the BFNE-S can be used reliably with men and women and Asian and White participants. More research is needed to understand the implications of using the BFNE-S with Black participants.

  10. The therapeutic factor inventory-8: Using item response theory to create a brief scale for continuous process monitoring for group psychotherapy.

    Science.gov (United States)

    Tasca, Giorgio A; Cabrera, Christine; Kristjansson, Elizabeth; MacNair-Semands, Rebecca; Joyce, Anthony S; Ogrodniczuk, John S

    2016-01-01

    We tested a very brief version of the 23-item Therapeutic Factors Inventory-Short Form (TFI-S), and describe the use of Item Response Theory (IRT) for the purpose of developing short and reliable scales for group psychotherapy. Group therapy patients (N = 578) completed the TFI-S on one occasion, and their data were used for the IRT analysis. Of those, 304 completed the TFI-S and other measures on more than one occasion to assess sensitivity to change, concurrent, and predictive validity of the brief version. Results suggest that the new TFI-8 is a brief, reliable, and valid measure of a higher-order group therapeutic factor. The TFI-8 may be used for continuous process measurement and feedback to improve the functioning of therapy groups.

  11. IRT models with relaxed assumptions in eRm: A manual-like instruction

    Directory of Open Access Journals (Sweden)

    REINHOLD HATZINGER

    2009-03-01

    Full Text Available Linear logistic models with relaxed assumptions (LLRA), as introduced by Fischer (1974), are a flexible tool for the measurement of change for dichotomous or polytomous responses. As opposed to the Rasch model, assumptions on dimensionality of items, their mutual dependencies and the distribution of the latent trait in the population of subjects are relaxed. Conditional maximum likelihood estimation allows for inference about treatment, covariate or trend effect parameters without taking the subjects' latent trait values into account. In this paper we will show how LLRAs based on the LLTM, LRSM and LPCM can be used to answer various questions about the measurement of change and how they can be fitted in R using the eRm package. A number of small didactic examples are provided that can easily be used as templates for real data sets. All datafiles used in this paper are available from http://eRm.R-Forge.R-project.org/

  12. Using Rasch Analysis to Evaluate the Reliability and Validity of the Swallowing Quality of Life Questionnaire: An Item Response Theory Approach.

    Science.gov (United States)

    Cordier, Reinie; Speyer, Renée; Schindler, Antonio; Michou, Emilia; Heijnen, Bas Joris; Baijens, Laura; Karaduman, Ayşe; Swan, Katina; Clavé, Pere; Joosten, Annette Veronica

    2018-02-01

    The Swallowing Quality of Life questionnaire (SWAL-QOL) is widely used clinically and in research to evaluate quality of life related to swallowing difficulties. It has been described as a valid and reliable tool, but was developed and tested using classic test theory. This study describes the reliability and validity of the SWAL-QOL using item response theory (IRT; Rasch analysis). SWAL-QOL data were gathered from 507 participants at risk of oropharyngeal dysphagia (OD) across four European countries. OD was confirmed in 75.7% of participants via videofluoroscopy and/or fiberoptic endoscopic evaluation, or a clinical diagnosis based on meeting selected criteria. Patients with esophageal dysphagia were excluded. Data were analysed using Rasch analysis. Item and person reliability was good for all the items combined. However, person reliability was poor for 8 subscales and item reliability was poor for one subscale. Eight subscales exhibited poor person separation and two exhibited poor item separation. Overall item and person fit statistics were acceptable. However, at an individual item fit level results indicated unpredictable item responses for 28 items, and item redundancy for 10 items. The item-person dimensionality map confirmed these findings. Results from the overall Rasch model fit and Principal Component Analysis were suggestive of a second dimension. For all the items combined, none of the item categories were 'category', 'threshold' or 'step' disordered; however, all subscales demonstrated category disordered functioning. Findings suggest an urgent need to further investigate the underlying structure of the SWAL-QOL and its psychometric characteristics using IRT.

  13. Evaluation properties of the French version of the OUT-PATSAT35 satisfaction with care questionnaire according to classical and item response theory analyses.

    Science.gov (United States)

    Panouillères, M; Anota, A; Nguyen, T V; Brédart, A; Bosset, J F; Monnier, A; Mercier, M; Hardouin, J B

    2014-09-01

    The present study investigates the properties of the French version of the OUT-PATSAT35 questionnaire, which evaluates the outpatients' satisfaction with care in oncology using classical analysis (CTT) and item response theory (IRT). This cross-sectional multicenter study includes 692 patients who completed the questionnaire at the end of their ambulatory treatment. CTT analyses tested the main psychometric properties (convergent and divergent validity, and internal consistency). IRT analyses were conducted separately for each OUT-PATSAT35 domain (the doctors, the nurses or the radiation therapists and the services/organization) by models from the Rasch family. We examined the fit of the data to the model expectations and tested whether the model assumptions of unidimensionality, monotonicity and local independence were respected. A total of 605 (87.4%) respondents were analyzed with a mean age of 64 years (range 29-88). Internal consistency for all scales separately and for the three main domains was good (Cronbach's α 0.74-0.98). IRT analyses were performed with the partial credit model. No disordered thresholds of polytomous items were found. Each domain showed high reliability but fitted poorly to the Rasch models. Three items in particular, the item about "promptness" in the doctors' domain and the items about "accessibility" and "environment" in the services/organization domain, presented the highest default of fit. A correct fit of the Rasch model can be obtained by dropping these items. Most of the local dependence concerned items about "information provided" in each domain. A major deviation of unidimensionality was found in the nurses' domain. CTT showed good psychometric properties of the OUT-PATSAT35. However, the Rasch analysis revealed some misfitting and redundant items. Taking the above problems into consideration, it could be interesting to refine the questionnaire in a future study.
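    As an illustration of the internal-consistency statistic reported in this abstract, the following is a minimal sketch of Cronbach's α computed from raw item scores. It is not the authors' code; the function name `cronbach_alpha` and the input layout (one list of scores per item, respondents in the same order) are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    Hypothetical helper: item_scores[i][j] is respondent j's score on item i.
    """
    k = len(item_scores)
    # Total score per respondent, summing across items.
    totals = [sum(resp) for resp in zip(*item_scores)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

    Values close to 1 indicate that the items vary together; the thresholds used to call a value "good" differ between fields.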

  14. Development of a subjective cognitive decline questionnaire using item response theory: a pilot study.

    Science.gov (United States)

    Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L

    2015-12-01

    Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item-reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items were removed with extreme response distribution or trait-fit. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of a SCD screener.

  15. The cultural fairness of the 12-item General Health Questionnaire among diverse adolescents.

    Science.gov (United States)

    Bowe, Anica

    2017-01-01

    The 12-item general health questionnaire (GHQ-12) was used in the Longitudinal Study of Young People in England (LSYPE; N = 15,770) to collect measures on adolescent mental health. Given the debate in current literature regarding the dimensionality of the GHQ-12, this study examined the cultural sensitivity of the instrument at the item level for each of the 7 major ethnic groups within the database. This study used a hybrid approach of ordinal logistic regression and item response theory (IRT) to examine the presence of differential item functioning (DIF) on the questionnaire. Results demonstrated that uniform, nonuniform, and overall DIF were present on items between White and Asian adolescents (7 items), White and Black Caribbean adolescents (1 item), and White and Black African adolescents (7 items); however, all McFadden's pseudo R² effect size estimates indicated that the DIF was negligible. Overall, there were cumulative small scale level effects for the Mixed/Biracial, Asian, and Black African groups, but in each case the bias was only marginal. Findings demonstrate that the GHQ-12 can be considered culturally sensitive for adolescents from diverse ethnic groups in England, but follow-up studies are necessary. Implications for future education and health policies as well as the use of IRT-based approaches for psychological instruments are discussed. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  16. Pretest-Posttest-Posttest Multilevel IRT Modeling of Competence Growth of Students in Higher Education in Germany

    NARCIS (Netherlands)

    Schmidt, Susanne; Zlatkin-Troitschanskaia, Olga; Fox, Gerardus J.A.

    2016-01-01

    Longitudinal research in higher education faces several challenges. Appropriate methods of analyzing competence growth of students are needed to deal with those challenges and thereby obtain valid results. In this article, a pretest-posttest-posttest multivariate multilevel IRT model for repeated

  17. An Introduction to the DA-T Gibbs Sampler for the Two-Parameter Logistic (2PL) Model and Beyond

    Directory of Open Access Journals (Sweden)

    Gunter Maris

    2005-01-01

    The DA-T Gibbs sampler is proposed by Maris and Maris (2002) as a Bayesian estimation method for a wide variety of Item Response Theory (IRT) models. The present paper provides an expository account of the DA-T Gibbs sampler for the 2PL model. However, the scope is not limited to the 2PL model. It is demonstrated how the DA-T Gibbs sampler for the 2PL may be used to build, quite easily, Gibbs samplers for other IRT models. Furthermore, the paper contains a novel, intuitive derivation of the Gibbs sampler and could be read for a graduate course on sampling.
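    For readers unfamiliar with the 2PL model named in this record, the sketch below shows its item response function in the standard logistic parameterization. It illustrates the model only, not the DA-T Gibbs sampler itself; the function name `p_correct_2pl` and the parameter names (`theta` for ability, `a` for discrimination, `b` for difficulty) are assumptions made for this example.

```python
import math


def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model.

    theta: person ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

    When ability equals difficulty the probability is 0.5, and a larger discrimination makes the curve steeper around that point.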

  18. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    Science.gov (United States)

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  19. Radiation conditions at the training IRT-2000 and IR-100 reactors

    International Nuclear Information System (INIS)

    Fedorin, Eh.V.; Bronshtejn, I.Eh; Martynov, Yu.N.; Chistyakov, N.I.

    1978-01-01

    The experience is reviewed of radiation hygiene surveys and radiation safety provision during instructional processes on two training and research nuclear reactors of the IRT-2000 type (No. 1 and No. 2) and on an IR-100 reactor. From an analysis of individual dosimetry data the conclusion is made that the trainees and personnel are exposed mainly to external gamma-radiation and also, to a minor degree, to thermal neutrons and beta-radiation. It has been found that a high level of radiation safety is ensured on the training and research reactors, so that research and instruction activities are conducted at annual levels of exposure substantially lower than 0.5 rem in the case of trainees and 5 rem in the case of personnel.

  20. Neutron polarizing set-up of the Sofia IRT research reactor

    International Nuclear Information System (INIS)

    Krezhov, K.; Mikhajlova, V.; Okorokov, A.

    1990-01-01

    The neutron polarizing set-up of one of the horizontal beam tubes of the IRT-200 research reactor of the Bulgarian Institute of Nuclear Research and Nuclear Energy is presented. Neutron mirrors are extensively used in an effort to compensate the moderate reactor beam intensity by the high reflected intensity and wide-band transmittance of the mirror neutron guides. A time-of-flight technique using a slotted neutron absorbing chopper with a horizontal rotation axis has been applied to obtain the exit neutron spectra. Beam polarization and flipping ratios have been determined. The cadmium ratio in the polarized beam has been found to be almost 10^4, and the average polarization has been measured to be higher than 96%. 3 figs, 3 refs

  1. Item response theory analysis of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised in the Pooled Resource Open-Access ALS Clinical Trials Database.

    Science.gov (United States)

    Bacci, Elizabeth D; Staniewska, Dorota; Coyne, Karin S; Boyer, Stacey; White, Leigh Ann; Zach, Neta; Cedarbaum, Jesse M

    2016-01-01

    Our objective was to examine dimensionality and item-level performance of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) across time using classical and modern test theory approaches. Confirmatory factor analysis (CFA) and Item Response Theory (IRT) analyses were conducted using data from patients with amyotrophic lateral sclerosis (ALS) in the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database with complete ALSFRS-R data (n = 888) at three time-points (Time 0, Time 1 (6-months), Time 2 (1-year)). Results demonstrated that in this population of 888 patients, mean age was 54.6 years, 64.4% were male, and 93.7% were Caucasian. The CFA supported a 4 individual-domain structure (bulbar, gross motor, fine motor, and respiratory domains). IRT analysis within each domain revealed misfitting items and overlapping item response category thresholds at all time-points, particularly in the gross motor and respiratory domain items. Results indicate that many of the items of the ALSFRS-R may sub-optimally distinguish among varying levels of disability assessed by each domain, particularly in patients with less severe disability. Measure performance improved across time as patient disability severity increased. In conclusion, modifications to select ALSFRS-R items may improve the instrument's specificity to disability level and sensitivity to treatment effects.

  2. The SF-8 Spanish Version for Health-Related Quality of Life Assessment: Psychometric Study with IRT and CFA Models.

    Science.gov (United States)

    Tomás, José M; Galiana, Laura; Fernández, Irene

    2018-03-22

    The aim of the current research is to analyze the psychometric properties of the Spanish version of the SF-8, overcoming previous shortcomings. A double line of analyses was used: competitive structural equations models to establish factorial validity, and Item Response Theory to analyze item psychometric characteristics and information. 593 people aged 60 years or older, attending lifelong learning programs at the University, were surveyed. Their age ranged from 60 to 92 years old. 67.6% were women. The survey included scales on personality dimensions, attitudes, perceptions, and behaviors related to aging. Competitive confirmatory models pointed to two factors (physical and mental health) as the best representation of the data: χ2(13) = 72.37 (p < .01); CFI = .99; TLI = .98; RMSEA = .08 (.06, .10). Item 5 was removed because of unreliability and cross-loading. Two-parameter logistic graded response models showed appropriate fit for both the physical and the mental dimensions. Item Information Curves and Test Information Functions indicated that the SF-8 was more informative for low levels of health. The Spanish SF-8 has adequate psychometric properties, being better represented by two dimensions, once Item 5 is removed. Gathering evidence on patient-reported outcome measures is of crucial importance, as this type of measurement instrument is increasingly used in the clinical arena.

  3. The emotion dysregulation inventory: Psychometric properties and item response theory calibration in an autism spectrum disorder sample.

    Science.gov (United States)

    Mazefsky, Carla A; Yu, Lan; White, Susan W; Siegel, Matthew; Pilkonis, Paul A

    2018-04-06

    Individuals with autism spectrum disorder (ASD) often present with prominent emotion dysregulation that requires treatment but can be difficult to measure. The Emotion Dysregulation Inventory (EDI) was created using methods developed by the Patient-Reported Outcomes Measurement Information System (PROMIS®) to capture observable indicators of poor emotion regulation. Caregivers of 1,755 youth with ASD completed 66 candidate EDI items, and the final 30 items were selected based on classical test theory and item response theory (IRT) analyses. The analyses identified two factors: (a) Reactivity, characterized by intense, rapidly escalating, sustained, and poorly regulated negative emotional reactions, and (b) Dysphoria, characterized by anhedonia, sadness, and nervousness. The final items did not show differential item functioning (DIF) based on gender, age, intellectual ability, or verbal ability. Because the final items were calibrated using IRT, even a small number of items offers high precision, minimizing respondent burden. IRT co-calibration of the EDI with related measures demonstrated its superiority in assessing the severity of emotion dysregulation with as few as seven items. Validity of the EDI was supported by expert review, its association with related constructs (e.g., anxiety and depression symptoms, aggression), higher scores in psychiatric inpatients with ASD compared to a community ASD sample, and demonstration of test-retest stability and sensitivity to change. In sum, the EDI provides an efficient and sensitive method to measure emotion dysregulation for clinical assessment, monitoring, and research in youth with ASD of any level of cognitive or verbal ability. Autism Res 2018. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. This paper describes a new measure of poor emotional control called the Emotion Dysregulation Inventory (EDI). Caregivers of 1,755 youth with ASD completed candidate items, and advanced statistical

  4. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    Science.gov (United States)

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  5. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  6. Evaluating and Refining the Construct of Sexual Quality With Item Response Theory: Development of the Quality of Sex Inventory.

    Science.gov (United States)

    Shaw, Amanda M; Rogge, Ronald D

    2016-02-01

    This study took a critical look at the construct of sexual quality. The 65 items of four well-validated self-report measures of sexual satisfaction (the Index of Sexual Satisfaction [ISS], Hudson, Harrison, & Crosscup, 1981; the Global Measure of Sexual Satisfaction [GMSEX], Lawrance & Byers, 1995; the Pinney Sexual Satisfaction Inventory [PSSI], Pinney, Gerrard, & Denney, 1987; the Young Sexual Satisfaction Scale [YSSS], Young, Denny, Luquis, & Young, 1998) and an additional 74 potential sexual quality items were given to 3060 online participants. Using Item Response Theory (IRT), we demonstrated that the ISS, YSSS, and PSSI scales provided suboptimal levels of precision in assessing sexual quality, particularly given the length of those scales. Exploratory factor analyses, IRT, differential item functioning analyses, and longitudinal responsiveness analyses were used to develop and evaluate the Quality of Sex Inventory. Results suggested that, in comparison to existing scales, the QSI (1) offers investigators and clinicians more theoretically focused scales, (2) distinguishes sexual satisfaction from sexual dissatisfaction, and (3) offers greater precision and power for detecting differences with (4) comparably high levels of responsiveness for detecting change over time despite being notably shorter than most of the existing scales. The QSI-satisfaction subscales demonstrated strong convergent validity with other measures of sexual satisfaction and excellent construct validity with anchor scales from the nomological net surrounding that construct, suggesting that they continue to assess the same theoretical construct as prior scales. Implications for research are discussed.

  7. A symptom profile of depression among Asian Americans: is there evidence for differential item functioning of depressive symptoms?

    Science.gov (United States)

    Kalibatseva, Z; Leong, F T L; Ham, E H

    2014-09-01

    Theoretical and clinical publications suggest the existence of cultural differences in the expression and experience of depression. Measurement non-equivalence remains a potential methodological explanation for the lower prevalence of depression among Asian Americans compared to European Americans. This study compared DSM-IV depressive symptoms among Asian Americans and European Americans using secondary data analysis of the Collaborative Psychiatric Epidemiology Surveys (CPES). The Composite International Diagnostic Interview (CIDI) was used for the assessment of depressive symptoms. Of the entire sample, 310 Asian Americans and 1974 European Americans reported depressive symptoms and were included in the analyses. Measurement variance was examined with an item response theory differential item functioning (IRT DIF) analysis. χ2 analyses indicated that, compared to Asian Americans, European American participants more frequently endorsed affective symptoms such as 'feeling depressed', 'feeling discouraged' and 'cried more often'. The IRT analysis detected DIF for four out of the 15 depression symptom items. At equal levels of depression, Asian Americans endorsed feeling worthless and appetite changes more easily than European Americans, and European Americans endorsed feeling nervous and crying more often than Asian Americans. Asian Americans did not seem to over-report somatic symptoms; however, European Americans seemed to report more affective symptoms than Asian Americans. The results suggest that there was measurement variance in a few of the depression items.

  8. Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

    2015-05-01

    Objective: To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. Design: We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Outcome measure: Spinal Cord Injury--Quality of Life (SCI-QOL) Depression Item Bank. Results: Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. Conclusion: The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.
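    The abstract above refers to the slopes and thresholds of a graded response model. As a rough illustration only, the sketch below computes category probabilities for one polytomous item under the standard logistic form of Samejima's graded response model. The function name `grm_category_probs` and its arguments are assumptions made for this example and do not come from the SCI-QOL study.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    thresholds must be in increasing order; an item with K thresholds
    has K + 1 ordered response categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Each category probability is a difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]
```

    For example, `grm_category_probs(0.0, 1.0, [-1.0, 0.0, 1.0])` returns four probabilities that sum to 1, one for each response category.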

  9. Semiparametric Item Response Functions in the Context of Guessing

    Science.gov (United States)

    Falk, Carl F.; Cai, Li

    2016-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…

  10. Bayes factor covariance testing in item response models

    NARCIS (Netherlands)

    Fox, J.P.; Mulder, J.; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  11. Bayes Factor Covariance Testing in Item Response Models

    NARCIS (Netherlands)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-01-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning

  12. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  13. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate and massive objects require a longer procedure and will therefore take longer.

  14. Spare Items validation

    International Nuclear Information System (INIS)

    Fernandez Carratala, L.

    1998-01-01

    It is increasingly difficult to purchase safety related spare items with manufacturer certifications that maintain the original qualification of the destination equipment. The main reasons are, on top of the logical evolution of technology applied to newly manufactured components, the closing of nuclear-specific production lines and the evolution of manufacturers' quality systems, originally based on nuclear codes and standards, toward conventional industry standards. To face this problem, for many years different Dedication processes have been implemented to verify whether a commercial grade element is acceptable for use in safety related applications. In the same way, due to our particular position regarding spare part supplies, mainly from markets other than the American, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are to be performed on the spare item itself, its design control, its fabrication and its supply to allow its use in destinations with specific requirements. The scope of application covers not only safety related items but also complex design, high cost or plant reliability related components. The implementation in C.N. Trillo has been mainly carried out by merging, modifying and making the most of processes and activities which were already being performed in the company. (Author)

  15. Selecting Lower Priced Items.

    Science.gov (United States)

    Kleinert, Harold L.; And Others

    1988-01-01

    A program used to teach moderately to severely mentally handicapped students to select the lower priced items in actual shopping activities is described. Through a five-phase process, students are taught to compare prices themselves as well as take into consideration variations in the sizes of containers and varying product weights. (VW)

  16. Design and Preliminary Results of a Feedback Circuit for Plasma Displacement Control in IR-T1 Tokamak

    International Nuclear Information System (INIS)

    TalebiTaher, A.; Ghoranneviss, M.; Tarkeshian, R.; Salem, M. K.; Khorshid, P.

    2008-01-01

    Since displacement is very important for plasma position control, a combination of two cosine coils and two saddle sine coils is used for horizontal displacement measurement in the IR-T1 tokamak. According to multipole moment theory, the output of these coils depends linearly on the radial displacement of the plasma column. A new circuit for adding these signals to the feedback system was designed, and unwanted effects of other fields on the final output were compensated. After compensation and calibration of the system, the output of the horizontal displacement circuits was applied to the feedback control system. Considering the required auxiliary vertical field, a proportional amplifier and driver circuit were constructed to drive power transistors, which switch the feedback bank capacitors. In the experiment, a good linear proportionality between displacement and output was observed. By applying an appropriate feedback field, a longer confinement time was obtained in the IR-T1 tokamak; applying this system to the discharge increased the plasma duration and realized repetitive discharges.

  17. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  18. Developing a Screening Inventory Reading Test (IRT) for the Isfahanian Students of the First to Fifth Grade

    Directory of Open Access Journals (Sweden)

    Bijan Shafiei

    2009-12-01

    Background and aim: Reading is one of the most essential skills in this century. Reading disorders can cause several problems for the affected person. Early assessment and diagnosis play an important role in the treatment of this disorder. The main purpose of this study was to develop a screening inventory reading test (IRT) for first- to fifth-grade students in Isfahan for early diagnosis of reading disorder. Materials and Methods: The test, consisting of a 100-word context and four comprehension questions, named the Inventory Reading Test (IRT), was evaluated by several speech therapists. It was standardized by testing one thousand boys and girls, 200 students in every grade, who were selected through a multi-stage random sampling method. The test was also performed on two other groups, one normal and one reading-disordered. Results: Scores of reading accuracy and velocity were highly correlated with the test total score. Test reliability was calculated as 0.77 by Cronbach's alpha. There was a significant difference between the mean scores of the two groups (p=0.01). Conclusion: The IRT seems to be an appropriate tool for screening reading disorder in first- to fifth-grade students.

  19. Combined application of OGTT, IRT and CPRT for diagnosis and treatment of type 2 diabetes mellitus

    International Nuclear Information System (INIS)

    Wei Zikun; Yang Xiaoli; Tian Zhufang

    2006-01-01

    Objective: To assess the value of combined clinical application of the oral glucose tolerance test (OGTT), insulin release test (IRT) and C-peptide release test (CPRT) for the diagnosis and treatment of type 2 diabetes mellitus (DM2). Methods: The results of these three tests in 217 examined subjects were analyzed retrospectively. Results: (1) Among the 217 subjects, 71 had not been diagnosed as diabetics. However, on closer scrutiny of the data, 49 of them (69%) should be classified as diabetics. Fasting blood sugar (FPG) levels were normal in 53% of the 49, but 2h PG levels were elevated in all but 4 (4/49, 8%). Therefore, 2h PG levels were much more useful for screening of diabetes than FPG levels were. (2) Treatment results in these patients were not very satisfactory: only 24% of the patients (35/146) had their disease well-controlled. Conclusion: Combined clinical application of OGTT, IRT and CPRT would enhance the diagnostic accuracy of diabetes with fewer cases missed. (authors)

  20. Determination of the heat transfer coefficient from IRT measurement data using the Trefftz method

    Directory of Open Access Journals (Sweden)

    Maciejewska Beata

    2016-01-01

    The paper presents a method of heat transfer coefficient determination for boiling research during FC-72 flow in minichannels, each 1.7 mm deep, 24 mm wide and 360 mm long. The heating element was a thin foil, enhanced on the side in contact with the fluid in the minichannels. Local values of the heat transfer coefficient were calculated from the Robin boundary condition. The foil temperature distribution and the derivative of the foil temperature were obtained by solving the two-dimensional inverse heat conduction problem using measurements obtained by IRT. Calculations were carried out by a method based on approximating the solution of the problem with a linear combination of Trefftz functions. The basic property of these functions is that they satisfy the governing equation. The unknown coefficients of the linear combination of Trefftz functions are calculated by minimizing the functional that expresses the mean square error of the approximate solution on the boundary. The results, presented as IR thermographs, two-phase flow structure images and the heat transfer coefficient as a function of the distance from the channel inlet, were analyzed.

  1. Robustness of the charge-ordered phases in IrTe2 against photoexcitation

    Science.gov (United States)

    Monney, C.; Schuler, A.; Jaouen, T.; Mottas, M.-L.; Wolf, Th.; Merz, M.; Muntwiler, M.; Castiglioni, L.; Aebi, P.; Weber, F.; Hengsberger, M.

    2018-02-01

    We present a time-resolved angle-resolved photoelectron spectroscopy study of IrTe2, which undergoes two first-order structural and charge-ordered phase transitions on cooling below 270 K and below 180 K. The possibility of inducing a phase transition by photoexcitation with near-infrared femtosecond pulses is investigated in the charge-ordered phases. We observe changes of the spectral function occurring within a few hundreds of femtoseconds and persisting up to several picoseconds, which we interpret as a partial photoinduced phase transition (PIPT). The necessary time for photoinducing these spectral changes increases with increasing photoexcitation density and reaches time scales longer than the rise time of the transient electronic temperature. We conclude that the PIPT is driven by a transient increase of the lattice temperature following the energy transfer from the electrons. However, the photoinduced changes of the spectral function are small, which indicates that the low-temperature phase is particularly robust against photoexcitation. We suggest that the system might be trapped in an out-of-equilibrium state, for which only a partial structural transition is achieved.

  2. Assessment of Teacher Perceived Skill in Classroom Assessment Practices Using IRT Models

    Science.gov (United States)

    Koloi-Keaikitse, Setlhomo

    2017-01-01

    The purpose of this study was to assess teacher perceived skill in classroom assessment practices. Data were collected from a sample of (N = 691) teachers selected from government primary, junior secondary, and senior secondary schools in Botswana. Item response theory models were used to identify teacher response on items that measured their…

  3. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  4. Further Simplification of the Simple Erosion Narrowing Score With Item Response Theory Methodology.

    Science.gov (United States)

    Oude Voshaar, Martijn A H; Schenk, Olga; Ten Klooster, Peter M; Vonkeman, Harald E; Bernelot Moens, Hein J; Boers, Maarten; van de Laar, Mart A F J

    2016-08-01

    To further simplify the simple erosion narrowing score (SENS) by removing scored areas that contribute the least to its measurement precision according to analysis based on item response theory (IRT) and to compare the measurement performance of the simplified version to the original. Baseline and 18-month data of the Combinatietherapie Bij Reumatoide Artritis (COBRA) trial were modeled using longitudinal IRT methodology. Measurement precision was evaluated across different levels of structural damage. SENS was further simplified by omitting the least reliably scored areas. Discriminant validity of SENS and its simplification were studied by comparing their ability to differentiate between the COBRA and sulfasalazine arms. Responsiveness was studied by comparing standardized change scores between versions. SENS data showed good fit to the IRT model. Carpal and feet joints contributed the least statistical information to both erosion and joint space narrowing scores. Omitting the joints of the foot reduced measurement precision for the erosion score in cases with below-average levels of structural damage (relative efficiency compared with the original version ranged 35-59%). Omitting the carpal joints had minimal effect on precision (relative efficiency range 77-88%). Responsiveness of a simplified SENS without carpal joints closely approximated the original version (i.e., all Δ standardized change scores were ≤0.06). Discriminant validity was also similar between versions for both the erosion score (relative efficiency = 97%) and the SENS total score (relative efficiency = 84%). Our results show that the carpal joints may be omitted from the SENS without notable repercussion for its measurement performance. © 2016, American College of Rheumatology.

  5. Item Response Theory for Peer Assessment

    Science.gov (United States)

    Uto, Masaki; Ueno, Maomi

    2016-01-01

    As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…

  6. An item-response theory approach to safety climate measurement: The Liberty Mutual Safety Climate Short Scales.

    Science.gov (United States)

    Huang, Yueng-Hsiang; Lee, Jin; Chen, Zhuo; Perry, MacKenna; Cheung, Janelle H; Wang, Mo

    2017-06-01

    Zohar and Luria's (2005) safety climate (SC) scale, measuring organization- and group-level SC each with 16 items, is widely used in research and practice. To improve the utility of the SC scale, we shortened the original full-length SC scales. Item response theory (IRT) analysis was conducted using a sample of 29,179 frontline workers from various industries. Based on graded response models, we shortened the original scales in two ways: (1) selecting items with above-average discriminating ability (i.e. offering more than 6.25% of the original total scale information), resulting in 8-item organization-level and 11-item group-level SC scales; and (2) selecting the most informative items that together retain at least 30% of original scale information, resulting in 4-item organization-level and 4-item group-level SC scales. All four shortened scales had acceptable reliability (≥0.89) and high correlations (≥0.95) with the original scale scores. The shortened scales will be valuable for academic research and practical survey implementation in improving occupational safety. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
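
    The two shortening rules described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each item's information has already been summarized as a single number (for example, integrated over the trait range), and the function names and the greedy ordering are hypothetical. With 16 items, an equal share of the total information is 1/16 = 6.25%, which matches the threshold quoted in the abstract.

```python
def above_average_items(info):
    """Rule 1 (sketch): keep items whose share of total information
    exceeds the equal share 1 / number_of_items."""
    total = sum(info.values())
    threshold = 1.0 / len(info)  # e.g. 1/16 = 6.25% for a 16-item scale
    return [name for name, value in info.items() if value / total > threshold]


def most_informative_subset(info, retain=0.30):
    """Rule 2 (sketch): add items in order of decreasing information
    until the chosen set retains at least `retain` of the total."""
    total = sum(info.values())
    chosen, kept = [], 0.0
    for name, value in sorted(info.items(), key=lambda kv: kv[1], reverse=True):
        if kept >= retain * total:
            break
        chosen.append(name)
        kept += value
    return chosen


# Hypothetical per-item information summaries for a 4-item scale.
example = {"a": 4, "b": 3, "c": 2, "d": 1}
print(above_average_items(example))
print(most_informative_subset(example))
```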

  7. Assessing Psychopathy Among Justice Involved Adolescents with the PCL: YV: An Item Response Theory Examination Across Gender

    Science.gov (United States)

    Tsang, Siny; Schmidt, Karen M.; Vincent, Gina M.; Salekin, Randall T.; Moretti, Marlene M.; Odgers, Candice L.

    2014-01-01

    This study used an item response theory (IRT) model and a large adolescent sample of justice involved youth (N = 1,007, 38% female) to examine the item functioning of the Psychopathy Checklist – Youth Version (PCL: YV). Items that were most discriminating (or most sensitive to changes) of the latent trait (thought to be psychopathy) among adolescents included “Glibness/superficial charm”, “Lack of remorse”, and “Need for stimulation”, whereas items that were least discriminating included “Pathological lying”, “Failure to accept responsibility”, and “Lacks goals.” The items “Impulsivity” and “Irresponsibility” were the most likely to be rated high among adolescents, whereas “Parasitic lifestyle”, and “Glibness/superficial charm” were the most likely to be rated low. Evidence of differential item functioning (DIF) on four of the 13 items was found between boys and girls. “Failure to accept responsibility” and “Impulsivity” were endorsed more frequently to describe adolescent girls than boys at similar levels of the latent trait, and vice versa for “Grandiose sense of self-worth” and “Lacks goals.” The DIF findings suggest that four PCL: YV items function differently between boys and girls. PMID:25580672

  8. Sources of interference in item and associative recognition memory.

    Science.gov (United States)

    Osth, Adam F; Dennis, Simon

    2015-04-01

    A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved.

  9. Applying Kaplan-Meier to Item Response Data

    Science.gov (United States)

    McNeish, Daniel

    2018-01-01

    Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
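
    One standard instance of the equivalence mentioned above is the Rasch model, written here as a hedged illustration; the abstract does not say which IRT models the article covers. The symbols \theta_p (ability of person p) and b_i (difficulty of item i) are notation assumed for this sketch.

```latex
\[
\operatorname{logit} P(X_{pi} = 1 \mid \theta_p, b_i)
  = \log \frac{P(X_{pi} = 1 \mid \theta_p, b_i)}{1 - P(X_{pi} = 1 \mid \theta_p, b_i)}
  = \theta_p - b_i
\]
```

    Read this way, the model is a logistic regression of the binary response on person and item indicator variables, with \theta_p and -b_i as the coefficients.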

  10. Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models.

    Science.gov (United States)

    Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana

    2015-03-01

    The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (getting around, self-care, getting along with others, life activities and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36-item WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion, the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology. Copyright © 2014 John Wiley & Sons, Ltd.

  11. The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings.

    Science.gov (United States)

    Grigg, Kaine; Manderson, Lenore

    2016-03-17

    Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed. Here, we examine RACES' psychometric properties, including the latent structure, utilising Item Response Theory (IRT). Unidimensional and Multidimensional Rating Scale Model (RSM) Rasch analyses were utilised with 296 Victorian primary school students and 182 adolescents and 220 adults from the Australian community. RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults. RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.

  12. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    Science.gov (United States)

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  13. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  14. Development and psychometric characteristics of the SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks and short forms and the SCI-QOL Bladder Complications scale.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Tate, Denise G; Spungen, Ann M; Kirshblum, Steven C

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Bladder Management Difficulties and Bowel Management Difficulties item banks and Bladder Complications scale. Using a mixed-methods design, a pool of items assessing bladder and bowel-related concerns were developed using focus groups with individuals with spinal cord injury (SCI) and SCI clinicians, cognitive interviews, and item response theory (IRT) analytic approaches, including tests of model fit and differential item functioning. Thirty-eight bladder items and 52 bowel items were tested at the University of Michigan, Kessler Foundation Research Center, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters VA Medical Center, Bronx, NY. Seven hundred fifty-seven adults with traumatic SCI. The final item banks demonstrated unidimensionality (Bladder Management Difficulties CFI=0.965; RMSEA=0.093; Bowel Management Difficulties CFI=0.955; RMSEA=0.078) and acceptable fit to a graded response IRT model. The final calibrated Bladder Management Difficulties bank includes 15 items, and the final Bowel Management Difficulties item bank consists of 26 items. Additionally, 5 items related to urinary tract infections (UTI) did not fit with the larger Bladder Management Difficulties item bank but performed relatively well independently (CFI=0.992, RMSEA=0.050) and were thus retained as a separate scale. The SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks are psychometrically robust and are available as computer adaptive tests or short forms. The SCI-QOL Bladder Complications scale is a brief, fixed-length outcomes instrument for individuals with a UTI.

  15. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.

  16. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  17. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, item validity and the item discrimination index (D) are commonly calculated with different formulas. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between the testee’s score on a particular item and the testee’s score on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts appear to overlap, since both reflect the quality of the test in measuring the examinees’ ability. In this paper, example results of data processing on item validity and the item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses that include test analyses.
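
    To make the comparison concrete, the sketch below computes both quantities for one dichotomously scored item. It is an illustration under stated assumptions, not the formulas used in the paper: D is taken as the upper-group minus lower-group proportion correct with a conventional 27% split (an assumed default), and item validity is taken as the Pearson item-total correlation. The function names are hypothetical.

```python
from statistics import mean, pstdev


def discrimination_index(item, total, fraction=0.27):
    """Upper-lower index D: proportion correct in the top-scoring group
    minus proportion correct in the bottom-scoring group."""
    n = len(total)
    k = max(1, round(n * fraction))  # group size; 27% is an assumed convention
    order = sorted(range(n), key=lambda i: total[i])
    low, high = order[:k], order[-k:]
    return mean(item[i] for i in high) - mean(item[i] for i in low)


def item_total_correlation(item, total):
    """Pearson correlation between the item score and the total test score."""
    mi, mt = mean(item), mean(total)
    cov = mean((x - mi) * (y - mt) for x, y in zip(item, total))
    return cov / (pstdev(item) * pstdev(total))


# Hypothetical data: six examinees, one item scored 0/1, and total scores.
item_scores = [0, 0, 0, 1, 1, 1]
total_scores = [1, 2, 3, 4, 5, 6]
print(discrimination_index(item_scores, total_scores))
print(item_total_correlation(item_scores, total_scores))
```

    Both statistics rise when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract points to.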

  18. Landslide Mapping and Characterization through Infrared Thermography (IRT): Suggestions for a Methodological Approach from Some Case Studies

    Directory of Open Access Journals (Sweden)

    William Frodella

    2017-12-01

    Full Text Available In this paper, the potential of Infrared Thermography (IRT) as a novel operational tool for landslide surveying, mapping and characterization was tested and demonstrated in different case studies, by analyzing various types of instability processes (rock slide/fall, roto-translational slide-flow). In particular, IRT was applied, both from terrestrial and airborne platforms, in an integrated methodology with other geomatics methods, such as terrestrial laser scanning (TLS) and global positioning systems (GPS), for the detection and mapping of landslides’ potentially hazardous structural and morphological features (structural discontinuities and open fractures, scarps, seepage and moisture zones, landslide drainage network and ponds). Depending on the study areas’ hazard context, the collected remotely sensed data were validated through field inspections, with the purpose of studying and verifying the causes of mass movements. The challenge of this work is to go beyond the current state of the art of IRT in landslide studies, with the aim of improving and extending the investigative capacity of the analyzed technique, in the framework of a growing demand for effective Civil Protection procedures in landslide geo-hydrological disaster managing activities. The proposed methodology proved to be an effective tool for landslide analysis, especially in the field of emergency management, when it is often necessary to gather all the required information in dangerous environments as fast as possible, to be used for the planning of mitigation measures and the evaluation of hazardous scenarios. Advantages and limitations of the proposed method in the field of the explored applications were evaluated, as well as general operative recommendations and future perspectives.

  19. Item Response Theory Modeling of the Philadelphia Naming Test

    Science.gov (United States)

    Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D.

    2015-01-01

    Purpose: In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating…

  20. The Item Response Theory: possible contributions to marketing studies

    Directory of Open Access Journals (Sweden)

    Danielle Ramos de Miranda Pereira

    2011-01-01

    Full Text Available The widespread use of multidimensional scales by marketing researchers motivated this article, which discusses the application of Item Response Theory (IRT) and presents to the field a method that has proved very effective in estimating behavioral constructs. The article therefore presents a discussion of IRT, highlighting its advances over Classical Test Theory (CTT) and its traditional applications in psychometrics and educational assessment. To verify its applicability to marketing studies, a practical application of IRT was conducted in a study involving a scale already widely used by researchers: the market orientation scale (MkTor scale) proposed by Narver and Slater (1990). The results showed that, although the proposed IRT model can be considered satisfactory for application in the market orientation context, many challenges remain for future studies, such as constructing a scale with a practical interpretation that indicates what it means for a company to have a level of maturity associated with a given construct. The concluding remarks emphasize that the article's main contribution to marketing studies is the presentation of an alternative method for estimating constructs more accurately and evaluating the quality of scale items.

  1. Annual report of the working group 'fuel pin and fuel element mechanics' of the Institut fuer Reaktortechnik (IRT) of the Technische Hochschule Darmstadt for the Fast Breeder Project

    International Nuclear Information System (INIS)

    Fabian, H.; Humbach, W.; Lassmann, K.; Mueller, J.J.; Preusser, T.; Schmelz, K.

    1978-09-01

    This report comprises six single lectures given at an information meeting organized by the Institut fuer Reaktortechnik der Technischen Hochschule Darmstadt (IRT) in Darmstadt on April 24, 1978. The lectures are an account of work performed at IRT on the mechanics of fuel pins and fuel elements and supported by the Fast Breeder Project (PSB) of KfK. These activities can be broken down into studies of the integral fuel pin (URANUS computer code) and into multidimensional studies of the fuel pin using the finite-element method (FINEL and ZIDRIG computer codes). Moreover, a report is presented of the status of the test facility for simulation of out-of-pile cladding tube loads and of the IRT project on the simulation and analysis of radiation damage. (orig./GL) [de]

  2. The first critical experiment with a new type of fuel assemblies IRT-3M on the training reactor VR-I

    International Nuclear Information System (INIS)

    Matejka, Karel; Sklenka, Lubomir

    1997-01-01

    The paper 'The first critical experiment with a new type of fuel assemblies IRT-3M on the training reactor VR-1' presents basic information about the replacement of fuel in the VR-1 reactor operated at FJFI CVUT in Prague. In spring 1997 the IRT-2M fuel type used until then was replaced by the IRT-3M type. The fuel enrichment was not changed by the replacement, i.e. it remained at 36% 235U. The replacement itself was carried out in close co-operation with the Nuclear Research Institute Rez plc., in connection with the operation of the research reactor LVR-15. The fuel replacement in the VR-1 reactor is part of the international RERTR (Reduced Enrichment for Research and Test Reactors) program, in which the Czech Republic participates. (author)

  3. A dynamic Thurstonian item response theory of motive expression in the picture story exercise: solving the internal consistency paradox of the PSE.

    Science.gov (United States)

    Lang, Jonas W B

    2014-07-01

    The measurement of implicit or unconscious motives using the picture story exercise (PSE) has long been a target of debate in the psychological literature. Most debates have centered on the apparent paradox that PSE measures of implicit motives typically show low internal consistency reliability on common indices like Cronbach's alpha but nevertheless predict behavioral outcomes. I describe a dynamic Thurstonian item response theory (IRT) model that builds on dynamic system theories of motivation, theorizing on the PSE response process, and recent advancements in Thurstonian IRT modeling of choice data. To assess the models' capability to explain the internal consistency paradox, I first fitted the model to archival data (Gurin, Veroff, & Feld, 1957) and then simulated data based on bias-corrected model estimates from the real data. Simulation results revealed that the average squared correlation reliability for the motives in the Thurstonian IRT model was .74 and that Cronbach's alpha values were similar to the real data (value of extant evidence from motivational research using PSE motive measures. (c) 2014 APA, all rights reserved.

  4. Combining item response theory with multiple imputation to equate health assessment questionnaires.

    Science.gov (United States)

    Gu, Chenyang; Gutman, Roee

    2017-09-01

    The assessment of patients' functional status across the continuum of care requires a common patient assessment tool. However, assessment tools that are used in various health care settings differ and cannot be easily contrasted. For example, the Functional Independence Measure (FIM) is used to evaluate the functional status of patients who stay in inpatient rehabilitation facilities, the Minimum Data Set (MDS) is collected for all patients who stay in skilled nursing facilities, and the Outcome and Assessment Information Set (OASIS) is collected if they choose home health care provided by home health agencies. All three instruments or questionnaires include functional status items, but the specific items, rating scales, and instructions for scoring different activities vary between the different settings. We consider equating different health assessment questionnaires as a missing data problem, and propose a variant of predictive mean matching method that relies on Item Response Theory (IRT) models to impute unmeasured item responses. Using real data sets, we simulated missing measurements and compared our proposed approach to existing methods for missing data imputation. We show that, for all of the estimands considered, and in most of the experimental conditions that were examined, the proposed approach provides valid inferences, and generally has better coverages, relatively smaller biases, and shorter interval estimates. The proposed method is further illustrated using a real data set. © 2016, The International Biometric Society.

  5. Reading ability and print exposure: item response theory analysis of the author recognition test.

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.
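
    The scoring rule mentioned above (names selected minus foils selected) and a heavier foil penalty can be sketched as follows. The abstract does not state the size of the higher penalty, so `foil_penalty` is a hypothetical parameter, and the function name and example names are made up for illustration.

```python
def art_score(selected, authors, foils, foil_penalty=1.0):
    """Score an ART response sheet.

    With foil_penalty=1.0 this is the standard score: real authors
    selected minus foils selected. A larger value penalizes foil
    selections more heavily (the exact weight is an assumption here).
    """
    selected = set(selected)
    hits = len(selected & set(authors))
    false_alarms = len(selected & set(foils))
    return hits - foil_penalty * false_alarms


# Hypothetical response: two real authors and one foil were ticked.
authors = ["A", "B", "C"]
foils = ["X", "Y"]
print(art_score(["A", "B", "X"], authors, foils))
print(art_score(["A", "B", "X"], authors, foils, foil_penalty=2.0))
```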

  6. Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

    Science.gov (United States)

    Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen

    2008-01-01

    In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

  7. On the Relationships between Jeffreys Modal and Weighted Likelihood Estimation of Ability under Logistic IRT Models

    Science.gov (United States)

    Magis, David; Raiche, Gilles

    2012-01-01

    This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…

  8. Potential application of thermography (IRT) in animal production and for animal welfare. A case report of working dogs

    Directory of Open Access Journals (Sweden)

    Veronica Redaelli

    2014-06-01

    Full Text Available INTRODUCTION. The authors describe the thermography technique in animal production and in veterinary medicine applications. The thermographic technique lends itself to countless applications in biology, thanks to its versatility, lack of invasiveness and high sensitivity. Probably the major limitation on the most important aspects of its application in animals lies in its ease of use and in its extreme sensitivity. MATERIALS AND METHODS. This review provides an overview of the possible applications of thermo-visual inspection, but it is clear that every phenomenon connected to temperature variations can be identified with this technique. The operator then has to identify the best experimental context to obtain as much information as possible about the physiopathological problems considered. Furthermore, we report an experimental study of thermography (IRT) as a noninvasive technique to assess the state of wellbeing in working dogs. RESULTS. The first results showed a relationship between superficial temperatures and the scores obtained by the animal during the behavioral test. This result suggests an interesting application of infrared thermography (IRT) to measure the state of wellbeing of animals in a noninvasive way.

  9. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    Full Text Available The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made.

  10. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  11. The psychometric properties of the 16-item version of the Prodromal Questionnaire (PQ-16) as a screening instrument for perinatal psychosis

    DEFF Research Database (Denmark)

    Levey, Elizabeth J.; Zhong, Q; Rondon, M

    2018-01-01

    ... accounted for 44% of the variance. Factor 1, representing "unstable sense of self," accounted for 22.1% of the total variance; factor 2, representing "ideas of reference/paranoia," for 8.4%; factor 3, representing "sensitivity to sensory experiences," accounted for 7.2%; and factor 4, possibly representing negative symptoms, accounted for 6.3%. Rasch IRT analysis found that all of the items fit the model. These findings support the construct validity of the PQ-16 in this pregnant Peruvian population. Also, further research is needed to establish definitive psychiatric diagnoses to determine the predictive...

  12. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  13. Modeling differential item functioning with group-specific item parameters: A computerized adaptive testing application

    NARCIS (Netherlands)

    Makransky, Guido; Glas, Cornelis A.W.

    2013-01-01

    Many important decisions are made based on the results of tests administered under different conditions in the fields of educational and psychological testing. Inaccurate inferences are often made if the property of measurement invariance (MI) is not assessed across these conditions. The importance

  14. DIF Testing with an Empirical-Histogram Approximation of the Latent Density for Each Group

    Science.gov (United States)

    Woods, Carol M.

    2011-01-01

    This research introduces, illustrates, and tests a variation of IRT-LR-DIF, called EH-DIF-2, in which the latent density for each group is estimated simultaneously with the item parameters as an empirical histogram (EH). IRT-LR-DIF is used to evaluate the degree to which items have different measurement properties for one group of people versus…

  15. High dose rate (HDR) and low dose rate (LDR) interstitial irradiation (IRT) of the rat spinal cord

    International Nuclear Information System (INIS)

    Pop, Lucas A.M.; Plas, Mirjam van der; Skwarchuk, Mark W.; Hanssen, Alex E.J.; Kogel, Albert J. van der

    1997-01-01

    Purpose: To describe a newly developed technique to study radiation tolerance of rat spinal cord to continuous interstitial irradiation (IRT) at different dose rates. Material and methods: Two parallel catheters are inserted just laterally on each side of the vertebral bodies from the level of Th10 to L4. These catheters are afterloaded with two 192Ir wires of 4 cm length each (activity 1-2.3 mCi/cm) for the low dose rate (LDR) IRT or connected to the HDR micro-Selectron for the high dose rate (HDR) IRT. Spinal cord target volume is located at the level of Th12-L2. Due to the rapid dose fall-off around the implanted sources, a dose inhomogeneity across the spinal cord thickness is obtained in the dorso-ventral direction. Using the 100% reference dose (rate) at the ventral side of the spinal cord to prescribe the dose, experiments have been carried out to obtain complete dose response curves at average dose rates of 0.49, 0.96 and 120 Gy/h. Paralysis of the hind-legs after 5-6 months and histopathological examination of the spinal cord of each irradiated rat are used as experimental endpoints. Results: The histopathological damage seen after irradiation clearly reflects the inhomogeneous dose distribution around the implanted catheters, with the damage predominantly located in the dorsal tract of the cord or dorsal roots. With each reduction in average dose rate, spinal cord radiation tolerance is significantly increased. When the dose is prescribed at the 100% reference dose rate, the ED50 (induction of paresis in 50% of the animals) for the HDR-IRT is 17.3 Gy. If the average dose rate is reduced from 120 Gy/h to 0.96 or 0.49 Gy/h, a 2.9- or 4.7-fold increase in the ED50 values to 50.3 Gy and 80.9 Gy is observed; for the dose prescribed at the 150% reference dose rate (dorsal side of cord) ED50 values are 26.0, 75.5 and 121.4 Gy, respectively. Using different types of analysis and in dependence of the dose prescription and reference dose rate, the

  16. Structure of chaotic magnetic field lines in IR-T1 tokamak due to ergodic magnetic limiter

    Science.gov (United States)

    Ahmadi, S.; Salar Elahi, A.; Ghorannevis, M.

    2018-03-01

    In this paper we have studied an Ergodic Magnetic Limiter (EML) based chaotic magnetic field for transport control in the edge plasma of the IR-T1 tokamak. The resonance created by the EML perturbs the equilibrium field lines in the tokamak and, as a result, the field lines are chaotic in the vicinity of the dimerized island chains. Transport barriers are formed in the chaotic field lines and are actually observed in tokamaks with reverse magnetic shear. We used area-preserving non-twist (and twist) Poincaré maps to describe the formation of transport barriers, which are features of Hamiltonian systems. This transport barrier is useful in reducing radial diffusion of the field lines and thus improving the plasma confinement.

  17. Structure of chaotic magnetic field lines in IR-T1 tokamak due to ergodic magnetic limiter

    Directory of Open Access Journals (Sweden)

    S. Ahmadi

    2018-03-01

    Full Text Available In this paper we have studied an Ergodic Magnetic Limiter (EML) based chaotic magnetic field for transport control in the edge plasma of the IR-T1 tokamak. The resonance created by the EML perturbs the equilibrium field lines in the tokamak and, as a result, the field lines are chaotic in the vicinity of the dimerized island chains. Transport barriers are formed in the chaotic field lines and are actually observed in tokamaks with reverse magnetic shear. We used area-preserving non-twist (and twist) Poincaré maps to describe the formation of transport barriers, which are features of Hamiltonian systems. This transport barrier is useful in reducing radial diffusion of the field lines and thus improving the plasma confinement.

  18. Statistical Bias in Maximum Likelihood Estimators of Item Parameters.

    Science.gov (United States)

    1982-04-01

    [Abstract illegible in the scanned source record. The only recoverable fragment refers to the bias in the maximum likelihood estimates; the remainder is report-form and distribution-list residue.]

  19. Calibrating the Medical Council of Canada's Qualifying Examination Part I using an integrated item response theory framework: a comparison of models and designs.

    Science.gov (United States)

    De Champlain, Andre F; Boulais, Andre-Philippe; Dallas, Andrew

    2016-01-01

    The aim of this research was to compare different methods of calibrating multiple choice question (MCQ) and clinical decision making (CDM) components for the Medical Council of Canada's Qualifying Examination Part I (MCCQEI) based on item response theory. Our data consisted of test results from 8,213 first time applicants to MCCQEI in spring and fall 2010 and 2011 test administrations. The data set contained several thousand multiple choice items and several hundred CDM cases. Four dichotomous calibrations were run using BILOG-MG 3.0. All 3 mixed item format (dichotomous MCQ responses and polytomous CDM case scores) calibrations were conducted using PARSCALE 4. The 2-PL model had identical numbers of items with chi-square values at or below a Type I error rate of 0.01 (83/3,499 or 0.02). In all 3 polytomous models, whether the MCQs were either anchored or concurrently run with the CDM cases, results suggest very poor fit. All IRT abilities estimated from dichotomous calibration designs correlated very highly with each other. IRT-based pass-fail rates were extremely similar, not only across calibration designs and methods, but also with regard to the actual reported decision to candidates. The largest difference noted in pass rates was 4.78%, which occurred between the mixed format concurrent 2-PL graded response model (pass rate= 80.43%) and the dichotomous anchored 1-PL calibrations (pass rate= 85.21%). Simpler calibration designs with dichotomized items should be implemented. The dichotomous calibrations provided better fit of the item response matrix than more complex, polytomous calibrations.

  20. Calibrating the Medical Council of Canada’s Qualifying Examination Part I using an integrated item response theory framework: a comparison of models and designs

    Directory of Open Access Journals (Sweden)

    Andre F. De Champlain

    2016-01-01

    Full Text Available Purpose: The aim of this research was to compare different methods of calibrating multiple choice question (MCQ) and clinical decision making (CDM) components for the Medical Council of Canada’s Qualifying Examination Part I (MCCQEI) based on item response theory. Methods: Our data consisted of test results from 8,213 first time applicants to MCCQEI in spring and fall 2010 and 2011 test administrations. The data set contained several thousand multiple choice items and several hundred CDM cases. Four dichotomous calibrations were run using BILOG-MG 3.0. All 3 mixed item format (dichotomous MCQ responses and polytomous CDM case scores) calibrations were conducted using PARSCALE 4. Results: The 2-PL model had identical numbers of items with chi-square values at or below a Type I error rate of 0.01 (83/3,499 or 0.02). In all 3 polytomous models, whether the MCQs were either anchored or concurrently run with the CDM cases, results suggest very poor fit. All IRT abilities estimated from dichotomous calibration designs correlated very highly with each other. IRT-based pass-fail rates were extremely similar, not only across calibration designs and methods, but also with regard to the actual reported decision to candidates. The largest difference noted in pass rates was 4.78%, which occurred between the mixed format concurrent 2-PL graded response model (pass rate= 80.43%) and the dichotomous anchored 1-PL calibrations (pass rate= 85.21%). Conclusion: Simpler calibration designs with dichotomized items should be implemented. The dichotomous calibrations provided better fit of the item response matrix than more complex, polytomous calibrations.

  1. Item calibration in incomplete testing designs

    Directory of Open Access Journals (Sweden)

    Norman D. Verhelst

    2011-01-01

    Full Text Available This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs. Mislevy and Sheehan (1989) have shown that in incomplete designs the justifiability of MML can be deduced from Rubin's (1976) general theory on inference in the presence of missing data. Their results are recapitulated and extended for more situations. In this study it is shown that for CML estimation the justification must be established in an alternative way, by considering the neglected part of the complete likelihood. The problems with incomplete designs are not generally recognized in practical situations. This is due to the stochastic nature of the incomplete designs which is not taken into account in standard computer algorithms. For that reason, incorrect uses of standard MML- and CML-algorithms are discussed.

  2. Using Procedure Based on Item Response Theory to Evaluate Classification Consistency Indices in the Practice of Large-Scale Assessment

    Directory of Open Access Journals (Sweden)

    Shanshan Zhang

    2017-09-01

    Full Text Available In spite of the growing interest in methods of evaluating classification consistency (CC) indices, only a few studies have applied these methods in the practice of large-scale educational assessment. In addition, only a few studies have considered the influence of practical factors, for example, the examinee ability distribution, the cut score location and the score scale, on the performance of CC indices. Using the newly developed Lee's procedure based on item response theory (IRT), the main purpose of this study is to investigate the performance of CC indices when practical factors are taken into consideration. A simulation study and an empirical study were conducted under comprehensive conditions. Results suggested that with a negatively skewed distribution, the CC indices were larger than with other distributions. Interactions occurred among ability distribution, cut score location, and score scale. Consequently, Lee's IRT procedure is reliable for use in large-scale educational assessment, and the indices should be reported with caution, as testing conditions may vary considerably.

  3. A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

    Science.gov (United States)

    Edwards, Michael C.

    2010-01-01

    Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show…

  4. A New Functional Health Literacy Scale for Japanese Young Adults Based on Item Response Theory.

    Science.gov (United States)

    Tsubakita, Takashi; Kawazoe, Nobuo; Kasano, Eri

    2017-03-01

    Health literacy predicts health outcomes. Despite concerns surrounding the health of Japanese young adults, to date there has been no objective assessment of health literacy in this population. This study aimed to develop a Functional Health Literacy Scale for Young Adults (funHLS-YA) based on item response theory. Each item in the scale requires participants to choose the most relevant term from 3 choices in relation to a target item, thus assessing objective rather than perceived health literacy. The 20-item scale was administered to 1816 university students and 1751 responded. Cronbach's α coefficient was .73. Difficulty and discrimination parameters of each item were estimated, resulting in the exclusion of 1 item. Some items showed different difficulty parameters for male and female participants, reflecting that some aspects of health literacy may differ by gender. The current 19-item version of funHLS-YA can reliably assess the objective health literacy of Japanese young adults.

  5. Investigation of the Performance of Multidimensional Equating Procedures for Common-Item Nonequivalent Groups Design

    Directory of Open Access Journals (Sweden)

    Burcu ATAR

    2017-12-01

    Full Text Available In this study, the performance of the multidimensional extensions of the Stocking-Lord, mean/mean, and mean/sigma equating procedures under the common-item nonequivalent groups design was investigated. The performance of those three equating procedures was examined under combinations of various conditions including sample size, ability distribution, correlation between two dimensions, and percentage of anchor items in the test. Item parameter recovery was evaluated by calculating RMSE (root mean squared error) and BIAS values. It was found that the Stocking-Lord procedure provided smaller RMSE and BIAS values for both item discrimination and item difficulty parameter estimates across most conditions.
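
    As a rough illustration of two ingredients named in this abstract, the sketch below computes mean/sigma linking coefficients and the RMSE and bias recovery statistics. It covers only the familiar unidimensional case, not the multidimensional extensions the study evaluates, and the function names and toy values are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma(b_from, b_to):
    """Unidimensional mean/sigma linking from common-item difficulties.

    Returns (A, B) such that b_to is approximately A * b_from + B.
    Discriminations would be rescaled as a_to = a_from / A.
    """
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def rmse_and_bias(estimates, truths):
    """Recovery statistics: root mean squared error and mean signed error."""
    errors = [e - t for e, t in zip(estimates, truths)]
    bias = mean(errors)
    rmse = (sum(err ** 2 for err in errors) / len(errors)) ** 0.5
    return rmse, bias


# Toy anchor-item difficulties (illustrative values only).
b_form_x = [-1.0, 0.0, 1.0]
b_form_y = [-1.5, 0.5, 2.5]
A, B = mean_sigma(b_form_x, b_form_y)
```

    Mean/sigma uses only the first two moments of the anchor difficulties, which is one reason characteristic-curve methods such as Stocking-Lord are often compared against it.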

  6. Item response theory analysis of the mechanics baseline test

    Science.gov (United States)

    Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

    2012-02-01

    Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
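
    To make the roles of the difficulty and discrimination parameters concrete, here is a minimal sketch that assumes the two-parameter logistic (2PL) form; the abstract does not state which item response model was fitted, and the function names and parameter values are hypothetical.

```python
import math


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under an assumed 2PL model.

    theta: student skill, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information_2pl(theta, a, b):
    """Fisher information contributed by one 2PL item at skill theta."""
    p = p_correct_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Two items of equal difficulty but different discrimination.
abilities = [-2.0, 0.0, 2.0]
weak_item = [p_correct_2pl(t, a=0.2, b=0.0) for t in abilities]
strong_item = [p_correct_2pl(t, a=2.0, b=0.0) for t in abilities]
```

    Under this assumed model, the low-discrimination item yields nearly the same success probability across the skill range, which is the sense in which an item fails to discriminate between students of different abilities.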

  7. Reliability measures in item response theory: manifest versus latent correlation functions.

    Science.gov (United States)

    Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Verbeke, Geert; De Boeck, Paul

    2015-02-01

    For item response theory (IRT) models, which belong to the class of generalized linear or non-linear mixed models, reliability at the scale of observed scores (i.e., manifest correlation) is more difficult to calculate than latent correlation based reliability, but usually of greater scientific interest. This is not least because it cannot be calculated explicitly when the logit link is used in conjunction with normal random effects. As such, approximations such as Fisher's information coefficient, Cronbach's α, or the latent correlation are calculated, allegedly because it is easy to do so. Cronbach's α has well-known and serious drawbacks, Fisher's information is not meaningful under certain circumstances, and there is an important but often overlooked difference between latent and manifest correlations. Here, manifest correlation refers to correlation between observed scores, while latent correlation refers to correlation between scores at the latent (e.g., logit or probit) scale. Thus, using one in place of the other can lead to erroneous conclusions. Taylor series based reliability measures, which are based on manifest correlation functions, are derived and a careful comparison of reliability measures based on latent correlations, Fisher's information, and exact reliability is carried out. The latent correlations are virtually always considerably higher than their manifest counterparts, Fisher's information measure shows no coherent behaviour (it is even negative in some cases), while the newly introduced Taylor series based approximations reflect the exact reliability very closely. Comparisons among the various types of correlations, for various IRT models, are made using algebraic expressions, Monte Carlo simulations, and data analysis. Given the light computational burden and the performance of Taylor series based reliability measures, their use is recommended. © 2014 The British Psychological Society.

  8. Differential item functioning of the patient-reported outcomes information system (PROMIS®) pain interference item bank by language (Spanish versus English).

    Science.gov (United States)

    Paz, Sylvia H; Spritzer, Karen L; Reise, Steven P; Hays, Ron D

    2017-06-01

    About 70% of Latinos, 5 years old or older, in the United States speak Spanish at home. Measurement equivalence of the PROMIS® pain interference (PI) item bank by language of administration (English versus Spanish) has not been evaluated. A sample of 527 adult Spanish-speaking Latinos completed the Spanish version of the 41-item PROMIS® pain interference item bank. We evaluate dimensionality, monotonicity and local independence of the Spanish-language items. Then we evaluate differential item functioning (DIF) using ordinal logistic regression with item response theory scores estimated from DIF-free "anchor" items. One of the 41 items in the Spanish version of the PROMIS® PI item bank was identified as having significant uniform DIF. English- and Spanish-speaking subjects with the same level of pain interference responded differently to 1 of the 41 items in the PROMIS® PI item bank. This item was not retained due to proprietary issues. The original English language item parameters can be used when estimating PROMIS® PI scores.

  9. Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

    Science.gov (United States)

    Cai, Li

    2015-06-01

    Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
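
    For orientation, the classic unidimensional recursion that this abstract builds on can be sketched as follows for dichotomous items at a single fixed ability value. This is not the dimension-reduced Version 2.0 algorithm of the paper, and the function name is hypothetical.

```python
def lord_wingersky(probs):
    """Summed-score likelihoods for dichotomous items at one fixed theta.

    probs[i] is the probability of a correct response to item i.
    Returns a list whose entry s is P(summed score = s | theta).
    """
    dist = [1.0]  # with zero items, the summed score is 0 with certainty
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            new[s] += mass * (1.0 - p)  # item answered incorrectly
            new[s + 1] += mass * p      # item answered correctly
        dist = new
    return dist


# Two items with success probabilities 0.2 and 0.7 at the chosen theta.
likelihoods = lord_wingersky([0.2, 0.7])
```

    Repeating the recursion at each quadrature point and weighting by the prior gives summed-score posteriors; the exponential cost mentioned in the abstract arises because a multidimensional grid multiplies the number of such points.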

  10. Partially Observed Mixtures of IRT Models: An Extension of the Generalized Partial-Credit Model

    Science.gov (United States)

    Von Davier, Matthias; Yamamoto, Kentaro

    2004-01-01

    The generalized partial-credit model (GPCM) is used frequently in educational testing and in large-scale assessments for analyzing polytomous data. Special cases of the generalized partial-credit model are the partial-credit model--or Rasch model for ordinal data--and the two parameter logistic (2PL) model. This article extends the GPCM to the…

  11. Self-Compassion Scale: IRT Psychometric Analysis, Validation, and Factor Structure – Slovak Translation

    Directory of Open Access Journals (Sweden)

    Júlia Halamová

    2018-01-01

    Full Text Available The present study verifies the psychometric properties of the Slovak version of the Self-Compassion Scale through item response theory, factor-analysis, validity analyses and norm development. The surveyed sample consisted of 1,181 participants (34% men and 66% women) with a mean age of 30.30 years (SD = 12.40). Two general factors (Self-compassionate responding and Self-uncompassionate responding) were identified, whereas there was no support for a single general factor of the scale and six subscales. The results of the factor analysis were supported by an independent sample of 676 participants. Therefore, the use of total score for the whole scale would be inappropriate. In Slovak language the Self-Compassion Scale should be used in the form of two general subscales (Self-compassionate responding and Self-uncompassionate responding). In line with our theoretical assumptions, we obtained relatively high Spearman’s correlation coefficients between the Self-Compassion Scale and related external variables, demonstrating construct validity for the scale. To sum up, the Slovak translation of The Self-Compassion Scale is a reliable and valid instrument that measures Self-compassionate responding and Self-uncompassionate responding.

  12. An IRT Analysis of the Reading the Mind in the Eyes Test.

    Science.gov (United States)

    Black, Jessica E

    2018-04-03

    The Reading the Mind in the Eyes Test (RMET; Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001), originally designed for use in clinical populations, has been used with increasing frequency as a measure of advanced social cognition in nonclinical samples (e.g., Domes, Heinrichs, Michel, Berger, & Herpertz, 2007; Kidd & Castano, 2013; Mar, Oatley, Hirsh, de la Paz, & Peterson, 2006). The purpose of this research was to use item response theory to assess the ability of the RMET to detect differences at the high levels of theory of mind to be expected in neurotypical adults. Results indicate that the RMET is an easy test that fails to discriminate between individuals exhibiting high ability. As such, it is unlikely that it could adequately or reliably capture the expected effects of manipulations designed to boost ability in samples of neurotypical populations. Reported effects and noneffects from such manipulations might reflect noise introduced by inaccurate measurement; a more sensitive instrument is needed to verify the effects of manipulations to enhance theory of mind.

  13. 48 CFR 852.214-72 - Alternate item(s).

    Science.gov (United States)

    2010-10-01

    ... AND FORMS SOLICITATION PROVISIONS AND CONTRACT CLAUSES Texts of Provisions and Clauses 852.214-72... 2008) Bids on []* will be given equal consideration along with bids on []** and any such bids received... [].** * Contracting officer will insert an alternate item that is considered acceptable. ** Contracting officer will...

  14. Profile-likelihood Confidence Intervals in Item Response Theory Models.

    Science.gov (United States)

    Chalmers, R Philip; Pek, Jolynn; Liu, Yang

    2017-01-01

    Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.

  15. Semi-Parametric Item Response Functions in the Context of Guessing. CRESST Report 844

    Science.gov (United States)

    Falk, Carl F.; Cai, Li

    2015-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…

  16. Modelling sequentially scored item responses

    NARCIS (Netherlands)

    Akkermans, W.

    2000-01-01

    The sequential model can be used to describe the variable resulting from a sequential scoring process. In this paper two more item response models are investigated with respect to their suitability for sequential scoring: the partial credit model and the graded response model. The investigation is

  17. Reconstructing the Roman Site “Aquis Querquennis” (Bande, Spain) from GPR, T-LiDAR and IRT Data Fusion

    Directory of Open Access Journals (Sweden)

    Iván Puente

    2018-03-01

    This work presents the three-dimensional (3D) reconstruction of one of the most important archaeological sites in Galicia, “Aquis Querquennis” (Bande, Spain), using in-situ non-invasive ground-penetrating radar (GPR) and Terrestrial Light Detection and Ranging (T-LiDAR) techniques, complemented with infrared thermography. T-LiDAR is used to record the 3D surface of the site and provides high-resolution 3D digital models. GPR data processing is performed with the novel software tool “toGPRi”, developed by the authors, which allows the creation of a 3D model of the sub-surface and the subsequent XY images or time-slices at different depths. All these products are georeferenced, so that the GPR orthoimages can be combined with the orthoimages from the T-LiDAR for a complete interpretation of the site. In this way, the GPR technique allows the detection of the buried structures of the barracks, and their distribution is completed with the structure measured by the T-LiDAR on the surface. In addition, the detection of buried elements made possible the identification and labelling of the surface structures and their uses. These structures are additionally inspected with infrared thermography (IRT) to determine their conservation condition and to distinguish between original and subsequent constructions.

  18. Lord's Wald Test for Detecting DIF in Multidimensional IRT Models: A Comparison of Two Estimation Approaches

    Science.gov (United States)

    Lee, Soo; Suh, Youngsuk

    2018-01-01

    Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…

  19. The Langer-Improved Wald Test for DIF Testing with Multiple Groups: Evaluation and Comparison to Two-Group IRT

    Science.gov (United States)

    Woods, Carol M.; Cai, Li; Wang, Mian

    2013-01-01

    Differential item functioning (DIF) occurs when the probability of responding in a particular category to an item differs for members of different groups who are matched on the construct being measured. The identification of DIF is important for valid measurement. This research evaluates an improved version of Lord's χ²…

  20. Analyzing force concept inventory with item response theory

    Science.gov (United States)

    Wang, Jing; Bao, Lei

    2010-10-01

    Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.
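    The three item parameters named in the abstract (difficulty, discrimination, and probability of a correct guess) match those of the three-parameter logistic (3PL) model. The sketch below assumes that model; the parameter values are hypothetical and are not estimates from the FCI study.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty,
    c: lower asymptote (chance of a correct guess)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical item: moderately discriminating, slightly hard,
    # five answer options (guessing floor of 0.2).
    a, b, c = 1.2, 0.5, 0.2
    for theta in (-3.0, 0.0, 0.5, 3.0):
        print(theta, round(p_correct_3pl(theta, a, b, c), 3))
```

    At theta equal to the difficulty b, the probability is halfway between the guessing floor c and 1, which is why very low-ability examinees still score above zero on multiple-choice items.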

  1. Assessment of psychology students' attitudes through the partial credit model of IRT

    Directory of Open Access Journals (Sweden)

    Claudette Maria Medeiros Vendramini

    2009-12-01

    The aim of this study was to assess Psychology students' attitudes toward statistics using the partial credit model of IRT, and to examine their relationships with self-perception and performance in statistics. A non-random sample of 361 Psychology students, aged 18 to 65 years, 81% women and 53% enrolled in evening classes, completed an identification questionnaire and an attitude scale. The scale is a four-point Likert-type instrument composed of 20 items expressing feelings toward statistics, ten positive and ten negative, plus one complementary item that assesses students' self-perception of their own performance in statistics. The scale was found to be reliable and valid for measuring attitudes. Participants showed attitudes that were slightly more negative than positive. Positive and significant correlations were found among attitude, academic performance, and self-perceived performance.

  2. Understanding and quantifying cognitive complexity level in mathematical problem solving items

    Directory of Open Access Journals (Sweden)

    Susan E. Embretson

    2008-09-01

    The linear logistic test model (LLTM; Fischer, 1973) has been applied to a wide variety of new tests. When the LLTM application involves item complexity variables that are both theoretically interesting and empirically supported, several advantages can result. These advantages include elaborating construct validity at the item level, defining variables for test design, predicting parameters of new items, item banking by sources of complexity, and providing a basis for item design and item generation. However, despite the many advantages of applying LLTM to test items, it has been applied less often to understand the sources of complexity for large-scale operational test items. Instead, previously calibrated item parameters are modeled using regression techniques because raw item response data often cannot be made available. In the current study, both LLTM and regression modeling are applied to mathematical problem solving items from a widely used test. The findings from the two methods are compared and contrasted for their implications for continued development of ability and achievement tests based on mathematical problem solving items.

  3. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a

  4. Generalizability theory and item response theory

    OpenAIRE

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...

  5. Item response theory and factor analysis as a mean to characterize occurrence of response shift in a longitudinal quality of life study in breast cancer patients

    Science.gov (United States)

    2014-01-01

    Background: The occurrence of response shift (RS) in longitudinal health-related quality of life (HRQoL) studies, reflecting patient adaptation to disease, has already been demonstrated. Several methods have been developed to detect the three different types of RS, i.e. 1) recalibration RS, 2) reprioritization RS, and 3) reconceptualization RS. We investigated two complementary methods that characterize the occurrence of RS: factor analysis, comprising Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), and a method of Item Response Theory (IRT). Methods: Breast cancer patients (n = 381) completed the EORTC QLQ-C30 and EORTC QLQ-BR23 questionnaires at baseline, immediately following surgery, and three and six months after surgery, according to the “then-test/post-test” design. Recalibration was explored using MCA and an IRT model, the Linear Logistic Model with Relaxed Assumptions (LLRA), using the then-test method. PCA was used to explore reconceptualization and reprioritization. Results: MCA highlighted the main profiles of recalibration: patients with a high HRQoL level report a slightly worse HRQoL level retrospectively, and vice versa. The LLRA model indicated a downward or upward recalibration for each dimension. At six months, the recalibration effect was statistically significant for 11/22 dimensions of the QLQ-C30 and BR23 according to the LLRA model (p ≤ 0.001). Regarding the QLQ-C30, PCA indicated a reprioritization of symptom scales and reconceptualization via an increased correlation between functional scales. Conclusions: Our findings demonstrate the usefulness of these analyses in characterizing the occurrence of RS. MCA and the IRT model gave results convergent with the then-test method in characterizing the recalibration component of RS. PCA is an indirect method for investigating the reprioritization and reconceptualization components of RS. PMID:24606836

  6. International Review Team (IRT) Safety Case Recommendations for the Yucca Mountain Total System Performance Assessment (TSPA) Supporting the Site Recommendation

    International Nuclear Information System (INIS)

    Van Luik, Abraham E.

    2004-01-01

    The session started with Abe Van Luik (IGSC Chair, US-DOE-YM, USA), who presented the feedback of the international peer review of the US-DOE Yucca Mountain TSPA (Total System Performance Assessment) supporting the successful designation of the site by the Congress and the President of the U.S. In particular, he listed key implications of the IRT (International Review Team) recommendations for the forthcoming US-DOE documentation of its case for safety to be submitted to the regulator, the U.S. Nuclear Regulatory Commission, mainly:
    - The documentation submitted to the licensing authority should address technical aspects and compliance with regulatory criteria.
    - That documentation should reflect sound science and good engineering practice; it should present detailed and rigorous modelling.
    - In addition, it should present both quantitative and qualitative arguments, make a statement on why there can be confidence in the face of uncertainty, acknowledge remaining issues and provide the strategy to resolve them.
    - Demonstrating understanding is as important as demonstrating compliance.
    - There is a need to provide a clear explanation of the case made to the regulator for more general audiences to complement the large amount of technical documents that will be produced.
    The US-DOE response to these recommendations for the License Application, which is under preparation, is that the recommendations will be implemented to the maximum extent possible. In subsequent discussion, with respect to the License Application, it was acknowledged that detailed guidance from the U.S. regulator was very useful, and guidance of this type would be generally useful. At the current time, the words 'safety case' are not mentioned in U.S. regulations, but if one reads both the regulation and guidance documents it becomes evident that all aspects of a safety case need to be provided in the License Application and its accompanying documents.

  7. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.

  8. Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory

    Directory of Open Access Journals (Sweden)

    Fajrianthi

    2017-11-01

    Fajrianthi (Department of Industrial and Organizational Psychology) and Rizqy Amelia Zein (Department of Personality and Social Psychology), Faculty of Psychology, Universitas Airlangga, Surabaya, East Java, Indonesia. Abstract: This study aimed to develop an emotional intelligence (EI) test that is suitable to the Indonesian workplace context. The Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: 1) emotional appraisal, 2) emotional recognition, and 3) emotional regulation. TKEA consisted of 120 items, with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that the test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 (ability level = -2) for subset 2, and 2.398 (ability level = -2) for subset 3. It is concluded that TKEA performs very well in measuring individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA's item analysis and dimensionality test of each TKEA subset. Keywords: categorical confirmatory factor analysis, emotional intelligence, item response theory
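    The idea of a test information function that peaks at a low ability level can be sketched as follows. The abstract does not state which IRT model was fitted, so this sketch assumes a two-parameter logistic (2PL) model; the item parameters are hypothetical and are not TKEA estimates.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing/answering correctly under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of one 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def scale_information(theta, items):
    """Test information function: sum of the item informations at theta."""
    return sum(item_information(theta, a, b) for a, b in items)


def standard_error(information):
    """Asymptotic standard error of the ability estimate: 1 / sqrt(information)."""
    return 1.0 / math.sqrt(information)


# Hypothetical (discrimination, difficulty) pairs for three easy items.
items = [(1.0, -2.5), (1.5, -2.0), (1.2, -1.5)]

# Locate the ability level where the information is largest on a coarse grid.
grid = [x / 10.0 for x in range(-40, 41)]
peak = max(grid, key=lambda t: scale_information(t, items))

if __name__ == "__main__":
    print("peak ability:", peak)
    print("information at peak:", round(scale_information(peak, items), 3))
    print("information at theta = 0:", round(scale_information(0.0, items), 3))
```

    Because every hypothetical item is easy, the information peaks near an ability of -2 and is much lower at 0, which is the pattern behind the conclusion that a scale measures low-ability respondents best. Under the usual asymptotic relation, a test information of 12.183 would correspond to a standard error of about 0.29 on the ability scale.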

  9. Emergency Power For Critical Items

    Science.gov (United States)

    Young, William R.

    2009-07-01

    Natural disasters, such as hurricanes, floods, tornados, and tsunamis, are becoming a greater problem as climate change impacts our environment. Disasters, whether natural or man-made, destroy lives, homes, businesses and the natural environment. Such disasters can happen with little or no warning, leaving hundreds or even thousands of people without medical services, potable water, sanitation, communications and electrical services for up to several weeks. In our modern world, electricity has become a necessity. Modern building codes and new disaster-resistant building practices are reducing the damage to homes and businesses. Emergency gasoline and diesel generators are becoming commonplace for power outages. Generators need fuel, which may not be available after a disaster, but photovoltaic (solar-electric) systems supply electricity without petroleum fuel because they are powered by the sun. Photovoltaic (PV) systems can provide electrical power for a home or business, and can operate as utility-interactive systems or as stand-alone systems with battery backup. Determining your critical load items and sizing the photovoltaic system for those critical items guarantees their operation in a disaster.

  10. Development of six PROMIS pediatrics proxy-report item banks.

    Science.gov (United States)

    Irwin, Debra E; Gross, Heather E; Stucky, Brian D; Thissen, David; DeWitt, Esi Morgan; Lai, Jin Shei; Amtmann, Dagmar; Khastou, Leyla; Varni, James W; DeWalt, Darren A

    2012-02-22

    Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process, including the proxy cognitive interviews, the large-field-test survey methods, and the sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children aged 5 to 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy-report survey (n = 432). In addition, caregivers completed other proxy instruments: PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Item content was well understood by proxies and did not require item revisions, but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily asthma, which was diagnosed or treated within 6

  11. Development of six PROMIS pediatrics proxy-report item banks

    Directory of Open Access Journals (Sweden)

    Irwin Debra E

    2012-02-01

    Background: Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process, including the proxy cognitive interviews, the large-field-test survey methods, and the sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. Methods: The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children aged 5 to 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy-report survey (n = 432). In addition, caregivers completed other proxy instruments: PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Results: Item content was well understood by proxies and did not require item revisions, but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily

  12. Piecewise Polynomial Fitting with Trend Item Removal and Its Application in a Cab Vibration Test

    Directory of Open Access Journals (Sweden)

    Wu Ren

    2018-01-01

    The trend item of a long-term vibration signal is difficult to remove. This paper proposes a piecewise integration method to remove trend items. Examples of direct integration without trend item removal, global integration after piecewise polynomial fitting with trend item removal, and direct integration after piecewise polynomial fitting with trend item removal were simulated. The results showed that direct integration of the fitted piecewise polynomial provided greater acceleration and displacement precision than the other two integration methods. A vibration test was then performed on a special equipment cab. The results indicated that direct integration by piecewise polynomial fitting with trend item removal was highly consistent with the measured signal data. However, the direct integration method without trend item removal resulted in signal distortion. The proposed method can help with frequency domain analysis of vibration signals and modal parameter identification for such equipment.
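    The trend-removal step alone can be sketched as follows. This is not the authors' integration method: it only shows piecewise least-squares detrending, and for brevity it fits a degree-1 polynomial (a line) in each segment. The signal, sampling rate, drift, and segment count are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Closed-form least-squares line y = m * x + q through the points."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, my - m * mx


def detrend_piecewise(t, y, n_segments):
    """Subtract a separately fitted line from each of n_segments contiguous chunks."""
    out = []
    size = len(t) // n_segments
    for s in range(n_segments):
        start = s * size
        # The last segment absorbs any leftover samples.
        stop = len(t) if s == n_segments - 1 else start + size
        ts, ys = t[start:stop], y[start:stop]
        m, q = fit_line(ts, ys)
        out.extend(yi - (m * ti + q) for ti, yi in zip(ts, ys))
    return out


# Hypothetical signal: 10 s sampled at 100 Hz, a 20 Hz vibration plus a slow drift.
t = [i * 0.01 for i in range(1000)]
vibration = [math.sin(2.0 * math.pi * 20.0 * ti) for ti in t]
drift = [0.5 * ti * ti for ti in t]
y = [v + d for v, d in zip(vibration, drift)]

cleaned = detrend_piecewise(t, y, n_segments=10)
max_error = max(abs(c - v) for c, v in zip(cleaned, vibration))

if __name__ == "__main__":
    print("largest raw value:", round(max(abs(v) for v in y), 2))
    print("largest error after detrending:", round(max_error, 3))
```

    A single global line could not follow the curved drift, whereas short segments approximate it closely. The trade-off is that very short segments start to absorb part of the vibration itself, so the segment length should span several oscillation periods.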

  13. Approaching the Type-II Dirac Point and Concomitant Superconductivity in Pt-doping Stabilized Metastable 1T-phase IrTe2

    OpenAIRE

    Fei, Fucong; Bo, Xiangyan; Wang, Pengdong; Ying, Jianghua; Chen, Bo; Liu, Qianqian; Zhang, Yong; Sun, Zhe; Qu, Fanming; Zhang, Yi; Li, Jian; Song, Fengqi; Wan, Xiangang; Wang, Baigeng; Wang, Guanghou

    2017-01-01

    Topological semimetals are a topic of general interest in materials science. Recently, a new kind of topological semimetal, the type-II Dirac semimetal with tilted Dirac cones, was discovered in the PtSe2 family. However, further investigation is hindered by the large energy difference between the Dirac points and the Fermi level and by the irrelevant conducting pockets at the Fermi surface. Here we characterize the optimized type-II Dirac dispersions in a metastable 1T phase of IrTe2. Our strategy of Pt doping...

  14. Plasma column displacement measurements by modified Rogowski sine-coil and Biot-Savart/magnetic flux equation solution on IR-T1 tokamak

    International Nuclear Information System (INIS)

    Razavi, M.; Mollai, M.; Khorshid, P.; Nedzelskiy, I.; Ghoranneviss, M.

    2010-01-01

    The modified Rogowski sine-coil (MRSC) has been designed and implemented for plasma column horizontal displacement measurements on the small IR-T1 tokamak. MRSC operation has been examined on a test assembly and on the tokamak. The results show high sensitivity to the horizontal displacement of the plasma column and negligible sensitivity to the vertical displacement; linearity over a wide range of displacements (±0.1 m); and excellent agreement (1.5%) with the results of the numerical solution of the Biot-Savart and magnetic flux equations.

  15. Possible use of dual purpose dry storage casks for transportation and future storage of spent nuclear fuel from IRT-Sofia

    International Nuclear Information System (INIS)

    Manev, L.; Baltiyski, M.

    2003-01-01

    Objectives: The main objective of the present paper relates to one of the priority goals stipulated in Bulgarian Governmental Decision No. 332 of May 17, 1999: removal of SNF from the IRT-Sofia site and its export for reprocessing and/or temporary storage at the Kozloduy NPP site. The variant of using dual purpose dry storage casks for transportation and future temporary storage of SNF from IRT-Sofia aims to provide a reasonable alternative to the existing variant, in which SNF is stored temporarily under water in the existing Kozloduy NPP Spent Fuel Storage Facility until its export for reprocessing. Results: Based on the given data on the condition of the 73 Spent Nuclear Fuel Assemblies (SNFA) stored in the storage pool, technical data, and data on the available equipment and the IRT-Sofia layout, the following are specified: draft technical features of dual purpose dry storage casks and their overall dimensions; the suitability of the available equipment for safe and reliable transportation and handling of assemblies from the storage pool to the dual purpose dry storage casks; and the need for new equipment to perform these operations. The transportation and handling operations for the assemblies are described, and the requirements and conditions for future safe and reliable storage of SNFA-loaded casks are determined. The effective Bulgarian regulations were considered when selecting the technical solutions for safety assurance during site handling operations at IRT-Sofia and when describing the exemplary casks. The experience of other countries in the transfer and transportation of SNFA from research reactors of this type is taken into account, as is Kozloduy NPP experience in SNF handling operations. Conclusions: The Decision of the Council of Ministers for refurbishment of the research reactor into a low power one and its future utilization for experimental and training

  16. The Experience of Storage and Shipment for Reprocessing of HEU Nuclear Fuel Irradiated in the IRT-M Research Reactor and Pamir-630 Mobile Reactor

    Energy Technology Data Exchange (ETDEWEB)

    Sikorin, S. N.; Polazau, S. A.; Luneu, A. N.; Hrigarovich, T. K. [Joint Institute for Power and Nuclear Research–Sosny of the National Academy of Sciences of Belarus, Minsk (Belarus)

    2014-08-15

    At the end of 2010 under the Global Threat Reduction Initiative (GTRI), the Joint Institute for Power and Nuclear Research–“Sosny” (JIPNR–Sosny) of the National Academy of Sciences of the Republic of Belarus repatriated HEU spent nuclear fuel to the Russian Federation. The spent nuclear fuel was from the decommissioned Pamir-630D mobile reactor and IRT-M research reactor. The paper discusses the Pamir-630D spent nuclear fuel; experience and problems of spent nuclear fuel storage; and various aspects of the shipment including legal framework, preparation activities and shipment logistics. The conceptual project of a new research reactor for Belarus is also presented.

  17. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
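    The third stage described above, generating items from an item model by systematically combining content, can be sketched as follows. The item model, its content values, and the distractor rules are hypothetical and deliberately non-medical; they stand in for the cognitive and item models that content specialists would supply in the first two stages.

```python
from itertools import product

# Hypothetical item model: a stem with two variable elements.
STEM = (
    "A vehicle travels {distance} km in {hours} hours. "
    "What is its average speed in km/h?"
)


def generate_items(distances, hours_options):
    """Create one multiple-choice item per combination of the variable elements."""
    items = []
    for distance, hours in product(distances, hours_options):
        key = distance / hours  # correct answer
        # Distractors model common errors: multiplying, adding, or subtracting.
        distractors = {distance * hours, distance + hours, distance - hours}
        distractors.discard(key)
        items.append(
            {
                "stem": STEM.format(distance=distance, hours=hours),
                "options": sorted(distractors | {key}),
                "key": key,
            }
        )
    return items


if __name__ == "__main__":
    bank = generate_items([120, 180, 240], [2, 3, 4])
    print(len(bank), "items generated")
    print(bank[0]["stem"], bank[0]["options"], "key:", bank[0]["key"])
```

    Three values for each of the two variable elements yield nine items, and the count grows multiplicatively with every added element or value. That multiplicative growth is how a single item model can produce a very large number of items.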

  18. A Balance Sheet for Educational Item Banking.

    Science.gov (United States)

    Hiscox, Michael D.

    Educational item banking presents observers with a considerable paradox. The development of test items from scratch is viewed as wasteful, a luxury in times of declining resources. On the other hand, item banking has failed to become a mature technology despite large amounts of money and the efforts of talented professionals. The question of which…

  19. 76 FR 60474 - Commercial Item Handbook

    Science.gov (United States)

    2011-09-29

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System Commercial Item Handbook AGENCY.... SUMMARY: DoD has updated its Commercial Item Handbook. The purpose of the Handbook is to help acquisition personnel develop sound business strategies for procuring commercial items. DoD is seeking industry input on...

  20. Towards an authoring system for item construction

    NARCIS (Netherlands)

    Rikers, Jos H.A.N.

    1988-01-01

    The process of writing test items is analyzed, and a blueprint is presented for an authoring system for test item writing to reduce invalidity and to structure the process of item writing. The developmental methodology is introduced, and the first steps in the process are reported. A historical

  1. Obtaining a Proportional Allocation by Deleting Items

    NARCIS (Netherlands)

    Dorn, B.; de Haan, R.; Schlotter, I.; Röthe, J.

    2017-01-01

    We consider the following control problem on fair allocation of indivisible goods. Given a set I of items and a set of agents, each having strict linear preference over the items, we ask for a minimum subset of the items whose deletion guarantees the existence of a proportional allocation in the

  2. Item Analysis in Introductory Economics Testing.

    Science.gov (United States)

    Tinari, Frank D.

    1979-01-01

    Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)

  3. New technologies for item monitoring

    International Nuclear Information System (INIS)

    Abbott, J.A.; Waddoups, I.G.

    1993-12-01

    This report responds to the Department of Energy's request that Sandia National Laboratories compare existing technologies against several advanced technologies as they apply to DOE needs to monitor the movement of material, weapons, or personnel for safety and security programs. The authors describe several material control systems, discuss their technologies, suggest possible applications, discuss assets and limitations, and project costs for each system. The following systems are described: WATCH system (Wireless Alarm Transmission of Container Handling); Tag system (an electrostatic proximity sensor); PANTRAK system (Personnel And Material Tracking); VRIS (Vault Remote Inventory System); VSIS (Vault Safety and Inventory System); AIMS (Authenticated Item Monitoring System); EIVS (Experimental Inventory Verification System); Metrox system (canister monitoring system); TCATS (Target Cueing And Tracking System); LGVSS (Light Grid Vault Surveillance System); CSS (Container Safeguards System); SAMMS (Security Alarm and Material Monitoring System); FOIDS (Fiber Optic Intelligence & Detection System); GRADS (Graded Radiation Detection System); and PINPAL (Physical Inventory Pallet).

  4. New technologies for item monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, J.A. [EG & G Energy Measurements, Albuquerque, NM (United States); Waddoups, I.G. [Sandia National Labs., Albuquerque, NM (United States)

    1993-12-01

    This report responds to the Department of Energy's request that Sandia National Laboratories compare existing technologies against several advanced technologies as they apply to DOE needs to monitor the movement of material, weapons, or personnel for safety and security programs. The authors describe several material control systems, discuss their technologies, suggest possible applications, discuss assets and limitations, and project costs for each system. The following systems are described: WATCH system (Wireless Alarm Transmission of Container Handling); Tag system (an electrostatic proximity sensor); PANTRAK system (Personnel And Material Tracking); VRIS (Vault Remote Inventory System); VSIS (Vault Safety and Inventory System); AIMS (Authenticated Item Monitoring System); EIVS (Experimental Inventory Verification System); Metrox system (canister monitoring system); TCATS (Target Cueing And Tracking System); LGVSS (Light Grid Vault Surveillance System); CSS (Container Safeguards System); SAMMS (Security Alarm and Material Monitoring System); FOIDS (Fiber Optic Intelligence & Detection System); GRADS (Graded Radiation Detection System); and PINPAL (Physical Inventory Pallet).

  5. Approximation Preserving Reductions among Item Pricing Problems

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

    When a store sells items to customers, the store wishes to determine the prices of the items to maximize its profit. Intuitively, if the store sells the items with low (resp. high) prices, the customers buy more (resp. fewer) items, which provides less profit to the store. So it would be hard for the store to decide the prices of items. Assume that the store has a set V of n items and there is a set E of m customers who wish to buy those items, and also assume that each item i ∈ V has the production cost d_i and each customer e_j ∈ E has the valuation v_j on the bundle e_j ⊆ V of items. When the store sells an item i ∈ V at the price r_i, the profit for the item i is p_i = r_i - d_i. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. In most of the previous works, the item pricing problem was considered under the assumption that p_i ≥ 0 for each i ∈ V; however, Balcan et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of “loss-leader” and showed that the seller can get more total profit in the case that p_i < 0 is allowed than in the case that p_i < 0 is not allowed. In this paper, we derive approximation preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratio.
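
    The profit definition above can be made concrete with a small sketch. It rests on an assumption the abstract does not state: each customer buys the whole bundle if and only if its total price does not exceed the customer's valuation. The item names, costs, valuations, and the function name total_profit are hypothetical.

```python
def total_profit(prices, costs, customers):
    """Total profit of a price vector.

    prices[i] is the price r_i and costs[i] is the production cost d_i,
    so the per-item profit is r_i - d_i. Each customer is a pair
    (bundle, valuation).
    Assumption (not stated in the abstract): a customer buys the whole
    bundle if and only if its total price is at most the valuation.
    """
    profit = 0
    for bundle, valuation in customers:
        if sum(prices[i] for i in bundle) <= valuation:
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit

# Hypothetical toy instance with two items and two customers.
costs = {"a": 1, "b": 3}
customers = [({"a"}, 5), ({"a", "b"}, 6)]

no_loss = {"a": 5, "b": 3}      # every item sold at or above cost
loss_leader = {"a": 5, "b": 1}  # item b sold below cost

print(total_profit(no_loss, costs, customers))
print(total_profit(loss_leader, costs, customers))
```

    In this toy instance the second price vector sells item b below cost and still earns more in total than the first, because the cheaper bundle brings in a customer who also buys the profitable item a.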

  6. Modeling Answer Change Behavior: An Application of a Generalized Item Response Tree Model

    Science.gov (United States)

    Jeon, Minjeong; De Boeck, Paul; van der Linden, Wim

    2017-01-01

    We present a novel application of a generalized item response tree model to investigate test takers' answer change behavior. The model allows us to simultaneously model the observed patterns of the initial and final responses after an answer change as a function of a set of latent traits and item parameters. The proposed application is illustrated…

  7. Item Modeling Concept Based on Multimedia Authoring

    Directory of Open Access Journals (Sweden)

    Janez Stergar

    2008-09-01

    Full Text Available In this paper a modern item design framework for computer based assessment based on Flash authoring environment will be introduced. Question design will be discussed as well as the multimedia authoring environment used for item modeling emphasized. Item type templates are a structured means of collecting and storing item information that can be used to improve the efficiency and security of the innovative item design process. Templates can modernize the item design, enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The introduced item design template is based on taxonomy of innovative items which have great potential for expanding the content areas and construct coverage of an assessment. The presented item design approach is based on GUI's – one for question design based on implemented item design templates and one for user interaction tracking/retrieval. The concept of user interfaces based on Flash technology will be discussed as well as implementation of the innovative approach of the item design forms with multimedia authoring. Also an innovative method for user interaction storage/retrieval based on PHP extending Flash capabilities in the proposed framework will be introduced.

  8. Losing Items in the Psychogeriatric Nursing Home

    Directory of Open Access Journals (Sweden)

    J. van Hoof PhD

    2016-09-01

    Full Text Available Introduction: Losing items is a time-consuming occurrence in nursing homes that is ill described. An explorative study was conducted to investigate which items got lost by nursing home residents, and how this affects the residents and family caregivers. Method: Semi-structured interviews and card sorting tasks were conducted with 12 residents with early-stage dementia and 12 family caregivers. Thematic analysis was applied to the outcomes of the sessions. Results: The participants stated that numerous personal items and assistive devices get lost in the nursing home environment, which had various emotional, practical, and financial implications. Significant amounts of time are spent on trying to find items, varying from 1 hr up to a couple of weeks. Numerous potential solutions were identified by the interviewees. Discussion: Losing items often goes together with limitations to the participation of residents. Many family caregivers are reluctant to replace lost items, as these items may get lost again.

  9. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    Science.gov (United States)

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  10. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  11. Development of a psychological test to measure ability-based emotional intelligence in the Indonesian workplace using an item response theory.

    Science.gov (United States)

    Fajrianthi; Zein, Rizqy Amelia

    2017-01-01

    This study aimed to develop an emotional intelligence (EI) test that is suitable to the Indonesian workplace context. Airlangga Emotional Intelligence Test (Tes Kecerdasan Emosi Airlangga [TKEA]) was designed to measure three EI domains: 1) emotional appraisal, 2) emotional recognition, and 3) emotional regulation. TKEA consisted of 120 items with 40 items for each subset. TKEA was developed based on the Situational Judgment Test (SJT) approach. To ensure its psychometric qualities, categorical confirmatory factor analysis (CCFA) and item response theory (IRT) were applied to test its validity and reliability. The study was conducted on 752 participants, and the results showed that test information function (TIF) was 3.414 (ability level = 0) for subset 1, 12.183 for subset 2 (ability level = -2), and 2.398 for subset 3 (level of ability = -2). It is concluded that TKEA performs very well to measure individuals with a low level of EI ability. It is worth noting that TKEA is currently at the development stage; therefore, in this study, we investigated TKEA's item analysis and dimensionality test of each TKEA subset.

  12. Pricing policy for declining demand using item preservation technology.

    Science.gov (United States)

    Khedlekar, Uttam Kumar; Shukla, Diwakar; Namdeo, Anubhav

    2016-01-01

    We have designed an inventory model for seasonal products in which deterioration can be controlled by item preservation technology investment. Demand for the product is considered price sensitive and decreases linearly. This study has shown that the profit is a concave function of optimal selling price, replenishment time and preservation cost parameter. We simultaneously determined the optimal selling price of the product, the replenishment cycle and the cost of item preservation technology. Additionally, this study has shown that there exists an optimal selling price and optimal preservation investment to maximize the profit for every business set-up. Finally, the model is illustrated by numerical examples and sensitivity analysis of the optimal solution with respect to major parameters.

  13. Optimal pricing and marketing planning for deteriorating items.

    Directory of Open Access Journals (Sweden)

    Seyed Reza Moosavi Tabatabaei

    Full Text Available Optimal pricing and marketing planning plays an essential role in production decisions on deteriorating items. This paper presents a mathematical model for a three-level supply chain, which includes one producer, one distributor and one retailer. The proposed study considers the production of a deteriorating item where demand is influenced by price, marketing expenditure, quality of product and after-sales service expenditures. The proposed model is formulated as a geometric programming with 5 degrees of difficulty and the problem is solved using the recent advances in optimization techniques. The study is supported by several numerical examples and sensitivity analysis is performed to analyze the effects of the changes in different parameters on the optimal solution. The preliminary results indicate that with the change in parameters influencing on demand, inventory holding, inventory deteriorating and set-up costs change and also significantly affect total revenue.

  14. Optimal pricing and marketing planning for deteriorating items

    Science.gov (United States)

    Moosavi Tabatabaei, Seyed Reza; Sadjadi, Seyed Jafar; Makui, Ahmad

    2017-01-01

    Optimal pricing and marketing planning plays an essential role in production decisions on deteriorating items. This paper presents a mathematical model for a three-level supply chain, which includes one producer, one distributor and one retailer. The proposed study considers the production of a deteriorating item where demand is influenced by price, marketing expenditure, quality of product and after-sales service expenditures. The proposed model is formulated as a geometric programming with 5 degrees of difficulty and the problem is solved using the recent advances in optimization techniques. The study is supported by several numerical examples and sensitivity analysis is performed to analyze the effects of the changes in different parameters on the optimal solution. The preliminary results indicate that with the change in parameters influencing on demand, inventory holding, inventory deteriorating and set-up costs change and also significantly affect total revenue. PMID:28306750

  15. Optimal pricing and marketing planning for deteriorating items.

    Science.gov (United States)

    Moosavi Tabatabaei, Seyed Reza; Sadjadi, Seyed Jafar; Makui, Ahmad

    2017-01-01

    Optimal pricing and marketing planning plays an essential role in production decisions on deteriorating items. This paper presents a mathematical model for a three-level supply chain, which includes one producer, one distributor and one retailer. The proposed study considers the production of a deteriorating item where demand is influenced by price, marketing expenditure, quality of product and after-sales service expenditures. The proposed model is formulated as a geometric programming with 5 degrees of difficulty and the problem is solved using the recent advances in optimization techniques. The study is supported by several numerical examples and sensitivity analysis is performed to analyze the effects of the changes in different parameters on the optimal solution. The preliminary results indicate that with the change in parameters influencing on demand, inventory holding, inventory deteriorating and set-up costs change and also significantly affect total revenue.

  16. Why Japanese workers show low work engagement: An item response theory analysis of the Utrecht Work Engagement scale.

    Science.gov (United States)

    Shimazu, Akihito; Schaufeli, Wilmar B; Miyanaka, Daisuke; Iwata, Noboru

    2010-11-05

    With the globalization of occupational health psychology, more and more researchers are interested in applying employee well-being like work engagement (i.e., a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption) to diverse populations. Accurate measurement contributes to our further understanding and to the generalizability of the concept of work engagement across different cultures. The present study investigated the measurement accuracy of the Japanese and the original Dutch versions of the Utrecht Work Engagement Scale (9-item version, UWES-9) and the comparability of this scale between both countries. Item Response Theory (IRT) was applied to the data from Japan (N = 2,339) and the Netherlands (N = 13,406). Reliability of the scale was evaluated at various levels of the latent trait (i.e., work engagement) based on the test information function (TIF) and the standard error of measurement (SEM). The Japanese version had difficulty in differentiating respondents with extremely low work engagement, whereas the original Dutch version had difficulty in differentiating respondents with high work engagement. The measurement accuracy of both versions was not similar. Suppression of positive affect among Japanese people and self-enhancement (the general sensitivity to positive self-relevant information) among Dutch people may have caused decreased measurement accuracy. Hence, we should be cautious when interpreting low engagement scores among Japanese as well as high engagement scores among western employees.
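
    The relationship between the test information function and the standard error of measurement used above can be sketched numerically. The sketch assumes a two-parameter logistic (2PL) model for binary items purely for illustration; the study itself analyzed a rating scale and its model is not specified here. The item parameters and the function names p_2pl, tif, and sem are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a keyed response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def tif(theta, items):
    """Test information: sum of item informations a^2 * P * (1 - P)."""
    total = 0.0
    for a, b in items:
        p = p_2pl(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total

def sem(theta, items):
    """Standard error of measurement: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(tif(theta, items))

# Hypothetical (discrimination a, difficulty b) pairs.
ITEMS = [(1.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

for theta in (-3.0, 0.0, 3.0):
    print(theta, round(tif(theta, ITEMS), 3), round(sem(theta, ITEMS), 3))
```

    Information is highest where item difficulties are concentrated and falls off toward the extremes of the trait, so the standard error grows there. That pattern is what it means for a scale to have difficulty differentiating respondents at very low or very high trait levels.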

  17. Why Japanese workers show low work engagement: An item response theory analysis of the Utrecht Work Engagement scale

    Directory of Open Access Journals (Sweden)

    Iwata Noboru

    2010-11-01

    Full Text Available With the globalization of occupational health psychology, more and more researchers are interested in applying employee well-being like work engagement (i.e., a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption) to diverse populations. Accurate measurement contributes to our further understanding and to the generalizability of the concept of work engagement across different cultures. The present study investigated the measurement accuracy of the Japanese and the original Dutch versions of the Utrecht Work Engagement Scale (9-item version, UWES-9) and the comparability of this scale between both countries. Item Response Theory (IRT) was applied to the data from Japan (N = 2,339) and the Netherlands (N = 13,406). Reliability of the scale was evaluated at various levels of the latent trait (i.e., work engagement) based on the test information function (TIF) and the standard error of measurement (SEM). The Japanese version had difficulty in differentiating respondents with extremely low work engagement, whereas the original Dutch version had difficulty in differentiating respondents with high work engagement. The measurement accuracy of both versions was not similar. Suppression of positive affect among Japanese people and self-enhancement (the general sensitivity to positive self-relevant information) among Dutch people may have caused decreased measurement accuracy. Hence, we should be cautious when interpreting low engagement scores among Japanese as well as high engagement scores among western employees.

  18. Bayes Factor Covariance Testing in Item Response Models.

    Science.gov (United States)

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-12-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted-inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Based on that, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.

  19. Optimal ordering quantities for substitutable deteriorating items under joint replenishment with cost of substitution

    Science.gov (United States)

    Mishra, Vinod Kumar

    2017-09-01

    In this paper we develop an inventory model to determine the optimal ordering quantities for a set of two substitutable deteriorating items. In this model the inventory level of both items is depleted by demand and deterioration; when an item is out of stock, its demand is partially fulfilled by the other item and all unsatisfied demand is lost. Each substituted item incurs a cost of substitution, and demand and deterioration are considered to be deterministic and constant. Items are ordered jointly in each ordering cycle to take advantage of joint replenishment. The problem is formulated and a solution procedure is developed to determine the optimal ordering quantities that minimize the total inventory cost. We provide an extensive numerical and sensitivity analysis to illustrate the effect of different parameters on the model. The key observation from the numerical analysis is that there is a substantial improvement in the optimal total cost of the inventory model with substitution over the model without substitution.

  20. CERN Running Club – Sale of Items

    CERN Multimedia

    CERN Running club

    2018-01-01

    The CERN Running Club is organising a sale of items on 26 June from 11:30 – 13:00 in the entry area of Restaurant 2 (504 R-202). The items for sale are souvenir prizes of past Relay Races and comprise: Backpacks, thermos, towels, gloves & caps, lamps, long sleeve winter shirts and windproof vest. All items will be sold at 5 CHF.

  1. Validation of the Ten-Item Internet Gaming Disorder Test (IGDT-10) and evaluation of the nine DSM-5 Internet Gaming Disorder criteria.

    Science.gov (United States)

    Király, Orsolya; Sleczka, Pawel; Pontes, Halley M; Urbán, Róbert; Griffiths, Mark D; Demetrovics, Zsolt

    2017-01-01

    The inclusion of Internet Gaming Disorder (IGD) in the DSM-5 (Section 3) has given rise to much scholarly debate regarding the proposed criteria and their operationalization. The present study's aim was threefold: to (i) develop and validate a brief psychometric instrument (Ten-Item Internet Gaming Disorder Test; IGDT-10) to assess IGD using definitions suggested in DSM-5, (ii) contribute to ongoing debate regarding the usefulness and validity of each of the nine IGD criteria (using Item Response Theory [IRT]), and (iii) investigate the cut-off threshold suggested in the DSM-5. An online gamer sample of 4887 gamers (age range 14-64 years, mean age 22.2 years [SD=6.4], 92.5% male) was collected through Facebook and a gaming-related website with the cooperation of a popular Hungarian gaming magazine. A shopping voucher of approx. 300 Euros was drawn between participants to boost participation (i.e., lottery incentive). Confirmatory factor analysis and a structural regression model were used to test the psychometric properties of the IGDT-10 and IRT analysis was conducted to test the measurement performance of the nine IGD criteria. Finally, Latent Class Analysis along with sensitivity and specificity analysis were used to investigate the cut-off threshold proposed in the DSM-5. Analysis supported IGDT-10's validity, reliability, and suitability to be used in future research. Findings of the IRT analysis suggest IGD is manifested through a different set of symptoms depending on the level of severity of the disorder. More specifically, "continuation", "preoccupation", "negative consequences" and "escape" were associated with lower severity of IGD, while "tolerance", "loss of control", "giving up other activities" and "deception" criteria were associated with more severe levels. "Preoccupation" and "escape" provided very little information to the estimation of IGD severity. Finally, the DSM-5 suggested threshold appeared to be supported by our statistical analyses. IGDT-10 is

  2. Evaluating an Automated Number Series Item Generator Using Linear Logistic Test Models

    Directory of Open Access Journals (Sweden)

    Bao Sheng Loe

    2018-04-01

    Full Text Available This study investigates the item properties of a newly developed Automatic Number Series Item Generator (ANSIG). The foundation of the ANSIG is based on five hypothesised cognitive operators. Thirteen item models were developed using the numGen R package and eleven were evaluated in this study. The 16-item ICAR (International Cognitive Ability Resource) short form ability test was used to evaluate construct validity. The Rasch Model and two Linear Logistic Test Models (LLTM) were employed to estimate and predict the item parameters. Results indicate that a single factor determines the performance on tests composed of items generated by the ANSIG. Under the LLTM approach, all the cognitive operators were significant predictors of item difficulty. Moderate to high correlations were evident between the number series items and the ICAR test scores, with high correlation found for the ICAR Letter-Numeric-Series type items, suggesting adequate nomothetic span. Extended cognitive research is, nevertheless, essential for the automatic generation of an item pool with predictable psychometric properties.
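
    The idea of predicting item difficulty from cognitive operators can be sketched in a few lines. In an LLTM, the Rasch difficulty of an item is modelled as a weighted sum of the operators the item requires. The operator weights and the item's operator pattern below are hypothetical values for illustration, not estimates from the study, and the normalization constant that some formulations include is omitted.

```python
import math

# Hypothetical difficulty contributions of five cognitive operators.
ETA = [0.5, 0.8, 1.2, 0.3, 1.0]

def lltm_difficulty(q_row, eta=ETA):
    """Item difficulty as a weighted sum of the operators it involves.

    q_row[k] is 1 if the item requires operator k and 0 otherwise.
    """
    return sum(q * e for q, e in zip(q_row, eta))

def p_correct(theta, beta):
    """Rasch probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

# Hypothetical item that requires the first and third operators.
beta = lltm_difficulty([1, 0, 1, 0, 0])
print(round(beta, 2))
print(round(p_correct(0.0, beta), 3))
```

    Because difficulty is a function of the operators alone, a newly generated item with a known operator pattern gets a predicted difficulty before any response data are collected, which is what makes the approach attractive for automatic item generation.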

  3. Improving measurement of injection drug risk behavior using item response theory.

    Science.gov (United States)

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. The aims of this study were to explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine potential gender-based differential item functioning. Data are used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.

  4. The Dif Identification in Constructed Response Items Using Partial Credit Model

    Directory of Open Access Journals (Sweden)

    Heri Retnawati

    2017-10-01

    Full Text Available This study aimed to identify the presence, type, and significance of differential item functioning (DIF) in constructed-response items using the partial credit model (PCM). The data were the test instruments and the responses to PISA-like test items completed by 386 ninth-grade students and 460 tenth-grade students, about 15 years old, in the Province of Yogyakarta Special Region, Indonesia. Item characteristics were estimated under the PCM using CONQUEST software, with students grouped by grade. Using these item characteristics, the researcher drew category response function (CRF) graphs to identify whether the DIF was uniform or non-uniform. The significance of DIF was assessed by comparing the discrepancy between the difficulty parameters with the error reported in the CONQUEST output. Of the 18 items analyzed, 4 showed no DIF, 5 showed DIF that was not statistically significant, and 9 showed statistically significant DIF. The causes of DIF in these items are discussed.

  5. Calibration of the Dutch-Flemish PROMIS Pain Behavior item bank in patients with chronic pain.

    Science.gov (United States)

    Crins, M H P; Roorda, L D; Smits, N; de Vet, H C W; Westhovens, R; Cella, D; Cook, K F; Revicki, D; van Leeuwen, J; Boers, M; Dekker, J; Terwee, C B

    2016-02-01

    The aims of the current study were to calibrate the item parameters of the Dutch-Flemish PROMIS Pain Behavior item bank using a sample of Dutch patients with chronic pain and to evaluate cross-cultural validity between the Dutch-Flemish and the US PROMIS Pain Behavior item banks. Furthermore, reliability and construct validity of the Dutch-Flemish PROMIS Pain Behavior item bank were evaluated. The 39 items in the bank were completed by 1042 Dutch patients with chronic pain. To evaluate unidimensionality, a one-factor confirmatory factor analysis (CFA) was performed. A graded response model (GRM) was used to calibrate the items. To evaluate cross-cultural validity, Differential item functioning (DIF) for language (Dutch vs. English) was evaluated. Reliability of the item bank was also examined and construct validity was studied using several legacy instruments, e.g. the Roland Morris Disability Questionnaire. CFA supported the unidimensionality of the Dutch-Flemish PROMIS Pain Behavior item bank (CFI = 0.960, TLI = 0.958), the data also fit the GRM, and demonstrated good coverage across the pain behavior construct (threshold parameters range: -3.42 to 3.54). Analysis showed good cross-cultural validity (only six DIF items), reliability (Cronbach's α = 0.95) and construct validity (all correlations ≥0.53). The Dutch-Flemish PROMIS Pain Behavior item bank was found to have good cross-cultural validity, reliability and construct validity. The development of the Dutch-Flemish PROMIS Pain Behavior item bank will serve as the basis for Dutch-Flemish PROMIS short forms and computer adaptive testing (CAT). © 2015 European Pain Federation - EFIC®

  6. Refinement of a Bias-Correction Procedure for the Weighted Likelihood Estimator of Ability. Research Report. ETS RR-07-23

    Science.gov (United States)

    Zhang, Jinming; Lu, Ting

    2007-01-01

    In practical applications of item response theory (IRT), item parameters are usually estimated first from a calibration sample. After treating these estimates as fixed and known, ability parameters are then estimated. However, the statistical inferences based on the estimated abilities can be misleading if the uncertainty of the item parameter…

  7. A Comparison of the One- and Three-Parameter Logistic Models on Measures of Test Efficiency.

    Science.gov (United States)

    Benson, Jeri

    Two methods of item selection were used to select sets of 40 items from a 50-item verbal analogies test, and the resulting item sets were compared for relative efficiency. The BICAL program was used to select the 40 items having the best mean square fit to the one parameter logistic (Rasch) model. The LOGIST program was used to select the 40 items…

  8. Identifying the Source of Misfit in Item Response Theory Models.

    Science.gov (United States)

    Liu, Yang; Maydeu-Olivares, Alberto

    2014-01-01

    When an item response theory model fails to fit adequately, the items for which the model provides a good fit and those for which it does not must be determined. To this end, we compare the performance of several fit statistics for item pairs with known asymptotic distributions under maximum likelihood estimation of the item parameters: (a) a mean and variance adjustment to bivariate Pearson's X², (b) a bivariate subtable analog to Reiser's (1996) overall goodness-of-fit test, (c) a z statistic for the bivariate residual cross product, and (d) Maydeu-Olivares and Joe's (2006) M2 statistic applied to bivariate subtables. The unadjusted Pearson's X² with heuristically determined degrees of freedom is also included in the comparison. For binary and ordinal data, our simulation results suggest that the z statistic has the best Type I error and power behavior among all the statistics under investigation when the observed information matrix is used in its computation. However, if one has to use the cross-product information, the mean and variance adjusted X² is recommended. We illustrate the use of pairwise fit statistics in 2 real-data examples and discuss possible extensions of the current research in various directions.

  9. 38 CFR 3.1606 - Transportation items.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Transportation items. 3... Burial Benefits § 3.1606 Transportation items. The transportation costs of those persons who come within... shipment. (6) Cost of transportation by common carrier including amounts paid as Federal taxes. (7) Cost of...

  10. Grouping of Items in Mobile Web Questionnaires

    Science.gov (United States)

    Mavletova, Aigul; Couper, Mick P.

    2016-01-01

    There is some evidence that a scrolling design may reduce breakoffs in mobile web surveys compared to a paging design, but there is little empirical evidence to guide the choice of the optimal number of items per page. We investigate the effect of the number of items presented on a page on data quality in two types of questionnaires: with or…

  11. Binomial test models and item difficulty

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1979-01-01

    In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally

  12. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference...

  13. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for handedness item, such as write, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions with peaks at each extreme and sometimes in the middle as well.

  14. Radiation protection, radioactive waste management and site monitoring at the nuclear scientific experimental and educational centre IRT-Sofia at INRNE-BAS.

    Science.gov (United States)

    Mladenov, Al; Stankov, D; Nonova, Tz; Krezhov, K

    2014-11-01

    This article identifies important components and describes the safe practices in implementing radiation protection and radioactive waste management programmes, and in their optimisation at the Nuclear Scientific Experimental and Educational Centre with research reactor IRT at INRNE-BAS. It covers the instrumentation and personal protective equipment and organisational issues related to the continuous site monitoring. The reactor is under major reconstruction and the measures applied to radiation monitoring of environment and working area focused on restricting the radiation exposure of the staff as well as compliance with international good practices related to the environmental and public radiation safety requirements are also addressed. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. HIV/AIDS knowledge among men who have sex with men: applying the item response theory.

    Science.gov (United States)

    Gomes, Raquel Regina de Freitas Magalhães; Batista, José Rodrigues; Ceccato, Maria das Graças Braga; Kerr, Lígia Regina Franco Sansigolo; Guimarães, Mark Drew Crosland

    2014-04-01

    To evaluate the level of HIV/AIDS knowledge among men who have sex with men in Brazil using the latent trait model estimated by Item Response Theory. Multicenter, cross-sectional study, carried out in ten Brazilian cities between 2008 and 2009. Adult men who have sex with men were recruited (n = 3,746) through Respondent Driven Sampling. HIV/AIDS knowledge was ascertained through ten statements by face-to-face interview and latent scores were obtained through two-parameter logistic modeling (difficulty and discrimination) using Item Response Theory. Differential item functioning was used to examine each item characteristic curve by age and schooling. Overall, the HIV/AIDS knowledge scores using Item Response Theory did not exceed 6.0 (scale 0-10), with mean and median values of 5.0 (SD = 0.9) and 5.3, respectively, with 40.7% of the sample with knowledge levels below the average. Some beliefs still exist in this population regarding the transmission of the virus by insect bites, by using public restrooms, and by sharing utensils during meals. With regard to the difficulty and discrimination parameters, eight items were located below the mean of the scale and were considered very easy, and four items presented very low discrimination parameter (items contributed to the inaccuracy of the measurement of knowledge among those with median level and above. Item Response Theory analysis, which focuses on the individual properties of each item, allows measures to be obtained that do not vary or depend on the questionnaire, which provides better ascertainment and accuracy of knowledge scores. Valid and reliable scales are essential for monitoring HIV/AIDS knowledge among the men who have sex with men population over time and in different geographic regions, and this psychometric model brings this advantage.
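
    The two-parameter logistic (2PL) modeling mentioned in this abstract can be illustrated with a minimal sketch. The function name `icc_2pl` and the example parameter values are assumptions made for illustration only; they are not taken from the study.

```python
import math


def icc_2pl(theta, a, b):
    """Item characteristic curve of the two-parameter logistic model.

    theta: latent trait (e.g. knowledge level) of the respondent
    a:     discrimination parameter of the item (hypothetical value)
    b:     difficulty parameter of the item (hypothetical value)
    Returns the probability of a correct response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# An easy item (low b) is answered correctly by most respondents,
# and a low-discrimination item (small a) separates respondents poorly.
easy_item = icc_2pl(theta=0.0, a=1.0, b=-2.0)
flat_item_low = icc_2pl(theta=-1.0, a=0.2, b=0.0)
flat_item_high = icc_2pl(theta=1.0, a=0.2, b=0.0)
```

    In this sketch an item with difficulty well below the mean of the scale yields a high success probability for an average respondent, while an item with small discrimination gives nearly the same probability to respondents one unit below and one unit above the mean.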

  16. Joint Testlet Cognitive Diagnosis Modeling for Paired Local Item Dependence in Response Times and Response Accuracy

    Directory of Open Access Journals (Sweden)

    Peida Zhan

    2018-04-01

    Full Text Available In joint models for item response times (RTs) and response accuracy (RA), local item dependence is composed of local RA dependence and local RT dependence. The two components are usually caused by the same common stimulus and emerge as pairs. Thus, the violation of local item independence in the joint models is called paired local item dependence. To address the issue of paired local item dependence while applying the joint cognitive diagnosis models (CDMs), this study proposed a joint testlet cognitive diagnosis modeling approach. The proposed approach is an extension of Zhan et al. (2017) and it incorporates two types of random testlet effect parameters (one for RA and the other for RTs) to account for paired local item dependence. The model parameters were estimated using the full Bayesian Markov chain Monte Carlo (MCMC) method. The 2015 PISA computer-based mathematics data were analyzed to demonstrate the application of the proposed model. Further, a brief simulation study was conducted to demonstrate the acceptable parameter recovery and the consequence of ignoring paired local item dependence.

  17. The randomly renewed general item and the randomly inspected item with exponential life distribution

    International Nuclear Information System (INIS)

    Schneeweiss, W.G.

    1979-01-01

    For a randomly renewed item the probability distributions of the time to failure and of the duration of down time and the expectations of these random variables are determined. Moreover, it is shown that the same theory applies to randomly checked items with exponential probability distribution of life such as electronic items. The case of periodic renewals is treated as an example. (orig.) [de

  18. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  19. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  20. Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items

    Science.gov (United States)

    Cher Wong, Cheow

    2015-01-01

    Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…

  1. The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

    Science.gov (United States)

    Siskind, Theresa G.; Anderson, Lorin W.

    The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…

  2. Parameter Estimation

    DEFF Research Database (Denmark)

    Sales-Cruz, Mauricio; Heitzig, Martina; Cameron, Ian

    2011-01-01

    In this chapter the importance of parameter estimation in model development is illustrated through various applications related to reaction systems. In particular, rate constants in a reaction system are obtained through parameter estimation methods. These approaches often require the application of optimisation techniques coupled with dynamic solution of the underlying model. Linear and nonlinear approaches to parameter estimation are investigated. There is also the application of maximum likelihood principles in the estimation of parameters, as well as the use of orthogonal collocation to generate a set of algebraic equations as the basis for parameter estimation. These approaches are illustrated using estimations of kinetic constants from reaction system models.

  3. Refinement of the Brazilian Household Food Insecurity Measurement Scale: Recommendation for a 14-item EBIA

    Directory of Open Access Journals (Sweden)

    Ana Maria Segall-Corrêa

    2014-04-01

    Full Text Available OBJECTIVE: To review and refine Brazilian Household Food Insecurity Measurement Scale structure. METHODS: The study analyzed the impact of removing the item "adult lost weight" and one of two possibly redundant items on Brazilian Household Food Insecurity Measurement Scale psychometric behavior using the one-parameter logistic (Rasch) model. Brazilian Household Food Insecurity Measurement Scale psychometric behavior was analyzed with respect to acceptable adjustment values ranging from 0.7 to 1.3, and to severity scores of the items with theoretically expected gradients. The socioeconomic and food security indicators came from the 2004 National Household Sample Survey, which obtained complete answers to Brazilian Household Food Insecurity Measurement Scale items from 112,665 households. RESULTS: Removing the items "adult reduced amount..." followed by "adult ate less..." did not change the infit of the remaining items, except for "adult lost weight", whose infit increased from 1.21 to 1.56. The internal consistency and item severity scores did not change when "adult ate less" and one of the two redundant items were removed. CONCLUSION: Brazilian Household Food Insecurity Measurement Scale reanalysis reduced the number of scale items from 16 to 14 without changing its internal validity. Its use as a nationwide household food security measure is strongly recommended.

  4. Automated Item Generation with Recurrent Neural Networks.

    Science.gov (United States)

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language-free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.

  5. Estimation of an Examinee's Ability in the Web-Based Computerized Adaptive Testing Program IRT-CAT

    Directory of Open Access Journals (Sweden)

    Yoon-Hwan Lee

    2006-11-01

    Full Text Available We developed a program to estimate an examinee's ability in order to provide freely available access to a web-based computerized adaptive testing (CAT) program. We used PHP and JavaScript as the programming languages, PostgreSQL as the database management system on an Apache web server, and Linux as the operating system. A system was constructed that allows users to input items, search within the inputted items, and create tests. We performed an ability estimation on each test based on a Rasch model and 2- or 3-parameter logistic models. Our system provides an algorithm for a web-based CAT, replacing previous personal computer-based ones, and makes it possible to estimate an examinee's ability immediately at the end of the test.
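
    Ability estimation under the Rasch, 2-parameter and 3-parameter logistic models named in this abstract can be sketched as follows. This is a minimal illustration, not the IRT-CAT program's code: the function names, the grid-search maximum likelihood approach, and the item parameters are all assumptions chosen for clarity.

```python
import math


def p_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PL model.

    With c = 0 this reduces to the 2PL model, and with a = 1 and c = 0
    to the Rasch model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    ll = 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p_3pl(theta, a, b, c)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll


def estimate_ability(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Maximum likelihood ability estimate by grid search on [lo, hi].

    A grid search is used here only because it is simple and robust;
    an operational system might use Newton-Raphson or a Bayesian estimator.
    """
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


# Hypothetical Rasch items (a = 1, c = 0) with difficulties -1, 0 and 1.
rasch_items = [(1.0, -1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
theta_two_correct = estimate_ability([1, 1, 0], rasch_items)
theta_one_correct = estimate_ability([1, 0, 0], rasch_items)
```

    In this sketch an examinee with two correct answers receives a higher ability estimate than one with a single correct answer. The search interval is bounded because the likelihood has no finite maximum for all-correct or all-incorrect response patterns.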

  6. Poisson and negative binomial item count techniques for surveys with sensitive question.

    Science.gov (United States)

    Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin

    2017-04-01

    Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.

  7. NHRIC (National Health Related Items Code)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The National Health Related Items Code (NHRIC) is a system for identification and numbering of marketed device packages that is compatible with other numbering...

  8. Basic Stand Alone Carrier Line Items PUF

    Data.gov (United States)

    U.S. Department of Health & Human Services — This release contains the Basic Stand Alone (BSA) Carrier Line Items Public Use Files (PUF) with information from Medicare Carrier claims. The CMS BSA Carrier Line...

  9. Inventions on presenting textual items in Graphical User Interface

    OpenAIRE

    Mishra, Umakant

    2014-01-01

    Although a GUI largely replaces textual descriptions by graphical icons, the textual items are not completely removed. The textual items are inevitably used in window titles, message boxes, help items, menu items and popup items. Textual items are necessary for communicating messages that are beyond the limitation of graphical messages. However, it is necessary to harness the textual items on the graphical interface in such a way that they complement each other to produce the best effect. One...

  10. The role of attention in item-item binding in visual working memory.

    Science.gov (United States)

    Peterson, Dwight J; Naveh-Benjamin, Moshe

    2017-09-01

    An important yet unresolved question regarding visual working memory (VWM) relates to whether or not binding processes within VWM require additional attentional resources compared with processing solely the individual components comprising these bindings. Previous findings indicate that binding of surface features (e.g., colored shapes) within VWM is not demanding of resources beyond what is required for single features. However, it is possible that other types of binding, such as the binding of complex, distinct items (e.g., faces and scenes), in VWM may require additional resources. In 3 experiments, we examined VWM item-item binding performance under no load, articulatory suppression, and backward counting using a modified change detection task. Binding performance declined to a greater extent than single-item performance under higher compared with lower levels of concurrent load. The findings from each of these experiments indicate that processing item-item bindings within VWM requires a greater amount of attentional resources compared with single items. These findings also highlight an important distinction between the role of attention in item-item binding within VWM and previous studies of long-term memory (LTM) where declines in single-item and binding test performance are similar under divided attention. The current findings provide novel evidence that the specific type of binding is an important determining factor regarding whether or not VWM binding processes require attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  11. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  12. Diagnostics of transparent polymer coatings of metal items

    Science.gov (United States)

    Varepo, L. G.; Ermakova, I. N.; Nagornova, I. V.; Kondratov, A. P.

    2017-08-01

    The methods of visual and instrumental express diagnostics of safety-critical defects and non-uniform thickness of transparent mono- and multilayer polyolefin surface coatings of metal items are analyzed in the paper. The instrumental diagnostics method is based on colorimetric measurement of effects that appear in polarized light for extrusion polymer coatings. The dependence of the color coordinates (in the CIE L*a*b* color system) on both the deviation of HDPE/PVC coating thickness from its average value and the delamination of the coating interlayer or adhesion layer is shown. A variation of color characteristics in polarized light upon liquid penetration into delaminated polymer layers is found. Measuring parameters and critical uncertainties are defined.

  13. A production inventory model with deteriorating items and shortages

    Directory of Open Access Journals (Sweden)

    Samanta G.P.

    2004-01-01

    Full Text Available A continuous production control inventory model for deteriorating items with shortages is developed. A number of structural properties of the inventory system are studied analytically. The formulae for the optimal average system cost, stock level, backlog level and production cycle time are derived when the deterioration rate is very small. Numerical examples are taken to illustrate the procedure of finding the optimal total inventory cost, stock level, backlog level and production cycle time. Sensitivity analysis is carried out to demonstrate the effects of changing parameter values on the optimal solution of the system.

  14. An item response theory evaluation of the young mania rating scale and the montgomery-asberg depression rating scale in the systematic treatment enhancement program for bipolar disorder (STEP-BD).

    Science.gov (United States)

    Prisciandaro, James J; Tolliver, Bryan K

    2016-11-15

    The Young Mania Rating Scale (YMRS) and Montgomery-Asberg Depression Rating Scale (MADRS) are among the most widely used outcome measures for clinical trials of medications for Bipolar Disorder (BD). Nonetheless, very few studies have examined the measurement characteristics of the YMRS and MADRS in individuals with BD using modern psychometric methods. The present study evaluated the YMRS and MADRS in the Systematic Treatment Enhancement Program for BD (STEP-BD) study using Item Response Theory (IRT). Baseline data from 3716 STEP-BD participants were available for the present analysis. The Graded Response Model (GRM) was fit separately to YMRS and MADRS item responses. Differential item functioning (DIF) was examined by regressing a variety of clinically relevant covariates (e.g., sex, substance dependence) on all test items and on the latent symptom severity dimension, within each scale. Both scales: 1) contained several items that provided little or no psychometric information, 2) were inefficient, in that the majority of item response categories did not provide incremental psychometric information, 3) poorly measured participants outside of a narrow band of severity, 4) evidenced DIF for nearly all items, suggesting that item responses were, in part, determined by factors other than symptom severity. Limited to outpatients; DIF analysis only sensitive to certain forms of DIF. The present study provides evidence for significant measurement problems involving the YMRS and MADRS. More work is needed to refine these measures and/or develop suitable alternative measures of BD symptomatology for clinical trials research. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level.

    Science.gov (United States)

    Savalei, Victoria; Rhemtulla, Mijke

    2017-08-01

    In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data, that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study.

  16. Response Mixture Modeling: Accounting for Heterogeneity in Item Characteristics across Response Times.

    Science.gov (United States)

    Molenaar, Dylan; de Boeck, Paul

    2018-06-01

    In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.

  17. A strategy for optimizing item-pool management

    NARCIS (Netherlands)

    Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

    2006-01-01

    Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item

  18. Creation and validation of the barriers to alcohol reduction (BAR) scale using classical test theory and item response theory.

    Science.gov (United States)

    Kunicki, Zachary J; Schick, Melissa R; Spillane, Nichea S; Harlow, Lisa L

    2018-06-01

    Those who binge drink are at increased risk for alcohol-related consequences when compared to non-binge drinkers. Research shows individuals may face barriers to reducing their drinking behavior, but few measures exist to assess these barriers. This study created and validated the Barriers to Alcohol Reduction (BAR) scale. Participants were college students (n = 230) who endorsed at least one instance of past-month binge drinking (4+ drinks for women or 5+ drinks for men). Using classical test theory, exploratory structural equation modeling found a two-factor structure of personal/psychosocial barriers and perceived program barriers. The sub-factors and full scale had reasonable internal consistency (i.e., coefficient omega = 0.78 (personal/psychosocial), 0.82 (program barriers), and 0.83 (full measure)). The BAR also showed evidence for convergent validity with the Brief Young Adult Alcohol Consequences Questionnaire (r = 0.39, p  Theory (IRT) analysis showed the two factors separately met the unidimensionality assumption, and provided further evidence for severity of the items on the two factors. Results suggest that the BAR measure appears reliable and valid for use in an undergraduate student population of binge drinkers. Future studies may want to re-examine this measure in a more diverse sample.

  19. The multi-dimensional model of Māori identity and cultural engagement: item response theory analysis of scale properties.

    Science.gov (United States)

    Sibley, Chris G; Houkamau, Carla A

    2013-01-01

    We argue that there is a need for culture-specific measures of identity that delineate the factors that most make sense for specific cultural groups. One such measure, recently developed specifically for Māori peoples, is the Multi-Dimensional Model of Māori Identity and Cultural Engagement (MMM-ICE). Māori are the indigenous peoples of New Zealand. The MMM-ICE is a 6-factor measure that assesses the following aspects of identity and cultural engagement as Māori: (a) group membership evaluation, (b) socio-political consciousness, (c) cultural efficacy and active identity engagement, (d) spirituality, (e) interdependent self-concept, and (f) authenticity beliefs. This article examines the scale properties of the MMM-ICE using item response theory (IRT) analysis in a sample of 492 Māori. The MMM-ICE subscales showed reasonably even levels of measurement precision across the latent trait range. Analysis of age (cohort) effects further indicated that most aspects of Māori identification tended to be higher among older Māori, and these cohort effects were similar for both men and women. This study provides novel support for the reliability and measurement precision of the MMM-ICE. The study also provides a first step in exploring change and stability in Māori identity across the life span. A copy of the scale, along with recommendations for scale scoring, is included.

  20. Influence of core model parameters on the characteristics of neutron beams of the research reactor

    Directory of Open Access Journals (Sweden)

    N. A. Khafizova

    2013-12-01

    Full Text Available The IRT MEPhI reactor is equipped with a number of facilities at horizontal experimental channels (HEC). Knowledge of the parameters influencing the spatio-angular distribution of irradiation fields is essential for each application area. Research was carried out for the neutron capture therapy (NCT) facility at an HEC of the reactor. Calculation methods were used to estimate how the reactor core parameters influence neutron beam characteristics at the HEC output. The impact of the neutron source model in Monte Carlo calculations with the MCNP code on the parameters of the neutron and secondary photon fields at the output of the irradiation beam tubes of the research reactor is estimated. The study shows that specifying the neutron source with the fission reaction rate distribution in the SDEF option gives almost the same results as a criticality calculation, which is considered the most accurate. Our calculations show that changes in the core operational parameters have an insignificant influence on the characteristics of neutron beams at the HEC output.

  1. Inventory parameters

    CERN Document Server

    Sharma, Sanjay

    2017-01-01

    This book provides a detailed overview of various parameters/factors involved in inventory analysis. It especially focuses on the assessment and modeling of basic inventory parameters, namely demand, procurement cost, cycle time, ordering cost, inventory carrying cost, inventory stock, stock out level, and stock out cost. In the context of economic lot size, it provides equations related to the optimum values. It also discusses why the optimum lot size and optimum total relevant cost are considered to be key decision variables, and uses numerous examples to explain each of these inventory parameters separately. Lastly, it provides detailed information on parameter estimation for different sectors/products. Written in a simple and lucid style, it offers a valuable resource for a broad readership, especially Master of Business Administration (MBA) students.

  2. Counterfeit and Fraudulent Items - Mitigating the risk

    International Nuclear Information System (INIS)

    Tannenbaum, Marc

    2011-01-01

    This presentation (slides) provides an overview of the industry's challenges and activities. Firstly, it outlines the differences between counterfeit, fraudulent, suspect, and substandard items. It notes that items may be found not to meet the standard, but that the intent to deceive is the critical element that distinguishes counterfeit and fraudulent items. Examples are drawn from other industries that also rely heavily on the assurance of quality for safety. It also reports that EPRI completed a report in October 2009 in coordination with other US government agencies and industry organizations; this report, entitled Counterfeit, Substandard and Fraudulent Items, number 1019163, is available for free on the EPRI web site. As a follow-up to this report, EPRI is developing a CFSI Database; any country interested in a collaborative agreement is invited to use and contribute to the database information. Finally, it stresses the importance of the oversight of contractors, training to raise the awareness of the employees and the inspectors, and having a response plan for identified items.

  3. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

    Science.gov (United States)

    Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

    2014-05-01

    To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.

  4. An improved non-Markovian degradation model with long-term dependency and item-to-item uncertainty

    Science.gov (United States)

    Xi, Xiaopeng; Chen, Maoyin; Zhang, Hanwen; Zhou, Donghua

    2018-05-01

    It is widely noted in the literature that the degradation should be simplified into a memoryless Markovian process for the purpose of predicting the remaining useful life (RUL). However, long-term dependency actually exists in the degradation processes of some industrial systems, including electromechanical equipment, oil tankers, and large blast furnaces. This implies that the new degradation state depends not only on the current state, but also on the historical states. Such dynamic systems cannot be accurately described by traditional Markovian models. Here we present an improved non-Markovian degradation model with both long-term dependency and item-to-item uncertainty. As a typical non-stationary process with dependent increments, fractional Brownian motion (FBM) is utilized to simulate the fractal diffusion of practical degradations. The uncertainty among multiple items can be represented by a random variable of the drift. Based on this model, the unknown parameters are estimated through the maximum likelihood (ML) algorithm, while a closed-form solution to the RUL distribution is further derived using a weak convergence theorem. The practicability of the proposed model is fully verified by two real-world examples. The results demonstrate that the proposed method can effectively reduce the prediction error.

  5. The development of the nuclear physics in Latvia III. The research nuclear reactor IRT begins to work in Latvia

    International Nuclear Information System (INIS)

    Ulmanis, U.

    2005-01-01

    This article concerns the study of the reactor's technical parameters, with specific interest in the effect that the distribution of neutron and gamma radiation through the reactor's cooling systems has on the environment. Scientists began by implementing a monitoring system to assist in research on nuclear spectroscopy, neutron activation analysis, neutron diffraction, solid-state radiation physics, chemistry and radiobiology. The first sets of results are summarized within the article. (author)

  6. Verification of Differential Item Functioning (DIF) Status of West ...

    African Journals Online (AJOL)

    This study investigated test item bias and Differential Item Functioning (DIF) of West African ... items in chemistry function differentially with respect to gender and location. In Aba education zone of Abia, 50 secondary schools were purposively ...

  7. Conjunctive and Disjunctive Item Response Functions.

    Science.gov (United States)

    1984-10-01

    ...fixed set of values of a, b, A1, B1, A2, B2, A3, and B3, the f's, g's, and h's in (7) are fixed. Equation (7) must still hold for... Thus... for item i is... where a and pi are arbitrary constants. These constants must be the same for all items in a given...

  8. A Comparison of the 27-Item and 12-Item Intolerance of Uncertainty Scales

    Science.gov (United States)

    Khawaja, Nigar G.; Yu, Lai Ngo Heidi

    2010-01-01

    The 27-item Intolerance of Uncertainty Scale (IUS) has become one of the most frequently used measures of Intolerance of Uncertainty. More recently, an abridged, 12-item version of the IUS has been developed. The current research used clinical (n = 50) and non-clinical (n = 56) samples to examine and compare the psychometric properties of both…

  9. Differential item functioning magnitude and impact measures from item response theory models.

    Science.gov (United States)

    Kleinman, Marjorie; Teresi, Jeanne A

    2016-01-01

    Measures of magnitude and impact of differential item functioning (DIF) at the item and scale level, respectively, are presented and reviewed in this paper. Most measures are based on item response theory models. Magnitude refers to item level effect sizes, whereas impact refers to differences between groups at the scale score level. Reviewed are magnitude measures based on group differences in the expected item scores and impact measures based on differences in the expected scale scores. The similarities among these indices are demonstrated. Various software packages that provide magnitude and impact measures are described, and new software is presented that computes all of the available statistics conveniently in one program, with explanations of their relationships to one another.

  10. The continuity between DSM-5 obsessive-compulsive personality disorder traits and obsessive-compulsive symptoms in adolescence: an item response theory study.

    Science.gov (United States)

    De Caluwé, Elien; Rettew, David C; De Clercq, Barbara

    2014-11-01

    Various studies have shown that obsessive-compulsive symptoms exist as part of not only obsessive-compulsive disorder (OCD) but also obsessive-compulsive personality disorder (OCPD). Despite these shared characteristics, there is an ongoing debate on the inclusion of OCPD into the recently developed DSM-5 obsessive-compulsive and related disorders (OCRDs) category. The current study aims to clarify whether this inclusion can be justified from an item response theory approach. The validity of the continuity model for understanding the association between OCD and OCPD was explored in 787 Dutch community and referred adolescents (70% female, 12-20 years old, mean = 16.16, SD = 1.40) studied between July 2011 and January 2013, relying on item response theory (IRT) analyses of self-reported OCD symptoms (Youth Obsessive-Compulsive Symptoms Scale [YOCSS]) and OCPD traits (Personality Inventory for DSM-5 [PID-5]). The results support the continuity hypothesis, indicating that both OCD and OCPD can be represented along a single underlying spectrum. OCD, and especially the obsessive symptom domain, can be considered as the extreme end of OCPD traits. The current study empirically supports the classification of OCD and OCPD along a single dimension. This integrative perspective in OC-related pathology addresses the dimensional nature of traits and psychopathology and may improve the transparency and validity of assessment procedures. © Copyright 2014 Physicians Postgraduate Press, Inc.

  11. Effects of the location of a biased limiter on turbulent transport in the IR-T1 tokamak plasma

    International Nuclear Information System (INIS)

    Alipour, R.; Ghoranneviss, M.; Salar Elahi, A.; Meshkani, S.

    2017-01-01

    Plasma confinement plays an important role in fusion studies. Applying an external voltage using a limiter biasing system has proved to be an efficient approach to plasma confinement. In this study, the position of the limiter biasing system was changed to investigate the effect of applying external voltages to the plasma at different places. External voltages of ±200 V were applied at three positions: at the plasma edge, and 5 mm and 10 mm inside the plasma. Then, the main plasma parameters were measured. The results show that the poloidal turbulent transport and radial electric field increased by about 25-35% and 35-45%, respectively (especially when the limiter biasing system was placed 5 mm inside the plasma). Also, the Reynolds stress experienced its maximum reduction, of about 5-10%, when the limiter biasing system was 5 mm inside the plasma and a voltage of +200 V was applied to the plasma. When the limiter biasing system was moved 10 mm inside the plasma, the main plasma parameters experienced more instabilities and fluctuations than at the other positions. (authors)

  12. 47 CFR 32.7600 - Extraordinary items.

    Science.gov (United States)

    2010-10-01

    ... FOR TELECOMMUNICATIONS COMPANIES Instructions For Other Income Accounts § 32.7600 Extraordinary items... extraordinary. Extraordinary events and transactions are distinguished by both their unusual nature and by the infrequency of their occurrence, taking into account the environment in which the company operates. This...

  13. Soviet Cybernetics: Recent News Items, Number Thirteen.

    Science.gov (United States)

    Holland, Wade B.

    An issue of "Soviet Cybernetics: Recent News Items" consists of English translations of the leading recent Soviet contributions to the study of cybernetics. Articles deal with cybernetics in the 21st Century; the Soviet State Committee on Science and Technology; economic reforms in Rudnev's ministry; an interview with Rudnev; Dnepr-2; Dnepr-2…

  14. Random Item Generation Is Affected by Age

    Science.gov (United States)

    Multani, Namita; Rudzicz, Frank; Wong, Wing Yiu Stephanie; Namasivayam, Aravind Kumar; van Lieshout, Pascal

    2016-01-01

    Purpose: Random item generation (RIG) involves central executive functioning. Measuring aspects of random sequences can therefore provide a simple method to complement other tools for cognitive assessment. We examine the extent to which RIG relates to specific measures of cognitive function, and whether those measures can be estimated using RIG…

  15. In-Process Items on LCS.

    Science.gov (United States)

    Russell, Thyra K.

    Morris Library at Southern Illinois University computerized its technical processes using the Library Computer System (LCS), which was implemented in the library to streamline order processing by: (1) providing up-to-date online files to track in-process items; (2) encouraging quick, efficient accessing of information; (3) reducing manual files;…

  16. Item Effects in Recognition Memory for Words

    Science.gov (United States)

    Freeman, Emily; Heathcote, Andrew; Chalmers, Kerry; Hockley, William

    2010-01-01

    We investigate the effects of word characteristics on episodic recognition memory using analyses that avoid Clark's (1973) "language-as-a-fixed-effect" fallacy. Our results demonstrate the importance of modeling word variability and show that episodic memory for words is strongly affected by item noise (Criss & Shiffrin, 2004), as measured by the…

  17. 77 FR 59339 - Acquisition of Commercial Items

    Science.gov (United States)

    2012-09-27

    ... DEPARTMENT OF DEFENSE Defense Acquisition Regulations System 48 CFR Part 212 Acquisition of Commercial Items CFR Correction 212.504 [Corrected] In Title 48 of the Code of Federal Regulations, Chapter 2 (Parts 201--299), revised as of October 1, 2011, on page 73, in section 212.504, paragraph (a) is...

  18. Bayesian item selection criteria for adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1996-01-01

    R.J. Owen (1975) proposed an approximate empirical Bayes procedure for item selection in adaptive testing. The procedure replaces the true posterior by a normal approximation with closed-form expressions for its first two moments. This approximation was necessary to minimize the computational

  19. Aging and Confidence Judgments in Item Recognition

    Science.gov (United States)

    Voskuilen, Chelsea; Ratcliff, Roger; McKoon, Gail

    2018-01-01

    We examined the effects of aging on performance in an item-recognition experiment with confidence judgments. A model for confidence judgments and response times (RTs; Ratcliff & Starns, 2013) was used to fit a large amount of data from a new sample of older adults and a previously reported sample of younger adults. This model of confidence…

  20. 10 CFR 74.55 - Item monitoring.

    Science.gov (United States)

    2010-01-01

    ... Quantities of Strategic Special Nuclear Material § 74.55 Item monitoring. (a) Licensees subject to § 74.51... quantitatively measured, the validity of that measurement independently confirmed, and that additionally have..., except for reactor components measuring at least one meter in length and weighing in excess of 30...