WorldWideScience

Sample records for applying item response

  1. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats, including Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The main findings are as follows. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. Additional dimensions therefore need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all

  2. Applying Unidimensional and Multidimensional Item Response Theory Models in Testlet-Based Reading Assessment

    Science.gov (United States)

    Min, Shangchao; He, Lianzhen

    2014-01-01

    This study examined the relative effectiveness of the multidimensional bi-factor model and multidimensional testlet response theory (TRT) model in accommodating local dependence in testlet-based reading assessment with both dichotomously and polytomously scored items. The data used were 14,089 test-takers' item-level responses to the testlet-based…

  3. Item response theory applied to factors affecting the patient journey towards hearing rehabilitation

    Directory of Open Access Journals (Sweden)

    Michelene Chenault

    2016-11-01

    To develop a tool for use in hearing screening and to evaluate the patient journey towards hearing rehabilitation, responses to the hearing aid rehabilitation questionnaire scales aid stigma, pressure, and aid unwanted (addressing, respectively, hearing aid stigma, experienced pressure from others, and perceived hearing aid benefit) were evaluated with item response theory. The sample comprised 212 persons aged 55 years or more: 63 were hearing aid users, 64 had hearing impairment and 85 did not, according to guidelines for hearing aid reimbursement in the Netherlands. Bias was investigated relative to hearing aid use and hearing impairment within the differential test functioning framework. Items compromising model fit or demonstrating differential item functioning were dropped. The aid stigma scale was reduced from 6 to 4 items, the pressure scale from 7 to 4, and the aid unwanted scale from 5 to 4. This procedure resulted in bias-free scales ready for screening purposes and for application to further understand the help-seeking process of the hearing impaired.

  4. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, C.A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a s

  5. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  6. Item response theory applied to the Beck Depression Inventory (Teoria da resposta ao item aplicada ao Inventário de Depressão Beck)

    Directory of Open Access Journals (Sweden)

    Stela Maris de Jezus Castro

    2010-09-01

    The Beck Depression Inventory (BDI), a scale that measures the latent trait of depressive symptom intensity, can be evaluated with Item Response Theory (IRT). This study used the IRT Graded Response Model (GRM) to assess the intensity of depressive symptoms in 4,025 individuals who responded to the BDI, in order to make efficient use of the information that this methodology makes available. The model was fitted with the PARSCALE software. We identified 13 BDI items in which at least one response category was never more likely than the others to be chosen, so these items had to be recategorized. The items with the greatest discriminating power concern sadness, pessimism, sense of failure, dissatisfaction, self-dislike, indecision and difficulty working. The most severe items are those related to weight loss, social withdrawal and suicidal ideas. In the group of 202 individuals with the highest intensities of depressive symptoms, 74% were women and almost 84% had a diagnosis of some psychiatric disorder. The results show some of the many gains that come from using IRT in the analysis of latent traits.
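The recategorization criterion in this record (a response category that is never the most likely choice) can be illustrated with a minimal sketch of graded response model category probabilities. The function names, the ability grid and all parameter values below are hypothetical and chosen only for illustration; they are not taken from the study or from PARSCALE.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a graded response model.

    thresholds must be ascending (b_1 < ... < b_{K-1});
    the result has K entries, one per ordered category.
    """
    # Cumulative probabilities P(X >= k), bracketed by 1 (k = 0) and 0 (k = K).
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def never_modal_categories(a, thresholds, grid=None):
    """Categories that are not the most likely one at any grid point."""
    grid = grid or [x / 10.0 for x in range(-40, 41)]  # hypothetical grid on [-4, 4]
    modal = set()
    for theta in grid:
        probs = grm_category_probs(theta, a, thresholds)
        modal.add(probs.index(max(probs)))
    return [k for k in range(len(thresholds) + 1) if k not in modal]


# Hypothetical item whose first two thresholds are nearly identical,
# so the category between them is squeezed out.
crowded = never_modal_categories(a=1.0, thresholds=[-1.0, -0.9, 1.0])
# Hypothetical item with well-separated thresholds.
spread = never_modal_categories(a=2.0, thresholds=[-2.0, 0.0, 2.0])
```

Under these assumed values, the item with crowded thresholds has a category that is never modal on the grid, which is the kind of pattern that would suggest merging adjacent categories.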

  7. A Mixed Effects Randomized Item Response Model

    Science.gov (United States)

    Fox, J.-P.; Wyrick, Cheryl

    2008-01-01

    The randomized response technique ensures that individual item responses, denoted as true item responses, are randomized before observing them and so-called randomized item responses are observed. A relationship is specified between randomized item response data and true item response data. True item response data are modeled with a (non)linear…

  8. Modelling sequentially scored item responses

    NARCIS (Netherlands)

    Akkermans, W.

    2000-01-01

    The sequential model can be used to describe the variable resulting from a sequential scoring process. In this paper two more item response models are investigated with respect to their suitability for sequential scoring: the partial credit model and the graded response model. The investigation is c

  9. Measuring student learning with item response theory

    Directory of Open Access Journals (Sweden)

    Young-Jin Lee

    2008-01-01

    We investigate short-term learning from hints and feedback in a Web-based physics tutoring system. Both the skill of students and the difficulty and discrimination of items were determined by applying item response theory (IRT) to the first answers of students who are working on for-credit homework items in an introductory Newtonian physics course. We show that after tutoring a shifted logistic item response function with lower discrimination fits the students’ second responses to an item previously answered incorrectly. Student skill decreased by 1.0 standard deviation when students used no tutoring between their (incorrect) first and second attempts, which we attribute to “item-wrong bias.” On average, using hints or feedback increased students’ skill by 0.8 standard deviation. A skill increase of 1.9 standard deviations was observed when hints were requested after viewing, but prior to attempting to answer, a particular item. The skill changes measured in this way will enable the use of IRT to assess students based on their second attempt in a tutoring environment.

  10. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers

    Directory of Open Access Journals (Sweden)

    Stochl Jan

    2012-06-01

    Background: Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than, for example, the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods: Scalability of data from (1) a cross-sectional health survey (the Scottish Health Education Population Survey) and (2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12-item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions: After an initial analysis example in which we select items by phrasing (six positively versus six negatively worded items), we show that all items from the 12-item General Health Questionnaire (GHQ-12), when binary scored, were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales). An illustration of ordinal item analysis
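Scalability in Mokken analysis is commonly summarised with Loevinger's H coefficient. The following is a minimal sketch for dichotomous items only, assuming complete 0/1 data and at least one item pair with a nonzero maximum covariance; the function name and the toy data are hypothetical and are not drawn from the surveys in this record.

```python
def scale_h(data):
    """Loevinger's H for a set of dichotomous (0/1) items.

    data is a list of respondents, each a list of 0/1 item scores.
    H is the sum of observed inter-item covariances divided by the sum of
    the largest covariances attainable given the item popularities.
    """
    n = len(data)
    k = len(data[0])
    # Item popularities: proportion of respondents scoring 1 on each item.
    p = [sum(row[j] for row in data) / n for j in range(k)]
    cov_sum = 0.0
    max_sum = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            p_ij = sum(row[i] * row[j] for row in data) / n
            cov_sum += p_ij - p[i] * p[j]
            # With fixed marginals, the joint proportion cannot exceed min(p_i, p_j).
            max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / max_sum


# Hypothetical perfect Guttman pattern: no respondent passes a harder item
# while failing an easier one.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_guttman = scale_h(guttman)
```

A perfect Guttman pattern gives H = 1, and violations of the cumulative ordering pull H down. A lower bound of about 0.3 is a commonly quoted rule of thumb for accepting a set of items as a scale.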

  11. Bayesian item fit analysis for unidimensional item response theory models.

    Science.gov (United States)

    Sinharay, Sandip

    2006-11-01

    Assessing item fit for unidimensional item response theory models for dichotomous items has always been an issue of enormous interest, but there exists no unanimously agreed item fit diagnostic for these models, and hence there is room for further investigation of the area. This paper employs the posterior predictive model-checking method, a popular Bayesian model-checking tool, to examine item fit for the above-mentioned models. An item fit plot, comparing the observed and predicted proportion-correct scores of examinees with different raw scores, is suggested. This paper also suggests how to obtain posterior predictive p-values (which are natural Bayesian p-values) for the item fit statistics of Orlando and Thissen that summarize numerically the information in the above-mentioned item fit plots. A number of simulation studies and a real data application demonstrate the effectiveness of the suggested item fit diagnostics. The suggested techniques seem to have adequate power and reasonable Type I error rate, and psychometricians will find them promising.

  12. Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

    Science.gov (United States)

    Yang, Ji Seung; Hansen, Mark; Cai, Li

    2012-01-01

    Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…

  13. Item response theory for measurement validity.

    Science.gov (United States)

    Yang, Frances M; Kao, Solon T

    2014-06-01

    Item response theory (IRT) is an important method of assessing the validity of measurement scales that is underutilized in the field of psychiatry. IRT describes the relationship between a latent trait (e.g., the construct that the scale proposes to assess), the properties of the items in the scale, and respondents' answers to the individual items. This paper introduces the basic premise, assumptions, and methods of IRT. To help explain these concepts we generate a hypothetical scale using three items from a modified, binary (yes/no) response version of the Center for Epidemiological Studies-Depression scale that was administered to 19,399 respondents. We first conducted a factor analysis to confirm the unidimensionality of the three items and then proceeded with Mplus software to construct the 2-Parameter Logistic (2-PL) IRT model of the data, a method which allows for estimates of both item discrimination and item difficulty. The utility of this information both for clinical purposes and for scale construction purposes is discussed.
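As a rough illustration of what the two parameters in a 2-PL model do, the sketch below evaluates the item response function and the corresponding item information. The function names and parameter values are hypothetical; this is not the Mplus parameterisation used in the paper, only the textbook logistic form.

```python
import math


def irf_2pl(theta, a, b):
    """Probability of endorsing an item under the 2-PL model.

    theta: latent trait level, a: discrimination, b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information_2pl(theta, a, b):
    """Fisher information contributed by the item at trait level theta."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical item: discrimination 1.2, difficulty 0.5.
p_at_difficulty = irf_2pl(theta=0.5, a=1.2, b=0.5)
info_at_difficulty = item_information_2pl(theta=0.5, a=1.2, b=0.5)
```

The difficulty b is the trait level at which endorsement probability is one half, and the discrimination a sets how steeply the curve rises there; information peaks at theta = b with value a²/4.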

  14. Extending Item Response Theory to Online Homework

    CERN Document Server

    Kortemeyer, Gerd

    2014-01-01

    Item Response Theory becomes an increasingly important tool when analyzing "Big Data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over wide ranges with respect to model assumptions and introduced noise, less so than item difficulty.

  15. A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

    Science.gov (United States)

    Fukuhara, Hirotaka; Kamata, Akihito

    2011-01-01

    A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…

  16. Differential item functioning analysis by applying multiple comparison procedures.

    Science.gov (United States)

    Eusebi, Paolo; Kreiner, Svend

    2015-01-01

    Analysis within a Rasch measurement framework aims at the development of valid and objective test scores. One requirement of both validity and objectivity is that items do not show evidence of differential item functioning (DIF). A number of procedures exist for the assessment of DIF, including those based on analysis of contingency tables by Mantel-Haenszel tests and partial gamma coefficients. The aim of this paper is to illustrate Multiple Comparison Procedures (MCP) for analysis of DIF relative to a variable defining a very large number of groups, with an unclear ordering with respect to the DIF effect. We propose a single step procedure controlling the false discovery rate for DIF detection. The procedure applies to both dichotomous and polytomous items. In addition to providing evidence against a hypothesis of no DIF, the procedure also provides information on subsets of groups that are homogeneous with respect to the DIF effect. A stepwise MCP procedure for this purpose is also introduced.
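To make the idea of false discovery rate control concrete, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure applied to a list of p-values. This is a swapped-in, well-known technique used for illustration; it is not the single step procedure proposed in the paper, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, marking which
    hypotheses are rejected at false discovery rate level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from four group comparisons on one item.
flags = benjamini_hochberg([0.039, 0.001, 0.216, 0.008], q=0.05)
```

In a DIF setting, each p-value could come from a test comparing item behaviour between groups, and the flagged comparisons would be those showing evidence of DIF at the chosen rate.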

  17. Item response theory (Teoria da Resposta ao Item; Teoria de la respuesta al item)

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    Concern with the measurement of psychological traits is long-standing, and many studies and proposed methods have been developed toward this goal. Prominent among them is Item Response Theory (IRT), which originally emerged to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it considers each item individually rather than emphasizing total scores; the conclusions therefore depend not only on the test or questionnaire but on each item that composes it. This article sets out to present this theory, which revolutionized measurement theory.

  18. Analyzing Force Concept Inventory with Item Response Theory

    CERN Document Server

    Wang, Jing

    2010-01-01

    Item Response Theory (IRT) is a popular assessment method used in education measurement, which builds on an assumption of a probability framework connecting students' innate ability and their actual performances on test items. The model transforms students' raw test scores through a nonlinear regression process into a scaled proficiency rating, which can be used to compare results obtained with different test questions. IRT also provides a theoretical approach to address ceiling effect and guessing. We applied IRT to analyze the Force Concept Inventory (FCI). The data was collected from 2802 students taking intro level mechanics courses at The Ohio State University. The data was analyzed with a 3-parameter item response model for multiple choice questions. We describe the procedures of the analysis and discuss the results and the interpretations. The analysis outcomes are compiled to provide a detailed IRT measurement metric of the FCI, which can be easily referenced and used by teachers and researchers for a...
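The way a 3-parameter model accounts for guessing on multiple choice questions can be sketched as follows. The function name and the parameter values are hypothetical and are not estimates from the FCI analysis.

```python
import math


def irf_3pl(theta, a, b, c):
    """Probability of a correct response under the 3-parameter logistic model.

    theta: ability, a: discrimination, b: difficulty,
    c: lower asymptote (chance of a correct answer by guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical five-option item with a guessing floor of 0.2.
p_low_ability = irf_3pl(theta=-50.0, a=1.0, b=0.0, c=0.2)
p_at_difficulty = irf_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

With a nonzero c, even very low-ability students answer correctly with probability close to c, and the probability at theta = b is c + (1 - c)/2 rather than one half.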

  19. Employment of Item Response Theory to measure change in Children's Analogical Thinking Modifiability Test

    OpenAIRE

    Queiroz, Odoisa Antunes de; Primi, Ricardo; Carvalho, Lucas de Francisco; Enumo, Sônia Regina Fiorim

    2013-01-01

    Dynamic testing, with an intermediate phase of assistance, measures changes between pretest and post-test assuming a common metric between them. To test this assumption we applied Item Response Theory to the responses of 69 children to an adapted version of the dynamic cognitive test Children's Analogical Thinking Modifiability Test, with 12 items, totaling 828 responses, with the purpose of verifying whether the original scale yields the same results as the equalized scale obtained by Item Response Theory i...

  20. Stochastic Approximation Methods for Latent Regression Item Response Models

    Science.gov (United States)

    von Davier, Matthias; Sinharay, Sandip

    2010-01-01

    This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…

  1. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism.

  2. An item response curves analysis of the Force Concept Inventory

    Science.gov (United States)

    Morris, Gary A.; Harshman, Nathan; Branum-Martin, Lee; Mazur, Eric; Mzoughi, Taha; Baker, Stephen D.

    2012-09-01

    Several years ago, we introduced the idea of item response curves (IRC), a simplistic form of item response theory (IRT), to the physics education research community as a way to examine item performance on diagnostic instruments such as the Force Concept Inventory (FCI). We noted that a full-blown analysis using IRT would be a next logical step, which several authors have since taken. In this paper, we show that our simple approach not only yields similar conclusions in the analysis of the performance of items on the FCI to the more sophisticated and complex IRT analyses but also permits additional insights by characterizing both the correct and incorrect answer choices. Our IRC approach can be applied to a variety of multiple-choice assessments but, as applied to a carefully designed instrument such as the FCI, allows us to probe student understanding as a function of ability level through an examination of each answer choice. We imagine that physics teachers could use IRC analysis to identify prominent misconceptions and tailor their instruction to combat those misconceptions, fulfilling the FCI authors' original intentions for its use. Furthermore, the IRC analysis can assist test designers to improve their assessments by identifying nonfunctioning distractors that can be replaced with distractors attractive to students at various ability levels.

  3. A generalized item response tree model for psychological assessments.

    Science.gov (United States)

    Jeon, Minjeong; De Boeck, Paul

    2016-09-01

    A new item response theory (IRT) model with a tree structure has been introduced for modeling item response processes with a tree structure. In this paper, we present a generalized item response tree model with a flexible parametric form, dimensionality, and choice of covariates. The utilities of the model are demonstrated with two applications in psychological assessments for investigating Likert scale item responses and for modeling omitted item responses. The proposed model is estimated with the freely available R package flirt (Jeon et al., 2014b).

  4. Item Response Methods for Educational Assessment.

    Science.gov (United States)

    Mislevy, Robert J.; Rieser, Mark R.

    Multiple matrix sampling (MMS) theory indicates how data may be gathered to most efficiently convey information about levels of attainment in a population, but standard analyses of these data require random sampling of items from a fixed pool of items. This assumption proscribes the retirement of flawed or obsolete items from the pool as well as…

  5. Analysis Test of Understanding of Vectors with the Three-Parameter Logistic Model of Item Response Theory and Item Response Curves Technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-01-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming…

  6. Evaluating Item Discrimination Power of WHOQOL-BREF from an Item Response Model Perspectives

    Science.gov (United States)

    Lin, Ting Hsiang; Yao, Grace

    2009-01-01

    Quality of life (QOL) has become an important component of health. By using the methodology of psychometric theory, we examine the item properties of the WHOQOL-BREF. Samejima's graded response model with natural metrics of the logistic response function was fitted. The results showed items with negative natures were less discriminating. Items…

  7. Higher-Order Item Response Models for Hierarchical Latent Traits

    Science.gov (United States)

    Huang, Hung-Yu; Wang, Wen-Chung; Chen, Po-Hsi; Su, Chi-Ming

    2013-01-01

    Many latent traits in the human sciences have a hierarchical structure. This study aimed to develop a new class of higher order item response theory models for hierarchical latent traits that are flexible in accommodating both dichotomous and polytomous items, to estimate both item and person parameters jointly, to allow users to specify…

  8. Application of Unidimensional Item Response Models to Tests with Items Sensitive to Secondary Dimensions

    Science.gov (United States)

    Zhang, Bo

    2008-01-01

    In this research, the author addresses whether the application of unidimensional item response models provides valid interpretation of test results when administering items sensitive to multiple latent dimensions. Overall, the present study found that unidimensional models are quite robust to the violation of the unidimensionality assumption due…

  9. Using response times for item selection in adaptive testing

    NARCIS (Netherlands)

    Linden, van der Wim J.

    2008-01-01

    Response times on items can be used to improve item selection in adaptive testing provided that a probabilistic model for their distribution is available. In this research, the author used a hierarchical modeling framework with separate first-level models for the responses and response times and a s

  10. Item Response Theory Using Hierarchical Generalized Linear Models

    Directory of Open Access Journals (Sweden)

    Hamdollah Ravand

    2015-03-01

    Multilevel models (MLMs) are flexible in that they can be employed to obtain item and person parameters, test for differential item functioning (DIF) and capture both local item and person dependence. Papers on the MLM analysis of item response data have focused mostly on theoretical issues where applications have been add-ons to simulation studies with a methodological focus. Although the methodological direction was necessary as a first step to show how MLMs can be utilized and extended to model item response data, the emphasis needs to be shifted towards providing evidence on how applications of MLMs in educational testing can provide the benefits that have been promised. The present study uses foreign language reading comprehension data to illustrate application of hierarchical generalized models to estimate person and item parameters, differential item functioning (DIF), and local person dependence in a three-level model.

  11. Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory

    Science.gov (United States)

    Lee, Won-Chan

    2010-01-01

    In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…

  12. A nonparametric approach to the analysis of dichotomous item responses

    NARCIS (Netherlands)

    Mokken, R.J.; Lewis, C.

    1982-01-01

    An item response theory is discussed which is based on purely ordinal assumptions about the probabilities that people respond positively to items. It is considered as a natural generalization of both Guttman scaling and classical test theory. A distinction is drawn between construction and evaluatio

  13. Semiparametric Item Response Functions in the Context of Guessing

    Science.gov (United States)

    Falk, Carl F.; Cai, Li

    2016-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…

  14. Item response theory modeling with nonignorable missing data

    NARCIS (Netherlands)

    Pimentel, Jonald L.

    2005-01-01

    This thesis discusses methods to detect nonignorable missing data and methods to adjust for the bias caused by nonignorable missing data, both by introducing a model for the missing data indicator using item response theory (IRT) models.

  15. Application of multidimensional item response theory models to longitudinal data

    NARCIS (Netherlands)

    Marvelde, te Janneke M.; Glas, Cees A.W.; Van Landeghem, Georges; Van Damme, Jan

    2006-01-01

    The application of multidimensional item response theory (IRT) models to longitudinal educational surveys where students are repeatedly measured is discussed and exemplified. A marginal maximum likelihood (MML) method to estimate the parameters of a multidimensional generalized partial credit model

  16. Inconsistent Student Responses in TIMSS Questionnaire Items on Mathematics Lessons

    Directory of Open Access Journals (Sweden)

    Selda Yıldırım

    2009-12-01

    This study investigated consistency among Turkish students’ responses to TIMSS 2007 questionnaire items on the frequency of certain activities in mathematics classrooms. In Turkey, 4476 students from 143 schools participated in the study. Analyses revealed inconsistencies in student responses, as indicated by a high proportion of within-class variance components. That is, students in the same class reported differing frequencies for certain classroom activities, showing that some factors had an effect on individuals' perceptions. Further analyses showed that students at different levels of mathematics achievement reported differently on the frequency of classroom activities, and that precise items were answered more consistently than items containing vague terms. Using factor scores instead of individual item responses contributed to the consistency of responses within classes, but only to a small extent. Based on the findings, this study also provided implications for questionnaire design.

  17. An item factor analysis and item response theory-based revision of the Everyday Discrimination Scale.

    Science.gov (United States)

    Stucky, Brian D; Gottfredson, Nisha C; Panter, A T; Daye, Charles E; Allen, Walter R; Wightman, Linda F

    2011-04-01

    The Everyday Discrimination Scale (EDS), a widely used measure of daily perceived discrimination, is purported to be unidimensional, to function well among African Americans, and to have adequate construct validity. Two separate studies and data sources were used to examine and cross-validate the psychometric properties of the EDS. In Study 1, an exploratory factor analysis was conducted on a sample of African American law students (N = 589), providing strong evidence of local dependence, or nuisance multidimensionality within the EDS. In Study 2, a separate nationally representative community sample (N = 3,527) was used to model the identified local dependence in an item factor analysis (i.e., bifactor model). Next, item response theory (IRT) calibrations were conducted to obtain item parameters. A five-item, revised-EDS was then tested for gender differential item functioning (in an IRT framework). Based on these analyses, a summed score to IRT-scaled score translation table is provided for the revised-EDS. Our results indicate that the revised-EDS is unidimensional, with minimal differential item functioning, and retains predictive validity consistent with the original scale.

  18. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.

  19. An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

    Science.gov (United States)

    Ames, Allison J.; Penfield, Randall D.

    2015-01-01

    Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…

  20. Assessing Subgroup Differences in Item Response Times.

    Science.gov (United States)

    Schnipke, Deborah L.; Pashley, Peter J.

    Differences in test performance on time-limited tests may be due in part to differential response-time rates between subgroups, rather than real differences in the knowledge, skills, or developed abilities of interest. With computer-administered tests, response times are available and may be used to address this issue. This study investigates…

  1. Fighting bias with statistics: Detecting gender differences in responses to items on a preschool science assessment

    Science.gov (United States)

    Greenberg, Ariela Caren

    Differential item functioning (DIF) and differential distractor functioning (DDF) are methods used to screen for item bias (Camilli & Shepard, 1994; Penfield, 2008). Using an applied empirical example, this mixed-methods study examined the congruency and relationship of DIF and DDF methods in screening multiple-choice items. Data for Study I were drawn from item responses of 271 female and 236 male low-income children on a preschool science assessment. Item analyses employed a common statistical approach, the Mantel-Haenszel log-odds ratio (MH-LOR), to detect DIF in dichotomously scored items (Holland & Thayer, 1988), and extended the approach to identify DDF (Penfield, 2008). Findings demonstrated that using MH-LOR to detect DIF and DDF supported the theoretical relationship that the magnitude and form of DIF are dependent on the DDF effects, and demonstrated the advantages of studying DIF and DDF in multiple-choice items. A total of 4 items with DIF and DDF and 5 items with only DDF were detected. Study II incorporated an item content review, an important but often overlooked and under-published step of DIF and DDF studies (Camilli & Shepard). Interviews with 25 female and 22 male low-income preschool children and an expert review helped to interpret the DIF and DDF results and their comparison, and determined that a content review process of studied items can reveal reasons for potential item bias that are often congruent with the statistical results. Patterns emerged and are discussed in detail. The quantitative and qualitative analyses were conducted in an applied framework of examining the validity of the preschool science assessment scores for evaluating science programs serving low-income children; however, the techniques can be generalized for use with measures across various disciplines of research.
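
    The Mantel-Haenszel log-odds ratio named in this abstract can be sketched as follows. This is a minimal illustration of the standard MH common odds ratio, not the study's own code: the function name `mh_log_odds_ratio`, the tuple layout, and the counts in the usage lines are assumptions made for the example. Examinees are assumed to have been stratified on a matching variable such as total score, with one 2x2 table (group by correct/incorrect) per stratum.

```python
import math

def mh_log_odds_ratio(strata):
    """Mantel-Haenszel log-odds ratio for one dichotomous item.

    strata: iterable of (a, b, c, d) counts, one tuple per matched score level:
      a = reference group correct, b = reference group incorrect,
      c = focal group correct,     d = focal group incorrect.
    A value near 0 suggests no DIF; the sign shows which group is favoured.
    """
    num = 0.0  # sum over strata of a*d/n
    den = 0.0  # sum over strata of b*c/n
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    return math.log(num / den)

# Made-up counts for two score levels (illustrative only, not study data).
example = [(20, 10, 15, 15), (30, 5, 25, 10)]
print(mh_log_odds_ratio(example))
```

    Swapping the reference and focal groups flips the sign of the statistic, which is a quick sanity check on how the tables were coded.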

  2. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to determine whether women had a higher probability than men of endorsing each item. Results Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
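
    The kernel-smoothing idea behind a nonparametric item characteristic curve can be sketched as below. This is a generic Nadaraya-Watson estimate with a Gaussian kernel and is not TestGraf's actual implementation: the function name `kernel_icc`, the bandwidth value, and the toy data are assumptions for illustration. The `trait` values stand for whatever trait proxy the analyst uses (for example, standardized rest scores).

```python
import math

def kernel_icc(trait, responses, grid, bandwidth=0.5):
    """Kernel-smoothed estimate of P(endorse item | trait level).

    trait:     trait proxy for each respondent (e.g. standardized scores)
    responses: 0/1 item responses, same order as trait
    grid:      trait levels at which to evaluate the curve
    Grid points should lie near the observed trait values; far outside the
    data all Gaussian weights underflow to zero.
    """
    curve = []
    for q in grid:
        # Gaussian weight: respondents near q count most.
        weights = [math.exp(-0.5 * ((t - q) / bandwidth) ** 2) for t in trait]
        total = sum(weights)
        curve.append(sum(w * y for w, y in zip(weights, responses)) / total)
    return curve

# Toy data (illustrative only): endorsement becomes more likely as trait rises.
trait = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [0, 0, 1, 1, 1]
print(kernel_icc(trait, responses, grid=[-2.0, 0.0, 2.0]))
```

    Applying the same smoother to an indicator for each response option, rather than to the 0/1 item score, gives the option characteristic curves the abstract refers to.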

  3. MODERATING ABILITY OF ITEM RESPONSE THEORY THROUGH PRIOR GUESSING PARAMETER

    Directory of Open Access Journals (Sweden)

    Siow Hoo Leong

    2013-01-01

    Guessing on multiple-choice items can be discouraged through a psycho-technology approach that reduces the a priori guessing probability of an item. This study proposes a psychometric framework based on Item Response Theory (IRT) to model the effect of having various a priori guessing probabilities across different items. A prior guessing parameter is proposed to serve as a moderator of the ability parameter in the two-parameter logistic IRT model. The results show that the proposed prior guessing parameter successfully moderates the ability parameters of subjects with different degrees of guessing. However, the prior guessing parameter is insensitive when the performance pattern is mixed within a testlet but similar across testlets with different a priori guessing probabilities.

  4. PENGEMBANGAN TES BERPIKIR KRITIS DENGAN PENDEKATAN ITEM RESPONSE THEORY (DEVELOPING CRITICAL THINKING TEST UTILISING ITEM RESPONSE THEORY)

    Directory of Open Access Journals (Sweden)

    Fajrianthi Fajrianthi

    2016-06-01

    This study aimed to develop a valid and reliable instrument (test) for assessing critical thinking that can be used in both educational and work settings in Indonesia. The research followed the test development stages of Hambleton and Jones (1993). The test blueprint and item writing were based on the concepts in the Watson-Glaser Critical Thinking Appraisal (WGCTA). In the WGCTA, critical thinking comprises five dimensions: Inference, Recognition Assumption, Deduction, Interpretation, and Evaluation of arguments. The test was trialled on 1,453 participants in employee selection tests in Surabaya, Gresik, Tuban, Bojonegoro, and Rembang. The dichotomous data were analysed using an IRT model with two parameters, item discrimination and item difficulty. The analysis was carried out with the statistical program Mplus version 6.11. Before the IRT analysis, the assumptions were tested: unidimensionality, local independence, and the Item Characteristic Curve (ICC). Analysis of the 68 items yielded 15 items with fairly good discrimination and item difficulty ranging from –4 to 2.448. The small number of good-quality items was attributed to weaknesses in selecting subject matter experts in the field of critical thinking and in the choice of scoring method. Keywords: test development, critical thinking, item response theory

  5. The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population

    Science.gov (United States)

    Holman, Rebecca; Weisscher, Nadine; Glas, Cees AW; Dijkgraaf, Marcel GW; Vermeulen, Marinus; de Haan, Rob J; Lindeboom, Robert

    2005-01-01

    Background Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, there are few item banks, which have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. Methods This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. Results Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. Conclusion The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations. PMID:16381611

  6. The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population

    Directory of Open Access Journals (Sweden)

    Vermeulen Marinus

    2005-12-01

    Background Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, there are few item banks, which have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. Methods This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. Results Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. Conclusion The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations.

  7. Application of Group-Level Item Response Models in the Evaluation of Consumer Reports about Health Plan Quality

    Science.gov (United States)

    Reise, Steven P.; Meijer, Rob R.; Ainsworth, Andrew T.; Morales, Leo S.; Hays, Ron D.

    2006-01-01

    Group-level parametric and non-parametric item response theory models were applied to the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) 2.0 core items in a sample of 35,572 Medicaid recipients nested within 131 health plans. Results indicated that CAHPS responses are dominated by within health plan variation, and only weakly…

  8. Scale construction and evaluation in practice: A review of factor analysis versus item response theory applications

    Directory of Open Access Journals (Sweden)

    Anne Boomsma

    2010-09-01

    In scale construction and evaluation, factor analysis (FA) and item response theory (IRT) are two methods frequently used to determine whether a set of items reliably measures a latent variable. In a review of 41 published studies we examined which methodology, FA or IRT, was used, and what researchers’ motivations were for applying either method. Characteristics of the studies were compared to gain more insight into the practice of scale analysis. Findings indicate that FA is applied far more often than IRT. Often it is unclear whether the data justify the chosen method because model assumptions are neglected. We recommend that researchers (a) use substantive knowledge about the items to their advantage by more frequently employing confirmatory techniques, as well as adding item content and interpretability of factors to the criteria in model evaluation; and (b) investigate model assumptions and report corresponding findings. To this end, we recommend more collaboration between substantive researchers and statisticians/psychometricians.

  9. Multilevel Higher-Order Item Response Theory Models

    Science.gov (United States)

    Huang, Hung-Yu; Wang, Wen-Chung

    2014-01-01

    In the social sciences, latent traits often have a hierarchical structure, and data can be sampled from multiple levels. Both hierarchical latent traits and multilevel data can occur simultaneously. In this study, we developed a general class of item response theory models to accommodate both hierarchical latent traits and multilevel data. The…

  10. A short tutorial on item response theory in rheumatology

    NARCIS (Netherlands)

    Siemons, L.; Krishnan, E.

    2014-01-01

    OBJECTIVES: The aim is to familiarise physicians and researchers with the most important concepts of item response theory (IRT) and with its usefulness for improving test administration and data collection in health care. Special attention is given to the versatility of its use within the rheumatic

  11. A Speeded Item Response Model with Gradual Process Change

    Science.gov (United States)

    Goegebeur, Yuri; De Boeck, Paul; Wollack, James A.; Cohen, Allan S.

    2008-01-01

    An item response theory model for dealing with test speededness is proposed. The model consists of two random processes, a problem solving process and a random guessing process, with the random guessing gradually taking over from the problem solving process. The involved change point and change rate are considered random parameters in order to…

  12. Testing Linear Models for Ability Parameters in Item Response Models

    NARCIS (Netherlands)

    Glas, Cees A.W.; Hendrawan, Irene

    2005-01-01

    Methods for testing hypotheses concerning the regression parameters in linear models for the latent person parameters in item response models are presented. Three tests are outlined: A likelihood ratio test, a Lagrange multiplier test and a Wald test. The tests are derived in a marginal maximum like

  13. An Alternative Three-Parameter Logistic Item Response Model.

    Science.gov (United States)

    Pashley, Peter J.

    Birnbaum's three-parameter logistic function has become a common basis for item response theory modeling, especially within situations where significant guessing behavior is evident. This model is formed through a linear transformation of the two-parameter logistic function in order to facilitate a lower asymptote. This paper discusses an…
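
    The relationship described in this abstract, where the three-parameter logistic (3PL) curve is a linear transformation of the two-parameter logistic (2PL) curve that raises the lower asymptote, can be written out as a short sketch. The function names and the item parameter values are assumptions chosen for illustration, and the optional scaling constant (often 1.7) is omitted.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response.

    theta = ability, a = discrimination, b = difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_3pl(theta, a, b, c):
    """3PL probability: the 2PL curve rescaled linearly from (0, 1) to (c, 1).

    c is the lower asymptote (pseudo-guessing parameter).
    """
    return c + (1.0 - c) * p_2pl(theta, a, b)

# Hypothetical item: discrimination 1.2, difficulty 0.0, guessing 0.25.
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_3pl(theta, a=1.2, b=0.0, c=0.25), 3))
```

    At theta equal to the difficulty b the 2PL probability is 0.5, so the 3PL probability is c + (1 - c) / 2, halfway between the lower asymptote and 1.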

  14. Analysis of Individual "Test Of Astronomy STandards" (TOAST) Item Responses

    Science.gov (United States)

    Slater, Stephanie J.; Schleigh, Sharon Price; Stork, Debra J.

    2015-01-01

    The development of valid and reliable strategies to efficiently determine the knowledge landscape of introductory astronomy college students is an effort of great interest to the astronomy education community. This study examines individual item response rates from a widely used conceptual understanding survey, the Test Of Astronomy Standards…

  15. Using SAS PROC MCMC for Item Response Theory Models

    Science.gov (United States)

    Ames, Allison J.; Samonte, Kelli

    2015-01-01

    Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…

  16. An item response theory analysis of the narcissistic personality inventory.

    Science.gov (United States)

    Ackerman, Robert A; Donnellan, M Brent; Robins, Richard W

    2012-01-01

    This research uses item response theory methods to evaluate the Narcissistic Personality Inventory (NPI; Raskin & Terry, 1988). Analyses using the 2-parameter logistic model were conducted on the total score and the Corry, Merritt, Mrug, and Pamp (2008) and Ackerman et al. (2011) subscales for the NPI. In addition to offering precise information about the psychometric properties of the NPI item pool, these analyses generated insights that can be used to develop new measures of the personality constructs embedded within this frequently used inventory.

  17. Functionally unidimensional item response models for multivariate binary data

    DEFF Research Database (Denmark)

    Ip, Edward; Molenberghs, Geert; Chen, Shyh-Huei;

    2013-01-01

    The problem of fitting unidimensional item response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that have a strong dimension but also contain minor nuisance dimensions. Fitting a unidimensional model to such multidimensional data is believed to result in ability estimates that represent a combination of the major and minor dimensions. We conjecture that the underlying dimension for the fitted unidimensional model, which we call the functional dimension, represents a nonlinear projection. In this article we investigate ... tool. An example regarding a construct of desire for physical competency is used to illustrate the functional unidimensional approach.

  18. The Long-Term Sustainability of Different Item Response Theory Scaling Methods

    Science.gov (United States)

    Keller, Lisa A.; Keller, Robert R.

    2011-01-01

    This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…

  19. Item response modeling: an evaluation of the children's fruit and vegetable self-efficacy questionnaire

    Science.gov (United States)

    Perceived self-efficacy (SE) for eating fruit and vegetables (FV) is a key variable mediating FV change in interventions. This study applies item response modeling (IRM) to a fruit, juice and vegetable self-efficacy questionnaire (FVSEQ) previously validated with classical test theory (CTT) procedur...

  20. Dimensionality Assessment Using the Full-Information Item Bifactor Analysis for Graded Response Data: An Illustration with the State Metacognitive Inventory

    Science.gov (United States)

    Immekus, Jason C.; Imbrie, P. K.

    2008-01-01

    Dimensionality assessment using the full-information item bifactor model for graded response data is provided. The model applies to data in which each item relates to a general factor and one group factor. Specifically, alternative model specification within item response theory (IRT) is shown to test a scale's factor structure. For illustrative…

  1. Marginal Maximum Likelihood Estimation of Item Response Models in R

    Directory of Open Access Journals (Sweden)

    Matthew S. Johnson

    2007-02-01

    Item response theory (IRT) models are a class of statistical models used by researchers to describe the response behaviors of individuals to a set of categorically scored items. The most common IRT models can be classified as generalized linear fixed- and/or mixed-effect models. Although IRT models appear most often in the psychological testing literature, researchers in other fields have successfully utilized IRT-like models in a wide variety of applications. This paper discusses the three major methods of estimation in IRT and develops R functions utilizing the built-in capabilities of the R environment to find the marginal maximum likelihood estimates of the generalized partial credit model. The currently available R package ltm is also discussed.

  2. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  3. Item Response Theory Analysis and Differential Item Functioning across Age, Gender and Country of a Short Form of the Advanced Progressive Matrices

    Science.gov (United States)

    Chiesi, Francesca; Ciancaleoni, Matteo; Galli, Silvia; Morsanyi, Kinga; Primi, Caterina

    2012-01-01

    Item Response Theory (IRT) models were applied to investigate the psychometric properties of the Arthur and Day's Advanced Progressive Matrices-Short Form (APM-SF; 1994) [Arthur and Day (1994). "Development of a short form for the Raven Advanced Progressive Matrices test." "Educational and Psychological Measurement, 54," 395-403] in order to test…

  4. The challenges of fitting an item response theory model to the Social Anhedonia Scale.

    Science.gov (United States)

    Reise, Steven P; Horan, William P; Blanchard, Jack J

    2011-05-01

    This study explored the application of latent variable measurement models to the Social Anhedonia Scale (SAS; Eckblad, Chapman, Chapman, & Mishlove, 1982), a widely used and influential measure in schizophrenia-related research. Specifically, we applied unidimensional and bifactor item response theory (IRT) models to data from a community sample of young adults (n = 2,227). Ordinal factor analyses revealed that identifying a coherent latent structure in the 40-item SAS data was challenging due to (a) the presence of multiple small content clusters (e.g., doublets); (b) modest relations between those clusters, which, in turn, implies a general factor of only modest strength; (c) items that shared little variance with the majority of items; and (d) cross-loadings in bifactor solutions. Consequently, we conclude that SAS responses cannot be modeled accurately by either unidimensional or bifactor IRT models. Although the application of a bifactor model to a reduced 17-item set met with better success, significant psychometric and substantive problems remained. Results highlight the challenges of applying latent variable models to scales that were not originally designed to fit these models.

  5. Bookmark locations and item response model selection in the presence of local item dependence.

    Science.gov (United States)

    Skaggs, Gary

    2007-01-01

    The bookmark standard setting procedure is a popular method for setting performance standards on state assessment programs. This study reanalyzed data from an application of the bookmark procedure to a passage-based test that used the Rasch model to create the item ordered booklet. Several problems were noted in this implementation of the bookmark procedure, including disagreement among the SMEs about the correct order of items in the bookmark booklet, performance level descriptions of the passing standard being based on passage difficulty as well as item difficulty, and the presence of local item dependence within reading passages. Bookmark item locations were recalculated for the IRT three-parameter model and the multidimensional bifactor model. The results showed that the order of item locations was very similar for all three models when items of high difficulty and low discrimination were excluded. However, the items whose positions were the most discrepant between models were not the items that the SMEs disagreed about the most in the original standard setting. The choice of latent trait model did not address problems of item order disagreement. Implications for the use of the bookmark method in the presence of local item dependence are discussed.

  6. Latent Variable Selection for Multidimensional Item Response Theory Models via [Formula: see text] Regularization.

    Science.gov (United States)

    Sun, Jianan; Chen, Yunxiao; Liu, Jingchen; Ying, Zhiliang; Xin, Tao

    2016-12-01

    We develop a latent variable selection method for multidimensional item response theory models. The proposed method identifies latent traits probed by items of a multidimensional test. Its basic strategy is to impose an [Formula: see text] penalty term to the log-likelihood. The computation is carried out by the expectation-maximization algorithm combined with the coordinate descent algorithm. Simulation studies show that the resulting estimator provides an effective way in correctly identifying the latent structures. The method is applied to a real dataset involving the Eysenck Personality Questionnaire.

  7. Adult Attachment Ratings (AAR): an item response theory analysis.

    Science.gov (United States)

    Pilkonis, Paul A; Kim, Yookyung; Yu, Lan; Morse, Jennifer Q

    2014-01-01

    The Adult Attachment Ratings (AAR) include 3 scales for anxious, ambivalent attachment (excessive dependency, interpersonal ambivalence, and compulsive care-giving), 3 for avoidant attachment (rigid self-control, defensive separation, and emotional detachment), and 1 for secure attachment. The scales include items (ranging from 6-16 in their original form) scored by raters using a 3-point format (0 = absent, 1 = present, and 2 = strongly present) and summed to produce a total score. Item response theory (IRT) analyses were conducted with data from 414 participants recruited from psychiatric outpatient, medical, and community settings to identify the most informative items from each scale. The IRT results allowed us to shorten the scales to 5-item versions that are more precise and easier to rate because of their brevity. In general, the effective range of measurement for the scales was 0 to +2 SDs for each of the attachment constructs; that is, from average to high levels of attachment problems. Evidence for convergent and discriminant validity of the scales was investigated by comparing them with the Experiences of Close Relationships-Revised (ECR-R) scale and the Kobak Attachment Q-sort. The best consensus among self-reports on the ECR-R, informant ratings on the ECR-R, and expert judgments on the Q-sort and the AAR emerged for anxious, ambivalent attachment. Given the good psychometric characteristics of the scale for secure attachment, however, this measure alone might provide a simple alternative to more elaborate procedures for some measurement purposes. Conversion tables are provided for the 7 scales to facilitate transformation from raw scores to IRT-calibrated (theta) scores.

  8. Students' proficiency scores within multitrait item response theory

    Science.gov (United States)

    Scott, Terry F.; Schumayer, Daniel

    2015-12-01

    In this paper we present a series of item response models of data collected using the Force Concept Inventory. The Force Concept Inventory (FCI) was designed to poll the Newtonian conception of force viewed as a multidimensional concept, that is, as a complex of distinguishable conceptual dimensions. Several previous studies have developed single-trait item response models of FCI data; however, we feel that multidimensional models are also appropriate given the explicitly multidimensional design of the inventory. The models employed in the research reported here vary in both the number of fitting parameters and the number of underlying latent traits assumed. We calculate several model information statistics to ensure adequate model fit and to determine which of the models provides the optimal balance of information and parsimony. Our analysis indicates that all item response models tested, from the single-trait Rasch model through to a model with ten latent traits, satisfy the standard requirements of fit. However, analysis of model information criteria indicates that the five-trait model is optimal. We note that an earlier factor analysis of the same FCI data also led to a five-factor model. Furthermore the factors in our previous study and the traits identified in the current work match each other well. The optimal five-trait model assigns proficiency scores to all respondents for each of the five traits. We construct a correlation matrix between the proficiencies in each of these traits. This correlation matrix shows strong correlations between some proficiencies, and strong anticorrelations between others. We present an interpretation of this correlation matrix.
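    The trade-off between fit and parsimony described above can be sketched with two widely used information criteria. The abstract does not name the statistics used, so AIC and BIC are an assumption here, and the model labels, log-likelihoods and parameter counts are hypothetical values chosen only for illustration.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidates: (label, maximised log-likelihood, free parameters).
# These numbers are invented for the sketch, not taken from any study.
candidates = [
    ("model_A", -1050.0, 30),
    ("model_B", -980.0, 60),
    ("model_C", -970.0, 110),
]

# Pick the candidate with the smallest AIC.
best_by_aic = min(candidates, key=lambda m: aic(m[1], m[2]))
print(best_by_aic[0])
```

    A richer model always raises the log-likelihood; the criteria penalise the extra parameters so that the gain has to justify the added complexity.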

  9. Using the Nominal Response Model to Evaluate Response Category Discrimination in the PROMIS Emotional Distress Item Pools

    Science.gov (United States)

    Preston, Kathleen; Reise, Steven; Cai, Li; Hays, Ron D.

    2011-01-01

    The authors used a nominal response item response theory model to estimate category boundary discrimination (CBD) parameters for items drawn from the Emotional Distress item pools (Depression, Anxiety, and Anger) developed in the Patient-Reported Outcomes Measurement Information Systems (PROMIS) project. For polytomous items with ordered response…

  10. Empirically indistinguishable multidimensional IRT and locally dependent unidimensional item response models.

    Science.gov (United States)

    Ip, Edward Haksing

    2010-05-01

    Multidimensionality is a core concept in the measurement and analysis of psychological data. In personality assessment, for example, constructs are mostly theoretically defined as unidimensional, yet responses collected from the real world are almost always determined by multiple factors. Significant research efforts have concentrated on the use of simulated studies to evaluate the robustness of unidimensional item response models when applied to multidimensional data with a dominant dimension. In contrast, in the present paper, I report the result from a theoretical investigation that a multidimensional item response model is empirically indistinguishable from a locally dependent unidimensional model, of which the single dimension represents the actual construct of interest. A practical implication of this result is that multidimensional response data do not automatically require the use of multidimensional models. Circumstances under which the alternative approach of locally dependent unidimensional models may be useful are discussed.

  11. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    Science.gov (United States)

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  12. Analysis of Multiple Partially Ordered Responses to Belief Items with Don't Know Option.

    Science.gov (United States)

    Ip, Edward H; Chen, Shyh-Huei; Quandt, Sara A

    2016-06-01

    Understanding beliefs, values, and preferences of patients is a tenet of contemporary health sciences. This application was motivated by the analysis of multiple partially ordered set (poset) responses from an inventory on layman beliefs about diabetes. The partially ordered set arises because of two features in the data: first, the response options contain a Don't Know (DK) option, and second, there were two consecutive occasions of measurement. As predicted by the common sense model of illness, beliefs about diabetes were not necessarily stable across the two measurement occasions. Instead of analyzing the two occasions separately, we studied the joint responses across the occasions as a poset response. Few analytic methods exist for data structures other than ordered or nominal categories. Poset responses are routinely collapsed and then analyzed as either rank ordered or nominal data, leading to the loss of nuanced information that might be present within poset categories. In this paper we developed a general class of item response models for analyzing the poset data collected from the Common Sense Model of Diabetes Inventory. The inferential object of interest is the latent trait that indicates congruence of belief with the biomedical model. To apply an item response model to the poset diabetes inventory, we proved that a simple coding algorithm circumvents the requirement of writing new code, such that standard IRT software could be directly used for the purpose of item estimation and individual scoring. Simulation experiments were used to examine parameter recovery for the proposed poset model.

  13. Functionally Unidimensional Item Response Models for Multivariate Binary Data.

    Science.gov (United States)

    Ip, Edward H; Molenberghs, Geert; Chen, Shyh-Huei; Goegebeur, Yuri; De Boeck, Paul

    2013-07-01

    The problem of fitting unidimensional item response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that have a strong dimension but also contain minor nuisance dimensions. Fitting a unidimensional model to such multidimensional data is believed to result in ability estimates that represent a combination of the major and minor dimensions. We conjecture that the underlying dimension for the fitted unidimensional model, which we call the functional dimension, represents a nonlinear projection. In this article we investigate 2 issues: (a) can a proposed nonlinear projection track the functional dimension well, and (b) what are the biases in the ability estimate and the associated standard error when estimating the functional dimension? To investigate the second issue, the nonlinear projection is used as an evaluative tool. An example regarding a construct of desire for physical competency is used to illustrate the functional unidimensional approach.

  14. [Unfolding item response model using best-worst scaling].

    Science.gov (United States)

    Ikehara, Kazuya

    2015-02-01

    In attitude measurement and sensory tests, the unfolding model is typically used. In this model, response probability is formulated by the distance between the person and the stimulus. In this study, we proposed an unfolding item response model using best-worst scaling (BWU model), in which a person chooses the best and worst stimulus among repeatedly presented subsets of stimuli. We also formulated an unfolding model using best scaling (BU model), and compared the accuracy of estimates between the BU and BWU models. A simulation experiment showed that the BWU model performed much better than the BU model in terms of bias and root mean square errors of estimates. With reference to Usami (2011), the proposed models were applied to actual data to measure attitudes toward tardiness. Results indicated high similarity between stimuli estimates generated with the proposed models and those of Usami (2011).
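    The core idea of an unfolding model, that a stimulus closer to the person is more likely to be chosen, can be sketched as follows. This is not the BU or BWU model from the abstract: the squared-distance utility and the softmax choice rule are assumptions made for illustration, and the function name is hypothetical.

```python
import math

def best_choice_probabilities(person: float, stimuli: list[float]) -> list[float]:
    """Probability that each stimulus in a presented subset is picked as 'best'.

    Assumed form: utility = -(person - stimulus)^2, turned into
    probabilities with a softmax over the presented subset.
    """
    utilities = [-(person - s) ** 2 for s in stimuli]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# A person located at 0.5 choosing among three stimulus locations.
probs = best_choice_probabilities(0.5, [-1.0, 0.4, 2.0])
print(probs)
```

    Under these assumptions the stimulus at 0.4, the one nearest the person, receives the largest probability.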

  15. Is Bloom's Taxonomy reflected in the response pattern to MCQ items?

    Science.gov (United States)

    Huxham, G J; Naeraa, N

    1980-01-01

    The purpose of this study was to find out whether taxonomic classification of MCQ items reflected differences in student behaviour. The data from one of this University's official open-book exams, in which students answer sixty MCQ items distributed over twelve content-areas of physiology were examined. The responses from all 153 candidates were then subjected to factor analysis. Analysis of individual item scores was unrewarding. Analysis of scores for item-groups based on taxonomy and content resulted in the identification of three factors, which carried predominant loadings from recall or look-up items, interpretation items, and problem-solving items, respectively.

  16. Improving Item Response Theory Model Calibration by Considering Response Times in Psychological Tests

    Science.gov (United States)

    Ranger, Jochen; Kuhn, Jorg-Tobias

    2012-01-01

    Research findings indicate that response times in personality scales are related to the trait level according to the so-called speed-distance hypothesis. Against this background, Ferrando and Lorenzo-Seva proposed a latent trait model for the responses and response times in a test. The model consists of two components, a standard item response…

  17. Influence of Item Direction on Student Responses in Attitude Assessment.

    Science.gov (United States)

    Campbell, Noma Jo; Grissom, Stephen

    To investigate the effects of wording in attitude test items, a five-point Likert-type rating scale was administered to 173 undergraduate education majors. The test measured attitudes toward college and self, and contained 38 positively-worded items. Thirty-eight negatively-worded items were also written to parallel the positive statements.…

  18. Estimation of Item Response Theory Parameters in the Presence of Missing Data

    Science.gov (United States)

    Finch, Holmes

    2008-01-01

    Missing data are a common problem in a variety of measurement settings, including responses to items on both cognitive and affective assessments. Researchers have shown that such missing data may create problems in the estimation of item difficulty parameters in the Item Response Theory (IRT) context, particularly if they are ignored. At the same…

  19. Reevaluation of the Amsterdam Inventory for Auditory Disability and Handicap Using Item Response Theory

    Science.gov (United States)

    Hospers, J. Mirjam Boeschen; Smits, Niels; Smits, Cas; Stam, Mariska; Terwee, Caroline B.; Kramer, Sophia E.

    2016-01-01

    Purpose: We reevaluated the psychometric properties of the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1995) using item response theory. Item response theory describes item functioning along an ability continuum. Method: Cross-sectional data from 2,352 adults with and without hearing…

  20. Limits on Log Odds Ratios for Unidimensional Item Response Theory Models

    Science.gov (United States)

    Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip

    2007-01-01

    Bounds are established for log odds ratios (log cross-product ratios) involving pairs of items for item response models. First, expressions for bounds on log odds ratios are provided for one-dimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are…
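    For reference, the log odds ratio (log cross-product ratio) for a pair of dichotomous items can be computed from the 2x2 table of joint response counts. The sketch below shows only this definition, not the bounds derived in the paper; the function name and the counts are hypothetical.

```python
import math

def log_odds_ratio(n11: int, n10: int, n01: int, n00: int) -> float:
    """Log cross-product ratio for two binary items.

    n11: both items correct, n10: first correct only,
    n01: second correct only, n00: both incorrect.
    All four cells must be positive.
    """
    return math.log((n11 * n00) / (n10 * n01))

# Invented counts for illustration.
value = log_odds_ratio(10, 5, 5, 10)
print(value)
```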

  1. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    Joseph P. Eimicke

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  2. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Kleinman, Marjorie; Eimicke, Joseph P.; Crane, Paul K.; Jones, Richard N.; Lai, Jin-shei; Choi, Seung W.; Hays, Ron D.; Reeve, Bryce B.; Reise, Steven P.; Pilkonis, Paul A.; Cella, David

    2009-01-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were “I felt like crying” and “I had trouble enjoying things that I used to enjoy.” The item, “I felt I had no energy,” was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals. PMID:20336180

  3. Resolving Dimensionality Problems With WHOQOL-BREF Item Responses.

    Science.gov (United States)

    Perera, Harsha N; Izadikhah, Zahra; O'Connor, Peter; McIlveen, Peter

    2016-11-20

    The World Health Organization Quality of Life Scale (WHOQOL-BREF) is predicated on a multidimensional perspective on quality of life (QOL); yet studies are unclear about the latent structure underlying responses. This article reports on a study conducted to investigate the structure of WHOQOL-BREF scores. Competing latent structures of the data were examined in a general population sample. In addition, the complete factorial invariance of the retained model was investigated across gender. We also investigated latent mean differences in the QOL dimensions over age as well as age by gender interaction effects. Based on responses to the WHOQOL-BREF, support was found for a bifactor exploratory structural equation modeling representation of the data. This measurement structure accounts for construct-relevant multidimensionality in item responses due to the presence of general and specific factors underlying the data and the fallibility of indicators as pure reflections of only the single constructs they are purported to measure. Furthermore, support was found for measurement and structural invariance across gender. Finally, evidence was obtained for a curvilinear relationship of age with QOL, characterized by a midlife nadir. Taken together, the results of the study yield important validation data for the WHOQOL-BREF and tentatively resolve the dimensionality issues in the measurement of QOL using this instrument.

  4. Self efficacy for fruit, vegetable and water intakes: Expanded and abbreviated scales from item response modeling analyses

    Directory of Open Access Journals (Sweden)

    Cullen Karen W

    2010-03-01

    Objective: To improve an existing measure of fruit and vegetable intake self efficacy by including items that varied on levels of difficulty, and to test a corresponding measure of water intake self efficacy. Design: Cross sectional assessment. Items were modified to have easy, moderate and difficult levels of self efficacy. Classical test theory and item response modeling were applied. Setting: One middle school at each of seven participating sites (Houston TX, Irvine CA, Philadelphia PA, Pittsburgh PA, Portland OR, rural NC, and San Antonio TX). Subjects: 714 6th grade students. Results: Adding items to reflect level (low, medium, high) of self efficacy for fruit and vegetable intake achieved scale reliability and validity comparable to existing scales, but the distribution of items across the latent variable did not improve. Selecting items from among clusters of items at similar levels of difficulty along the latent variable resulted in an abbreviated scale with psychometric characteristics comparable to the full scale, except for reliability. Conclusions: The abbreviated scale can reduce participant burden. Additional research is necessary to generate items that better distribute across the latent variable. Additional items may need to tap confidence in overcoming more diverse barriers to dietary intake.

  5. Determining differential item functioning and its effect on the test scores of selected pib indexes, using item response theory techniques

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2001-02-01

    The objective of this article is to present the results of an investigation into the item and test characteristics of two tests of the Potential Index Batteries (PIB) in terms of differential item functioning (DIF) and the effect thereof on test scores of different race groups. The English Vocabulary (Index 12) and Spelling Tests (Index 22) of the PIB were analysed for white, black and coloured South Africans. Item response theory (IRT) methods were used to identify items which function differentially for white, black and coloured race groups. Opsomming (Summary): The aim of this article is to report the results of an investigation into the item and test characteristics of two PIB (Potential Index Batteries) tests in terms of item bias and the influence it has on the test scores of race groups. The English Vocabulary (Index 12) and Spelling Tests (Index 22) of the Potential Index Batteries (PIB) were analysed with respect to white, black and coloured South Africans. Item response theory (IRT) was used to identify items that can be regarded as biased (DIF) for the respective race groups.

  6. Using item response theory to measure extreme response style in marketing research: a global investigation

    NARCIS (Netherlands)

    Jong, de Martijn G.; Steenkamp, Jan-Benedict E.M.; Fox, Jean-Paul; Baumgartner, Hans

    2008-01-01

    Extreme response style (ERS) is an important threat to the validity of survey-based marketing research. In this article, the authors present a new item response theory–based model for measuring ERS. This model contributes to the ERS literature in two ways. First, the method improves on existing proc

  7. Analyzing Multiple-Choice Questions by Model Analysis and Item Response Curves

    Science.gov (United States)

    Wattanakasiwich, P.; Ananta, S.

    2010-07-01

    In physics education research, the main goal is to improve physics teaching so that most students understand physics conceptually and are able to apply concepts in solving problems. Therefore many multiple-choice instruments have been developed to probe students' conceptual understanding of various topics. Two techniques, model analysis and item response curves, were used to analyze students' responses to the Force and Motion Conceptual Evaluation (FMCE). For this study, FMCE data from more than 1000 students at Chiang Mai University were collected over the past three years. With model analysis, we can obtain students' alternative knowledge and the probabilities that students use such knowledge in a range of equivalent contexts. The model analysis consists of two algorithms: concentration factor and model estimation. This paper presents only results from using the model estimation algorithm to obtain a model plot. The plot helps to identify whether a class model state is in the misconception region. An item response curve (IRC), derived from item response theory, plots the percentage of students selecting a particular choice against their total score. Pros and cons of both techniques are compared and discussed.
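    The item response curve described above can be tabulated directly from raw responses: group students by total score and, within each group, compute the percentage selecting each choice. The following is a minimal sketch for a single item; the function name and the toy data are hypothetical, not taken from the FMCE study.

```python
from collections import Counter, defaultdict

def item_response_curve(choices, total_scores):
    """Percentage of students choosing each option, per total score.

    choices: the option each student selected on one item.
    total_scores: each student's total test score, in the same order.
    Returns {score: {option: percentage}} with scores in ascending order.
    """
    counts = defaultdict(Counter)
    for choice, score in zip(choices, total_scores):
        counts[score][choice] += 1
    curve = {}
    for score in sorted(counts):
        n_students = sum(counts[score].values())
        curve[score] = {
            option: 100.0 * k / n_students
            for option, k in counts[score].items()
        }
    return curve

# Toy data: six students, one item with options A, B and C.
choices = ["A", "B", "A", "C", "A", "A"]
total_scores = [1, 1, 2, 2, 3, 3]
irc = item_response_curve(choices, total_scores)
print(irc)
```

    Plotting each option's percentage against total score gives one curve per choice, which shows whether a given distractor attracts mainly low-scoring or mid-scoring students.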

  8. Estimating Non-Normal Latent Trait Distributions within Item Response Theory Using True and Estimated Item Parameters

    Science.gov (United States)

    Sass, D. A.; Schmitt, T. A.; Walker, C. M.

    2008-01-01

    Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal…

  9. A Comparison of Item Parameter Standard Error Estimation Procedures for Unidimensional and Multidimensional Item Response Theory Modeling

    Science.gov (United States)

    Paek, Insu; Cai, Li

    2014-01-01

    The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…

  10. Practical methods for dealing with 'not applicable' item responses in the AMC Linear Disability Score project

    Science.gov (United States)

    Holman, Rebecca; Glas, Cees AW; Lindeboom, Robert; Zwinderman, Aeilko H; de Haan, Rob J

    2004-01-01

    Background: Whenever questionnaires are used to collect data on constructs, such as functional status or health related quality of life, it is unlikely that all respondents will respond to all items. This paper examines ways of dealing with responses in a 'not applicable' category to items included in the AMC Linear Disability Score (ALDS) project item bank. Methods: The data examined in this paper come from the responses of 392 respondents to 32 items and form part of the calibration sample for the ALDS item bank. The data are analysed using the one-parameter logistic item response theory model. The four practical strategies for dealing with this type of response are: cold deck imputation; hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. Results: The item and respondent population parameter estimates were very similar for the strategies involving hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. The estimates obtained using the cold deck imputation method were substantially different. Conclusions: The cold deck imputation method was not considered suitable for use in the ALDS item bank. The other three methods described can be usefully implemented in the ALDS item bank, depending on the purpose of the data analysis to be carried out. These three methods may be useful for other data sets examining similar constructs, when item response theory based methods are used. PMID:15200681
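    One of the strategies listed above, treating 'not applicable' responses as if the items had never been offered, amounts to leaving those items out of a respondent's likelihood. The sketch below illustrates this under the one-parameter logistic model for dichotomous items; the function names, the use of None to mark 'not applicable', and the example values are assumptions for illustration, not the ALDS implementation.

```python
import math

def p_positive(theta: float, b: float) -> float:
    """One-parameter logistic model: P(X = 1 | ability theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_likelihood(theta, difficulties, responses):
    """Log-likelihood of one respondent's answers.

    responses holds 1, 0, or None; None marks 'not applicable' and the
    item is skipped, as if it had never been offered.
    """
    total = 0.0
    for b, x in zip(difficulties, responses):
        if x is None:
            continue  # item contributes nothing to the likelihood
        p = p_positive(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

# Three items of equal difficulty; the second is 'not applicable'.
ll = log_likelihood(0.0, [0.0, 0.0, 0.0], [1, None, 0])
print(ll)
```

    Imputation-based strategies would instead replace each None with a substituted 0 or 1 before the likelihood is computed.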

  11. Practical methods for dealing with 'not applicable' item responses in the AMC Linear Disability Score project

    Directory of Open Access Journals (Sweden)

    Zwinderman Aeilko H

    2004-06-01

    Background: Whenever questionnaires are used to collect data on constructs, such as functional status or health related quality of life, it is unlikely that all respondents will respond to all items. This paper examines ways of dealing with responses in a 'not applicable' category to items included in the AMC Linear Disability Score (ALDS) project item bank. Methods: The data examined in this paper come from the responses of 392 respondents to 32 items and form part of the calibration sample for the ALDS item bank. The data are analysed using the one-parameter logistic item response theory model. The four practical strategies for dealing with this type of response are: cold deck imputation; hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. Results: The item and respondent population parameter estimates were very similar for the strategies involving hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. The estimates obtained using the cold deck imputation method were substantially different. Conclusions: The cold deck imputation method was not considered suitable for use in the ALDS item bank. The other three methods described can be usefully implemented in the ALDS item bank, depending on the purpose of the data analysis to be carried out. These three methods may be useful for other data sets examining similar constructs, when item response theory based methods are used.

  12. Quantifying Local, Response Dependence between Two Polytomous Items Using the Rasch Model

    Science.gov (United States)

    Andrich, David; Humphry, Stephen M.; Marais, Ida

    2012-01-01

    Models of modern test theory imply statistical independence among responses, generally referred to as "local independence." One violation of local independence occurs when the response to one item governs the response to a subsequent item. Expanding on a formulation of this kind of violation as a process in the dichotomous Rasch model,…

  13. The dimensionality of the Edinburgh Handedness Inventory: An analysis with models of the item response theory.

    Science.gov (United States)

    Büsch, Dirk; Hagemann, Norbert; Bender, Nils

    2010-11-01

    Handedness is frequently measured with sum scores or quotients taken from laterality questionnaires like the Edinburgh Handedness Inventory (EHI). In classical test theory such data cannot be used to confirm either the unidimensionality (i.e., quantitative differentiation with the poles left-handed and right-handed) or multidimensionality (i.e., typological differentiation between left-, right-, and mixed-handers) of this personal characteristic. This study uses item response theory models to test the construct validity of the EHI on an item level in order to gather empirical support for the differentiation of handedness as well as the appropriateness of the items and the response format. The EHI was given to 540 participants (303 male and 237 female) aged 17-37 years. Results of mixed-Rasch analyses revealed that the best model was a two-class solution; that is, left- and right-handers (types) with quantitative differences between persons. Hence, unlike earlier model tests, this rejects both the unidimensionality of the handedness construct and the need to consider so-called mixed-handers. It is proposed that mixed-Rasch analyses should be applied more frequently to test the construct validity of other as well as more extensive handedness questionnaires.

  14. Dimensionality of the UWES-17: An item response modelling analysis

    Directory of Open Access Journals (Sweden)

    Deon P. de Bruin

    2013-03-01

    Orientation: Questionnaires, particularly the Utrecht Work Engagement Scale (UWES-17), are an almost standard method by which to measure work engagement. Conflicting evidence regarding the dimensionality of the UWES-17 has led to confusion regarding the interpretation of scores. Research purpose: The main focus of this study was to use the Rasch model to provide insight into the dimensionality of the UWES-17, and to assess whether work engagement should be interpreted as one single overall score, three separate scores, or a combination. Motivation for the study: It is unclear whether a summative score is more representative of work engagement or whether scores are more meaningful when interpreted for each dimension separately. Previous work relied on confirmatory factor analysis; the potential of item response models has not been tapped. Research design: A quantitative cross-sectional survey design approach was used. Participants, 2429 employees of a South African Information and Communication Technology (ICT) company, completed the UWES-17. Main findings: Findings indicate that work engagement should be treated as a unidimensional construct: individual scores should be interpreted in a summative manner, giving a single global score. Practical/managerial implications: Users of the UWES-17 may interpret a single, summative score for work engagement. Findings of this study should also contribute towards standardising UWES-17 scores, allowing meaningful comparisons to be made. Contribution/value-add: The findings will benefit researchers, organisational consultants and managers. Clarity on dimensionality and interpretation of work engagement will assist researchers in future studies. Managers and consultants will be able to make better-informed decisions when using work engagement data.

  15. Difference in method of administration did not significantly impact item response

    DEFF Research Database (Denmark)

    Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

    2014-01-01

    PURPOSE: To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS). METHODS: Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function...... assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods. RESULTS: Multigroup...... confirmatory factor analysis supported equivalence of factor structure across MOA. Analyses by item response theory found no differences in item location parameters and strongly supported the equivalence of scores across MOA. CONCLUSIONS: We found no statistically or clinically significant differences in score...

  16. Modelling non-ignorable missing-data mechanisms with item response theory models

    NARCIS (Netherlands)

    Holman, Rebecca; Glas, Cees A.W.

    2005-01-01

    A model-based procedure for assessing the extent to which missing data can be ignored and handling non-ignorable missing data is presented. The procedure is based on item response theory modelling. As an example, the approach is worked out in detail in conjunction with item response data modelled us

  17. Stochastic Approximation Methods for Latent Regression Item Response Models. Research Report. ETS RR-09-09

    Science.gov (United States)

    von Davier, Matthias; Sinharay, Sandip

    2009-01-01

    This paper presents an application of a stochastic approximation EM-algorithm using a Metropolis-Hastings sampler to estimate the parameters of an item response latent regression model. Latent regression models are extensions of item response theory (IRT) to a 2-level latent variable model in which covariates serve as predictors of the…

  18. The Role of Psychometric Modeling in Test Validation: An Application of Multidimensional Item Response Theory

    Science.gov (United States)

    Schilling, Stephen G.

    2007-01-01

    In this paper the author examines the role of item response theory (IRT), particularly multidimensional item response theory (MIRT) in test validation from a validity argument perspective. The author provides justification for several structural assumptions and interpretations, taking care to describe the role he believes they should play in any…

  19. A Polytomous Item Response Theory Analysis of Social Physique Anxiety Scale

    Science.gov (United States)

    Fletcher, Richard B.; Crocker, Peter

    2014-01-01

    The present study investigated the social physique anxiety scale's factor structure and item properties using confirmatory factor analysis and item response theory. An additional aim was to identify differences in response patterns between groups (gender). A large sample of high school students aged 11-15 years (N = 1,529) consisting of n =…

  20. What is the Ability Emotional Intelligence Test (MSCEIT) good for? An evaluation using item response theory.

    Directory of Open Access Journals (Sweden)

    Marina Fiori

    Full Text Available The ability approach has been indicated as promising for advancing research in emotional intelligence (EI). However, there is scarcity of tests measuring EI as a form of intelligence. The Mayer Salovey Caruso Emotional Intelligence Test, or MSCEIT, is among the few available and the most widespread measure of EI as an ability. This implies that conclusions about the value of EI as a meaningful construct and about its utility in predicting various outcomes mainly rely on the properties of this test. We tested whether individuals who have the highest probability of choosing the most correct response on any item of the test are also those who have the strongest EI ability. Results showed that this is not the case for most items: The answer indicated by experts as the most correct in several cases was not associated with the highest ability; furthermore, items appeared too easy to challenge individuals high in EI. Overall results suggest that the MSCEIT is best suited to discriminate persons at the low end of the trait. Results are discussed in light of applied and theoretical considerations.

  1. What is the Ability Emotional Intelligence Test (MSCEIT) good for? An evaluation using item response theory.

    Science.gov (United States)

    Fiori, Marina; Antonietti, Jean-Philippe; Mikolajczak, Moira; Luminet, Olivier; Hansenne, Michel; Rossier, Jérôme

    2014-01-01

    The ability approach has been indicated as promising for advancing research in emotional intelligence (EI). However, there is scarcity of tests measuring EI as a form of intelligence. The Mayer Salovey Caruso Emotional Intelligence Test, or MSCEIT, is among the few available and the most widespread measure of EI as an ability. This implies that conclusions about the value of EI as a meaningful construct and about its utility in predicting various outcomes mainly rely on the properties of this test. We tested whether individuals who have the highest probability of choosing the most correct response on any item of the test are also those who have the strongest EI ability. Results showed that this is not the case for most items: The answer indicated by experts as the most correct in several cases was not associated with the highest ability; furthermore, items appeared too easy to challenge individuals high in EI. Overall results suggest that the MSCEIT is best suited to discriminate persons at the low end of the trait. Results are discussed in light of applied and theoretical considerations.

  2. Secondary Psychometric Examination of the Dimensional Obsessive-Compulsive Scale: Classical Testing, Item Response Theory, and Differential Item Functioning.

    Science.gov (United States)

    Thibodeau, Michel A; Leonard, Rachel C; Abramowitz, Jonathan S; Riemann, Bradley C

    2015-12-01

    The Dimensional Obsessive-Compulsive Scale (DOCS) is a promising measure of obsessive-compulsive disorder (OCD) symptoms but has received minimal psychometric attention. We evaluated the utility and reliability of DOCS scores. The study included 832 students and 300 patients with OCD. Confirmatory factor analysis supported the originally proposed four-factor structure. DOCS total and subscale scores exhibited good to excellent internal consistency in both samples (α = .82 to α = .96). Patient DOCS total scores reduced substantially during treatment (t = 16.01, d = 1.02). DOCS total scores discriminated between students and patients (sensitivity = 0.76, 1 - specificity = 0.23). The measure did not exhibit gender-based differential item functioning as tested by Mantel-Haenszel chi-square tests. Expected response options for each item were plotted as a function of item response theory and demonstrated that DOCS scores incrementally discriminate OCD symptoms ranging from low to extremely high severity. Incremental differences in DOCS scores appear to represent unbiased and reliable differences in true OCD symptom severity.
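
    The Mantel-Haenszel chi-square test mentioned in this abstract can be sketched in a few lines. This is a minimal illustration for a dichotomously scored item, not the authors' analysis: the function name `mantel_haenszel`, the table layout, and the example counts are all assumptions made here, and respondents are assumed to have been matched into strata by total score beforehand.

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel DIF statistics for one dichotomous item.

    tables: one (a, b, c, d) tuple per matched score stratum, where
      a = reference group, item endorsed     b = reference group, not endorsed
      c = focal group, item endorsed         d = focal group, not endorsed
    Returns (chi-square with continuity correction, common odds ratio).
    Hypothetical helper: raises ZeroDivisionError on degenerate tables.
    """
    sum_a = sum_expected = sum_var = num = den = 0.0
    for a, b, c, d in tables:
        total = a + b + c + d
        if total < 2:
            continue  # a stratum with fewer than two people carries no information
        n_ref, n_foc = a + b, c + d
        m_yes, m_no = a + c, b + d
        sum_a += a
        sum_expected += n_ref * m_yes / total
        sum_var += n_ref * n_foc * m_yes * m_no / (total * total * (total - 1))
        num += a * d / total
        den += b * c / total
    chi2 = max(0.0, abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    return chi2, num / den


# Invented counts: identical endorsement rates in both groups within each stratum.
chi2, odds_ratio = mantel_haenszel([(10, 10, 10, 10), (20, 5, 20, 5)])
print(chi2, odds_ratio)
```

    A common odds ratio near 1 and a small chi-square are what an absence of gender-based DIF would look like for a single item under these assumptions.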

  3. Examining faking on personality inventories using unfolding item response theory models.

    Science.gov (United States)

    Scherbaum, Charles A; Sabet, Jennifer; Kern, Michael J; Agnello, Paul

    2013-01-01

    A concern about personality inventories in diagnostic and decision-making contexts is that individuals will fake. Although there is extensive research on faking, little research has focused on how perceptions of personality items change when individuals are faking or responding honestly. This research demonstrates how the delta parameter from the generalized graded unfolding item response theory model can be used to examine how individuals' perceptions about personality items might change when responding honestly or when faking. The results indicate that perceptions changed from honest to faking conditions for several neuroticism items. The direction of the change varied, indicating that faking can operate to increase or decrease scores within a personality factor.

  4. An NCME Instructional Module on Estimating Item Response Theory Models Using Markov Chain Monte Carlo Methods

    Science.gov (United States)

    Kim, Jee-Seon; Bolt, Daniel M.

    2007-01-01

    The purpose of this ITEMS module is to provide an introduction to Markov chain Monte Carlo (MCMC) estimation for item response models. A brief description of Bayesian inference is followed by an overview of the various facets of MCMC algorithms, including discussion of prior specification, sampling procedures, and methods for evaluating chain…
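
    As a rough illustration of the kind of sampler such a module introduces, the sketch below runs a random-walk Metropolis-Hastings chain for a single person's ability under a Rasch model. It is not taken from the module: the item difficulties are treated as known, the prior is assumed to be standard normal, and the names `log_posterior` and `sample_theta` are hypothetical.

```python
import math
import random


def log_posterior(theta, responses, difficulties):
    """Unnormalised log posterior of ability theta under a Rasch model."""
    lp = -0.5 * theta * theta  # standard normal prior, constant dropped
    for x, b in zip(responses, difficulties):
        z = theta - b
        lp += x * z - math.log1p(math.exp(z))  # log-likelihood of response x
    return lp


def sample_theta(responses, difficulties, n_iter=5000, burn_in=1000,
                 step=1.0, seed=0):
    """Random-walk Metropolis-Hastings draws of theta after burn-in."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, responses, difficulties)
    draws = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        candidate = log_posterior(proposal, responses, difficulties)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws


draws = sample_theta([1, 1, 0, 1, 0], [-1.0, -0.5, 0.0, 0.5, 1.0])
print(sum(draws) / len(draws))  # posterior mean estimate of theta
```

    In practice the retained draws would also be checked for convergence before being summarised; that step is omitted here.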

  5. Characteristics of Items in the Eysenck Personality Inventory Which Affect Responses When Students Simulate

    Science.gov (United States)

    Power, R. P.; Macrae, K. D.

    1977-01-01

    A large sample of students completed Form A of the Eysenck Personality Inventory, and four subgroups were later asked to simulate extraversion, introversion, neuroticism or stability. It was found that subjects could simulate these four personalities successfully. The changes in individual item responses were correlated with the items' factor…

  6. Scale construction and evaluation in practice : A review of factor analysis versus item response theory applications

    NARCIS (Netherlands)

    Ten Holt, J.C.; van Duijn, M.A.J.; Boomsma, A.

    2010-01-01

    In scale construction and evaluation, factor analysis (FA) and item response theory (IRT) are two methods frequently used to determine whether a set of items reliably measures a latent variable. In a review of 41 published studies we examined which methodology – FA or IRT – was used, and what resear

  7. Asymptotic Properties of Induced Maximum Likelihood Estimates of Nonlinear Models for Item Response Variables: The Finite-Generic-Item-Pool Case.

    Science.gov (United States)

    Jones, Douglas H.

    The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…

  8. A model of hippocampal spiking responses to items during learning of a context-dependent task

    Directory of Open Access Journals (Sweden)

    Florian Raudies

    2014-09-01

    Full Text Available Single unit recordings in the rat hippocampus have demonstrated shifts in the specificity of spiking activity during learning of a contextual item-reward association task. In this task, rats received reward for responding to different items dependent upon the context an item appeared in, but not dependent upon the location an item appeared at. Initially, neurons in the rat hippocampus primarily show firing based on place, but as the rat learns the task this firing becomes more selective for items. We simulated this effect using a simple circuit model with discrete inputs driving spiking activity representing place and item followed sequentially by a discrete representation of the motor actions involving a response to an item (digging for food) or the movement to a different item (movement to a different pot for food). We implemented spiking replay in the network representing neural activity observed during sharp-wave ripple events, and modified synaptic connections based on a simple representation of spike-timing dependent synaptic plasticity. This simple network was able to consistently learn the context-dependent responses, and transitioned from dominant coding of place to a gradual increase in specificity to items consistent with analysis of the experimental data. In addition, the model showed an increase in specificity toward context. The increase of selectivity in the model is accompanied by an increase in binariness of the synaptic weights for cells that are part of the functional network.

  9. Validity of a Diagnostic Scale for Acupuncture: Application of the Item Response Theory to the Five Viscera Score

    Directory of Open Access Journals (Sweden)

    Taro Tomura

    2013-01-01

    Full Text Available In acupuncture therapy, diagnosis, acupoints, and stimulation for patients with the same illness are often inconsistent among Traditional Chinese Medicine (TCM) practitioners. This is in part due to the paucity of evidence-based diagnostic methods in TCM. To solve this problem, the establishment of a validated diagnostic tool is essential. We first applied the Item Response Theory (IRT) model to the Five Viscera Score (FVS) to test its validity by evaluating the ability of the questionnaire items to identify an individual’s latent traits. Next, the health-related QOL scale (SF-36), a suitable instrument for evaluating acupuncture therapy, was administered to evaluate whether the FVS can be used to make a health-related diagnosis. All 20 items of the FVS had adequate item discrimination, and 13 items had high item discrimination power. Measurement accuracy was suited for application in a range of individuals, from healthy to symptomatic. When the FVS and SF-36 were administered to other subjects, some of whom overlapped with the first subjects, we found an association between the two scales, and the same findings were obtained when symptomatic and asymptomatic subjects were compared regardless of age and sex. In conclusion, the FVS may be effective in clinical diagnosis.

  10. Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

    Science.gov (United States)

    Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

    2013-01-01

    In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…

  11. Pain and distress caused by endotracheal suctioning in neonates is better quantified by behavioural than physiological items: a comparison based on item response theory modelling.

    Science.gov (United States)

    Välitalo, Pyry A J; van Dijk, Monique; Krekels, Elke H J; Gibbins, Sharyn; Simons, Sinno H P; Tibboel, Dick; Knibbe, Catherijne A J

    2016-08-01

    Pain cannot be directly measured in neonates. Therefore, scores based on indirect behavioural signals such as crying, or physiological signs such as blood pressure, are used to quantify neonatal pain both in clinical practice and in clinical studies. The aim of this study was to determine which of the physiological and behavioural items of 2 validated pain assessment scales (COMFORT and premature infant pain profile) are best able to detect pain during endotracheal and nasal suctioning in ventilated newborns. We analysed a total of 516 PIPP and COMFORT scores from 118 newborns. A graded response model was built to describe the data and item information was calculated for each of the behavioural and physiological items. We found that the graded response model was able to well describe the data, as judged by agreement between the observed data and model simulations. Furthermore, a good agreement was found between the pain estimated by the graded response model and the investigator-assessed visual analogue scale scores (Spearman rho correlation coefficient = 0.80). The information scores for the behavioural items ranged from 1.4 to 27.2 and from 0.0282 to 0.131 for physiological items. In these data with mild to moderate pain levels, behavioural items were vastly more informative of pain and distress than were physiological items. The items that were the most informative of pain are COMFORT items "calmness/agitation," "alertness," and "facial tension."
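
    The item information figures reported in this abstract come from a graded response model. The sketch below shows how category probabilities and item information are computed under Samejima's graded response model for one item. The function names and all parameter values are invented for illustration and are not estimates from the study.

```python
import math


def _cumulative(theta, a, thresholds):
    """P(response >= k) for k = 0..m, padded with 1 at the start and 0 at the end."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]


def grm_category_probs(theta, a, thresholds):
    """Probability of each ordered response category at ability theta."""
    cum = _cumulative(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def grm_item_information(theta, a, thresholds):
    """Fisher information of the item: sum over categories of P_k'^2 / P_k."""
    cum = _cumulative(theta, a, thresholds)
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p > 0.0:
            info += dp * dp / p
    return info


# Hypothetical items sharing thresholds but differing in discrimination.
print(grm_item_information(0.0, 2.0, [-1.0, 0.0, 1.0]))  # sharply discriminating item
print(grm_item_information(0.0, 0.5, [-1.0, 0.0, 1.0]))  # weakly discriminating item
```

    With the thresholds held fixed, the item with the larger discrimination parameter yields much more information, which is the kind of contrast the abstract reports between behavioural and physiological items.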

  12. Modeling and Testing Differential Item Functioning in Unidimensional Binary Item Response Models with a Single Continuous Covariate: A Functional Data Analysis Approach.

    Science.gov (United States)

    Liu, Yang; Magnus, Brooke E; Thissen, David

    2016-06-01

    Differential item functioning (DIF), referring to between-group variation in item characteristics above and beyond the group-level disparity in the latent variable of interest, has long been regarded as an important item-level diagnostic. The presence of DIF impairs the fit of the single-group item response model being used, and calls for either model modification or item deletion in practice, depending on the mode of analysis. Methods for testing DIF with continuous covariates, rather than categorical grouping variables, have been developed; however, they are restrictive in parametric forms, and thus are not sufficiently flexible to describe complex interaction among latent variables and covariates. In the current study, we formulate the probability of endorsing each test item as a general bivariate function of a unidimensional latent trait and a single covariate, which is then approximated by a two-dimensional smoothing spline. The accuracy and precision of the proposed procedure are evaluated via Monte Carlo simulations. If anchor items are available, we propose an extended model that simultaneously estimates item characteristic functions (ICFs) for anchor items, ICFs conditional on the covariate for non-anchor items, and the latent variable density conditional on the covariate, all using regression splines. A permutation DIF test is developed, and its performance is compared to the conventional parametric approach in a simulation study. We also illustrate the proposed semiparametric DIF testing procedure with an empirical example.

  13. Item response theory analyses of the Cambridge Face Memory Test (CFMT).

    Science.gov (United States)

    Cho, Sun-Joo; Wilmer, Jeremy; Herzmann, Grit; McGugin, Rankin Williams; Fiset, Daniel; Van Gulick, Ana E; Ryan, Kaitlin F; Gauthier, Isabel

    2015-06-01

    We evaluated the psychometric properties of the Cambridge Face Memory Test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bifactor exploratory factor analysis (EFA). This EFA analysis revealed a general factor and 3 specific factors clustered by targets of CFMT. However, the 3 specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and 2 age groups (age ≤ 20 vs. age > 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT.

  14. Explanatory multidimensional multilevel random item response model: an application to simultaneous investigation of word and person contributions to multidimensional lexical representations.

    Science.gov (United States)

    Cho, Sun-Joo; Gilbert, Jennifer K; Goodwin, Amanda P

    2013-10-01

    This paper presents an explanatory multidimensional multilevel random item response model and its application to reading data with multilevel item structure. The model includes multilevel random item parameters that allow consideration of variability in item parameters at both item and item group levels. Item-level random item parameters were included to model unexplained variance remaining when item related covariates were used to explain variation in item difficulties. Item group-level random item parameters were included to model dependency in item responses among items having the same item stem. Using the model, this study examined the dimensionality of a person's word knowledge, termed lexical representation, and how aspects of morphological knowledge contributed to lexical representations for different persons, items, and item groups.

  15. A comparison of item response theory-based methods for examining differential item functioning in object naming test by language of assessment among older Latinos

    Directory of Open Access Journals (Sweden)

    Frances M. Yang

    2011-12-01

    Full Text Available Object naming tests are commonly included in neuropsychological test batteries. Differential item functioning (DIF) in these tests due to cultural and language differences may compromise the validity of cognitive measures in diverse populations. We evaluated 26 object naming items for DIF due to Spanish and English language translations among Latinos (n=1,159), mean age of 70.5 years old (Standard Deviation (SD)±7.2), using the following four item response theory-based approaches: Mplus/Multiple Indicator, Multiple Causes (Mplus/MIMIC; Muthén & Muthén, 1998-2011), Item Response Theory Likelihood Ratio Differential Item Functioning (IRTLRDIF/MULTILOG; Thissen, 1991, 2001), difwithpar/Parscale (Crane, Gibbons, Jolley, & van Belle, 2006; Muraki & Bock, 2003), and Differential Functioning of Items and Tests/MULTILOG (DFIT/MULTILOG; Flowers, Oshima, & Raju, 1999; Thissen, 1991). Overall, there was moderate to near perfect agreement across methods. Fourteen items were found to exhibit DIF and 5 items observed consistently across all methods, which were more likely to be answered correctly by individuals tested in Spanish after controlling for overall ability.

  16. A comparison of item response theory-based methods for examining differential item functioning in object naming test by language of assessment among older Latinos.

    Science.gov (United States)

    Yang, Frances M; Heslin, Kevin C; Mehta, Kala M; Yang, Cheng-Wu; Ocepek-Welikson, Katja; Kleinman, Marjorie; Morales, Leo S; Hays, Ron D; Stewart, Anita L; Mungas, Dan; Jones, Richard N; Teresi, Jeanne A

    2011-01-01

    Object naming tests are commonly included in neuropsychological test batteries. Differential item functioning (DIF) in these tests due to cultural and language differences may compromise the validity of cognitive measures in diverse populations. We evaluated 26 object naming items for DIF due to Spanish and English language translations among Latinos (n=1,159), mean age of 70.5 years old (Standard Deviation (SD)±7.2), using the following four item response theory-based approaches: Mplus/Multiple Indicator, Multiple Causes (Mplus/MIMIC; Muthén & Muthén, 1998-2011), Item Response Theory Likelihood Ratio Differential Item Functioning (IRTLRDIF/MULTILOG; Thissen, 1991, 2001), difwithpar/Parscale (Crane, Gibbons, Jolley, & van Belle, 2006; Muraki & Bock, 2003), and Differential Functioning of Items and Tests/MULTILOG (DFIT/MULTILOG; Flowers, Oshima, & Raju, 1999; Thissen, 1991). Overall, there was moderate to near perfect agreement across methods. Fourteen items were found to exhibit DIF and 5 items observed consistently across all methods, which were more likely to be answered correctly by individuals tested in Spanish after controlling for overall ability.

  17. Testing the ruler with item response theory: increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index.

    Science.gov (United States)

    Funk, Janette L; Rogge, Ronald D

    2007-12-01

    The present study took a critical look at a central construct in couples research: relationship satisfaction. Eight well-validated self-report measures of relationship satisfaction, including the Marital Adjustment Test (MAT; H. J. Locke & K. M. Wallace, 1959), the Dyadic Adjustment Scale (DAS; G. B. Spanier, 1976), and an additional 75 potential satisfaction items, were given to 5,315 online participants. Using item response theory, the authors demonstrated that the MAT and DAS provided relatively poor levels of precision in assessing satisfaction, particularly given the length of those scales. Principal-components analysis and item response theory applied to the larger item pool were used to develop the Couples Satisfaction Index (CSI) scales. Compared with the MAT and the DAS, the CSI scales were shown to have higher precision of measurement (less noise) and correspondingly greater power for detecting differences in levels of satisfaction. The CSI scales demonstrated strong convergent validity with other measures of satisfaction and excellent construct validity with anchor scales from the nomological net surrounding satisfaction, suggesting that they assess the same theoretical construct as do prior scales. Implications for research are discussed.

  18. Power analysis in randomized clinical trials based on item response theory

    NARCIS (Netherlands)

    Holman, Rebecca; Glas, Cees A.W.; Haan, de Rob J.

    2003-01-01

    Patient relevant outcomes, measured using questionnaires, are becoming increasingly popular endpoints in randomized clinical trials (RCTs). Recently, interest in the use of item response theory (IRT) to analyze the responses to such questionnaires has increased. In this paper, we used a simulation s

  19. Confidence Bands for the Three-Parameter Logistic Item Response Curve.

    Science.gov (United States)

    Lord, Frederic M.; Pashley, Peter J.

    A large sample method for obtaining asymptotic simultaneous confidence bands for a three-parameter logistic response curve is described. Simultaneous confidence bands indicate the sampling variation of item response curves relative to a fitted function. A procedure is given which requires as input maximum likelihood parameter estimates and an…
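
    For orientation, the three-parameter logistic response curve around which such confidence bands are drawn can be written out directly. The sketch below evaluates only the curve itself, not the confidence bands; the function name `p_3pl` and the parameter values are illustrative assumptions, and the scaling constant D = 1.7 used in some presentations is omitted.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    a = discrimination, b = difficulty, c = lower asymptote (pseudo-guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Evaluate a hypothetical item on a coarse ability grid.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(theta, round(p_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

    At theta equal to b the curve sits halfway between c and 1, and for very low ability it flattens out at c rather than at 0.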

  20. Fitting Item Response Theory Models to Two Personality Inventories: Issues and Insights.

    Science.gov (United States)

    Chernyshenko, Oleksandr S.; Stark, Stephen; Chan, Kim-Yin; Drasgow, Fritz; Williams, Bruce

    2001-01-01

    Compared the fit of several Item Response Theory (IRT) models to two personality assessment instruments using data from 13,059 individuals responding to one instrument and 1,770 individuals responding to the other. Two- and three-parameter logistic models fit some scales reasonably well, but not others, and the graded response model generally did…

  1. Measuring organizational effectiveness in information and communication technology companies using item response theory.

    Science.gov (United States)

    Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

    2012-01-01

    The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the point of view of the manager, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to guide the construction of the items of the questionnaire, which was submitted to specialists for evaluation. It proved viable for measuring the organizational effectiveness of ICT companies from the point of view of a manager using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and places items and respondents on a single scale, which is not possible when using other similar tools.
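
    The idea of placing items and respondents on a single scale can be made concrete with the two-parameter logistic model. In the sketch below, item difficulty b and respondent level theta are expressed in the same units, so a respondent located exactly at an item's difficulty has a 0.5 probability of endorsing it. The function names and item parameters are hypothetical and are not taken from the paper.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_score(theta, items):
    """Expected number of endorsed items for a respondent at level theta.

    items: list of (a, b) pairs, discrimination and difficulty.
    """
    return sum(p_2pl(theta, a, b) for a, b in items)


# Hypothetical questionnaire with three items of increasing difficulty.
items = [(1.5, -1.0), (1.0, 0.0), (2.0, 1.0)]
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(expected_score(theta, items), 3))
```

    Because both kinds of parameters share one scale, the same numbers describe which items are hard to endorse and which organizations score high.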

  2. mirt: A Multidimensional Item Response Theory Package for the R Environment

    Directory of Open Access Journals (Sweden)

    R. Philip Chalmers

    2012-05-01

    Full Text Available Item response theory (IRT) is widely used in assessment and evaluation research to explain how participants respond to item level stimuli. Several R packages can be used to estimate the parameters in various IRT models, the most flexible being the ltm (Rizopoulos 2006), eRm (Mair and Hatzinger 2007), and MCMCpack (Martin, Quinn, and Park 2011) packages. However these packages have limitations in that ltm and eRm can only analyze unidimensional IRT models effectively and the exploratory multidimensional extensions available in MCMCpack require prior understanding of Bayesian estimation convergence diagnostics and are computationally intensive. Most importantly, multidimensional confirmatory item factor analysis methods have not been implemented in any R package. The mirt package was created for estimating multidimensional item response theory parameters for exploratory and confirmatory models by using maximum-likelihood methods. The Gauss-Hermite quadrature method used in traditional EM estimation (e.g., Bock and Aitkin 1981) is presented for exploratory item response models as well as for confirmatory bifactor models (Gibbons and Hedeker 1992). Exploratory and confirmatory models are estimated by a stochastic algorithm described by Cai (2010a,b). Various program comparisons are presented and future directions for the package are discussed.
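
    The role of Gauss-Hermite quadrature in EM estimation can be illustrated with a toy unidimensional case. The sketch below is Python written for this listing, not code from the mirt package: it approximates the marginal likelihood of one response pattern under a 2PL model with a standard normal ability distribution, using the three-point Gauss-Hermite rule (nodes 0 and ±√3, weights 2/3 and 1/6). All function names and item parameters are assumptions, and real applications use many more nodes.

```python
import math

# Three-point Gauss-Hermite rule rescaled for a standard normal density.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def pattern_likelihood(theta, responses, items):
    """Likelihood of a 0/1 response pattern at a fixed ability theta (2PL)."""
    like = 1.0
    for x, (a, b) in zip(responses, items):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        like *= p if x == 1 else 1.0 - p
    return like


def marginal_likelihood(responses, items):
    """Quadrature approximation to the likelihood integrated over ability."""
    return sum(w * pattern_likelihood(t, responses, items)
               for t, w in zip(NODES, WEIGHTS))


def posterior_node_weights(responses, items):
    """E-step quantity: posterior probability of each node given the pattern."""
    joint = [w * pattern_likelihood(t, responses, items)
             for t, w in zip(NODES, WEIGHTS)]
    total = sum(joint)
    return [j / total for j in joint]


items = [(1.0, 0.0), (1.5, 0.5)]  # hypothetical (a, b) pairs
print(marginal_likelihood([1, 0], items))
print(posterior_node_weights([1, 0], items))
```

    In a full EM cycle these posterior node weights would be accumulated over respondents and passed to an M-step that updates the item parameters; that step is left out here.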

  3. Cognitive Diagnostic Models for Tests with Multiple-Choice and Constructed-Response Items

    Science.gov (United States)

    Kuo, Bor-Chen; Chen, Chun-Hua; Yang, Chih-Wei; Mok, Magdalena Mo Ching

    2016-01-01

    Traditionally, teachers evaluate students' abilities via their total test scores. Recently, cognitive diagnostic models (CDMs) have begun to provide information about the presence or absence of students' skills or misconceptions. Nevertheless, CDMs are typically applied to tests with multiple-choice (MC) items, which provide less diagnostic…

  4. Automated Scoring of Constructed-Response Science Items: Prospects and Obstacles

    Science.gov (United States)

    Liu, Ou Lydia; Brew, Chris; Blackmore, John; Gerard, Libby; Madhok, Jacquie; Linn, Marcia C.

    2014-01-01

    Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics…

  5. Target Rotations and Assessing the Impact of Model Violations on the Parameters of Unidimensional Item Response Theory Models

    Science.gov (United States)

    Reise, Steven; Moore, Tyler; Maydeu-Olivares, Alberto

    2011-01-01

    Reise, Cook, and Moore proposed a "comparison modeling" approach to assess the distortion in item parameter estimates when a unidimensional item response theory (IRT) model is imposed on multidimensional data. Central to their approach is the comparison of item slope parameter estimates from a unidimensional IRT model (a restricted model), with…

  6. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior

    Science.gov (United States)

    Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia

    2016-01-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…

  7. Predicting gender differences as latent variables: summed scores, and individual item responses: a methods case study

    Directory of Open Access Journals (Sweden)

    Jacobs Danny O

    2004-10-01

    Background: Modeling latent variables such as physical disability is challenging because they are measured through proxies, which poses significant methodological challenges. The objective of this article is to present three different methods to predict latent variables based on classical summed scores, individual item responses, and latent variable models. Methods: This is a review of the literature and a data analysis using "layers of information". Data were collected from the North Carolina Back Pain Project, using a modified version of the Roland Questionnaire. Results: The three models are compared in relation to their goals and underlying concepts, previous clinical applications, data requirements, statistical theory, and practical applications. Initial linear regression models demonstrated a difference in disability between genders of 1.32 points (95% CI 0.65, 2.00) on a scale from 0–23. Subsequent item analysis found contradictory results across items, with no clear pattern. Finally, IRT models showed that three items exhibited differential item functioning. After these items were removed, the difference between genders was reduced to 0.78 points (95% CI, -0.99, 1.23). These results were shown to be robust with re-sampling methods. Conclusions: Purported differences in the levels of a latent variable should be tested using different models to verify whether these differences are real or simply distorted by model assumptions.

  8. Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests.

    Science.gov (United States)

    Fox, Mark C; Berry, Jane M; Freeman, Sara P

    2014-12-01

    Relatively high vocabulary scores of older adults are generally interpreted as evidence that older adults possess more of a common ability than younger adults. Yet, this interpretation rests on empirical assumptions about the uniformity of item-response functions between groups. In this article, we test item response models of differential responding against datasets containing younger-, middle-aged-, and older-adult responses to three popular vocabulary tests (the Shipley, Ekstrom, and WAIS-R) to determine whether members of different age groups who achieve the same scores have the same probability of responding in the same categories (e.g., correct vs. incorrect) under the same conditions. Contrary to the null hypothesis of measurement invariance, datasets for all three tests exhibit substantial differential responding. Members of different age groups who achieve the same overall scores exhibit differing response probabilities in relation to the same items (differential item functioning) and appear to approach the tests in qualitatively different ways that generalize across items. Specifically, younger adults are more likely than older adults to leave items unanswered for partial credit on the Ekstrom, and to produce 2-point definitions on the WAIS-R. Yet, older adults score higher than younger adults, consistent with most reports of vocabulary outcomes in the cognitive aging literature. In light of these findings, the most generalizable conclusion to be drawn from the cognitive aging literature on vocabulary tests is simply that older adults tend to score higher than younger adults, and not that older adults possess more of a common ability.

  9. Applied orienting response research: some examples.

    Science.gov (United States)

    Tremayne, P; Barry, R J

    1990-01-01

    The development of orienting response (OR) theory has not been accompanied by many applications of the concept--most research still appears to be lab-based and "pure," rather than "applied." We present some examples from our own work in which the OR perspective has been applied in a wider context. These cover the exploration of processing deficits in autistic children, aspects of the "repression" of anxiety in elite athletes, and the locus of alcohol effects. Such applications of the OR concept in real-life situations seem a logical and, indeed, necessary step in the evolution of this area of psychophysiology.

  10. The Value of Item Response Theory in Clinical Assessment: A Review

    Science.gov (United States)

    Thomas, Michael L.

    2011-01-01

    Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…

  11. Assessing Model Data Fit of Unidimensional Item Response Theory Models in Simulated Data

    Science.gov (United States)

    Kose, Ibrahim Alper

    2014-01-01

    The purpose of this paper is to give an example of how to assess the model-data fit of unidimensional IRT models in simulated data. Also, the present research aims to explain the importance of fit and the consequences of misfit by using simulated data sets. Responses of 1000 examinees to a dichotomously scored 20-item test were simulated with 25…

  12. Can a Multidimensional Test Be Evaluated with Unidimensional Item Response Theory?

    Science.gov (United States)

    Wiberg, Marie

    2012-01-01

    The aim of this study was to evaluate possible consequences of using unidimensional item response theory (UIRT) on a multidimensional college admission test. The test consists of 5 subscales and can be divided into two sections, that is, it can be considered both as a unidimensional and a multidimensional test. The test was examined with both UIRT…

  13. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Jean-Paul; Glas, Cees A.W.

    2000-01-01

    This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in the assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved

  14. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Jean-Paul; Glas, Cees A.W.

    2003-01-01

    It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between t

  15. Mokken scale analysis : Between the Guttman scale and parametric item response theory

    NARCIS (Netherlands)

    van Schuur, Wijbrandt H.

    2003-01-01

    This article introduces a model of ordinal unidimensional measurement known as Mokken scale analysis. Mokken scaling is based on principles of Item Response Theory (IRT) that originated in the Guttman scale. I compare the Mokken model with both Classical Test Theory (reliability or factor analysis)

  16. Optimal and Most Exact Confidence Intervals for Person Parameters in Item Response Theory Models

    Science.gov (United States)

    Doebler, Anna; Doebler, Philipp; Holling, Heinz

    2013-01-01

    The common way to calculate confidence intervals for item response theory models is to assume that the standardized maximum likelihood estimator for the person parameter [theta] is normally distributed. However, this approximation is often inadequate for short and medium test lengths. As a result, the coverage probabilities fall below the given…

  17. Measuring Integration of Information and Communication Technology in Education: An Item Response Modeling Approach

    Science.gov (United States)

    Peeraer, Jef; Van Petegem, Peter

    2012-01-01

    This research describes the development and validation of an instrument to measure integration of Information and Communication Technology (ICT) in education. After literature research on definitions of integration of ICT in education, a comparison is made between the classical test theory and the item response modeling approach for the…

  18. Extended Mixed-Effects Item Response Models with the MH-RM Algorithm

    Science.gov (United States)

    Chalmers, R. Philip

    2015-01-01

    A mixed-effects item response theory (IRT) model is presented as a logical extension of the generalized linear mixed-effects modeling approach to formulating explanatory IRT models. Fixed and random coefficients in the extended model are estimated using a Metropolis-Hastings Robbins-Monro (MH-RM) stochastic imputation algorithm to accommodate for…

  19. Assessing Dimensionality of Noncompensatory Multidimensional Item Response Theory with Complex Structures

    Science.gov (United States)

    Svetina, Dubravka

    2013-01-01

    The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in noncompensatory multidimensional item response models using dimensionality assessment procedures based on DETECT (dimensionality evaluation to enumerate contributing traits) and NOHARM (normal ogive harmonic analysis robust method). Five…

  20. The Impact of Outliers on Cronbach's Coefficient Alpha Estimate of Reliability: Ordinal/Rating Scale Item Responses

    Science.gov (United States)

    Liu, Yan; Wu, Amery D.; Zumbo, Bruno D.

    2010-01-01

    In a recent Monte Carlo simulation study, Liu and Zumbo showed that outliers can severely inflate the estimates of Cronbach's coefficient alpha for continuous item response data--visual analogue response format. Little, however, is known about the effect of outliers for ordinal item response data--also commonly referred to as Likert, Likert-type,…

  1. A modular approach for item response theory modeling with the R package flirt.

    Science.gov (United States)

    Jeon, Minjeong; Rijmen, Frank

    2016-06-01

    The new R package flirt is introduced for flexible item response theory (IRT) modeling of psychological, educational, and behavior assessment data. flirt integrates a generalized linear and nonlinear mixed modeling framework with graphical model theory. The graphical model framework allows for efficient maximum likelihood estimation. The key feature of flirt is its modular approach to facilitate convenient and flexible model specifications. Researchers can construct customized IRT models by simply selecting various modeling modules, such as parametric forms, number of dimensions, item and person covariates, person groups, link functions, etc. In this paper, we describe major features of flirt and provide examples to illustrate how flirt works in practice.

  2. Applicability of Item Response Theory to the Korean Nurses' Licensing Examination

    Directory of Open Access Journals (Sweden)

    Geum-Hee Jeong

    2005-06-01

    To test the applicability of item response theory (IRT) to the Korean Nurses' Licensing Examination (KNLE), item analysis was performed after testing the unidimensionality and goodness-of-fit. The results were compared with those based on classical test theory. The results of the 330-item KNLE administered to 12,024 examinees in January 2004 were analyzed. Unidimensionality was tested using DETECT and the goodness-of-fit was tested using WINSTEPS for the Rasch model and Bilog-MG for the two-parameter logistic model. Item analysis and ability estimation were done using WINSTEPS. Using DETECT, Dmax ranged from 0.1 to 0.23 for each subject. The mean square value of the infit and outfit values of all items using WINSTEPS ranged from 0.1 to 1.5, except for one item in pediatric nursing, which scored 1.53. Of the 330 items, 218 (42.7%) were misfit using the two-parameter logistic model of Bilog-MG. The correlation coefficients between the difficulty parameter using the Rasch model and the difficulty index from classical test theory ranged from 0.9039 to 0.9699. The correlation between the ability parameter using the Rasch model and the total score from classical test theory ranged from 0.9776 to 0.9984. Therefore, the results of the KNLE fit unidimensionality and goodness-of-fit for the Rasch model. The KNLE should be a good sample for analysis according to the IRT Rasch model, so further research using IRT is possible.

  3. Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models.

    Science.gov (United States)

    Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana

    2015-03-01

    The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (getting around, self-care, getting along with others, life activities and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36-item WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion, the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology.

  4. Mild to severe social fears: ranking types of feared social situations using item response theory.

    Science.gov (United States)

    Crome, Erica; Baillie, Andrew

    2014-06-01

    Social anxiety disorder is one of the most common mental disorders, and is associated with long-term impairment, distress and vulnerability to secondary disorders. Certain types of social fears are more common than others, with public speaking fears typically the most prevalent in epidemiological surveys. The distinction between performance- and interaction-based fears has been the focus of long-standing debate in the literature, with evidence that performance-based fears may reflect milder presentations of social anxiety. This study aims to explicitly test whether different types of social fears differ in underlying social anxiety severity using item response theory techniques. Different types of social fears were assessed using items from three different structured diagnostic interviews in four different epidemiological surveys in the United States (n=2261, n=5411) and Australia (n=1845, n=1497), and ranked using 2-parameter logistic item response theory models. Overall, patterns of underlying severity indicated by different fears were consistent across the four samples with items functioning across a range of social anxiety. Public performance fears and speaking at meetings/classes indicated the lowest levels of social anxiety, with increasing severity indicated by situations such as being assertive or attending parties. Fears of using public bathrooms or eating, drinking or writing in public reflected the highest levels of social anxiety. Understanding differences in the underlying severity of different types of social fears has important implications for the underlying structure of social anxiety, and may also enhance the delivery of social anxiety treatment at a population level.

  5. Development of an abbreviated Career Indecision Profile-65 using item response theory: The CIP-Short.

    Science.gov (United States)

    Xu, Hui; Tracey, Terence J G

    2017-03-01

    The current study developed an abbreviated version of the Career Indecision Profile-65 (CIP-65; Hacker, Carr, Abrams, & Brown, 2013) by using item response theory. In order to improve the efficiency of the CIP-65 in measuring career indecision, the individual item performance of the CIP-65 was examined with respect to the ordering of response occurrence and gender differential item functioning. The best 5 items of each scale of the CIP-65 (i.e., neuroticism/negative affectivity, choice/commitment anxiety, lack of readiness, and interpersonal conflicts) were retained in the CIP-Short using a sample of 588 college students. A validation sample (N = 174) supported the reliability and structural validity of the CIP-Short. The convergent and divergent validity of the CIP-Short was additionally supported in the findings of a hypothesized differential relational pattern in a separate sample (N = 360). While the current study supported the CIP-Short being a sound brief measure of career indecision, the limitations of this study and suggestions for future research were discussed as well.

  6. Unidimensional Item Factor Analysis: A Comparison of Categorical Confirmatory Factor Analysis (CCFA) and Item Response Theory (IRT) Estimation Methods

    Institute of Scientific and Technical Information of China (English)

    刘红云; 李美娟; 骆方; 李小山

    2012-01-01

    item factor loading and of the item discrimination parameter was influenced by the overall size of the factor loadings (discrimination). (5) The distribution of the item thresholds affected the precision of the parameter estimates, and item discrimination was the parameter most sensitive to the thresholds. (6) On the whole, the precision of item parameter estimates in the SEM framework was higher than that in the IRT framework. Both structural equation modeling (SEM) and item response theory (IRT) could be used for factor analysis of dichotomous item responses. In this case, the measurement models of both approaches were formally equivalent. They were refined within and across different disciplines, and made complementary contributions to central measurement problems encountered in almost all empirical social science research fields. The authors concluded with considerations for categorical item factor analysis and gave some advice for applied researchers.

  7. The Psychological Effect of Errors in Standardized Language Test Items on EFL Students' Responses to the Following Item

    Science.gov (United States)

    Khaksefidi, Saman

    2017-01-01

    This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students selected through stratified random sampling are given 15 questions of a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…

  8. Modeling a Composite Score in Parkinson's Disease Using Item Response Theory.

    Science.gov (United States)

    Gottipati, Gopichand; Karlsson, Mats O; Plan, Elodie L

    2017-02-28

    In the current work, we present the methodology for development of an Item Response Theory model within a non-linear mixed effects framework to characterize the longitudinal changes of the Movement Disorder Society (sponsored revision) of Unified Parkinson's Disease Rating Scale (MDS-UPDRS) endpoint in Parkinson's disease (PD). The data were obtained from the Parkinson's Progression Markers Initiative database and included 163,070 observations up to 48 months from 430 subjects belonging to the De Novo PD cohort. The probability of obtaining a score, reported for each of the items in the questionnaire, was modeled as a function of the subject's disability. Initially, a single latent variable model was explored to characterize the disease progression over time. However, based on the understanding of the questionnaire set-up and the results of a residuals-based diagnostic tool, a three latent variable model with a mixture implementation was able to adequately describe longitudinal changes not only at the total score level but also at each individual item level. The linear progression rates obtained for the patient-reported items and the non-sided items were similar: for each, a typical subject takes roughly 50 months to progress linearly from the baseline by one standard deviation. However, for the sided items, it was found that the better side deteriorates quicker than the disabled side. This study presents a framework for analyzing MDS-UPDRS data, which can be adapted to more traditional UPDRS data collected in PD clinical trials and result in more efficient designs and analyses of such studies.

  9. On the Relationship between Classical Test Theory and Item Response Theory: From One to the Other and Back

    Science.gov (United States)

    Raykov, Tenko; Marcoulides, George A.

    2016-01-01

    The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…

  10. KernSmoothIRT: An R Package for Kernel Smoothing in Item Response Theory

    Directory of Open Access Journals (Sweden)

    Angelo Mazza

    2014-06-01

    Item response theory (IRT) models are a class of statistical models used to describe the response behaviors of individuals to a set of items having a certain number of options. They are adopted by researchers in social science, particularly in the analysis of performance or attitudinal data, in psychology, education, medicine, marketing and other fields where the aim is to measure latent constructs. Most IRT analyses use parametric models that rely on assumptions that often are not satisfied. In such cases, a nonparametric approach might be preferable; nevertheless, few software implementations support it. To address this gap, this paper presents the R package KernSmoothIRT. It implements kernel smoothing for the estimation of option characteristic curves, and adds several plotting and analytical tools to evaluate the whole test/questionnaire, the items, and the subjects. In order to show the package's capabilities, two real datasets are used, one employing multiple-choice responses, and the other scaled responses.

  11. Sample Size Requirements for Estimation of Item Parameters in the Multidimensional Graded Response Model

    Directory of Open Access Journals (Sweden)

    Shengyu Jiang

    2016-02-01

    Likert-type rating scales, in which a respondent chooses a response from an ordered set of response options, are used to measure a wide variety of psychological, educational, and medical outcome variables. The most appropriate item response theory model for analyzing and scoring these instruments when they provide scores on multiple scales is the multidimensional graded response model (MGRM). A simulation study was conducted to investigate the variables that might affect item parameter recovery for the MGRM. Data were generated based on different sample sizes, test lengths, and scale intercorrelations. Parameter estimates were obtained through the flexMIRT software. The quality of parameter recovery was assessed by the correlation between true and estimated parameters as well as bias and root-mean-square error. Results indicated that for the vast majority of cases studied a sample size of N = 500 provided accurate parameter estimates, except for tests with 240 items, when 1,000 examinees were necessary to obtain accurate parameter estimates. Increasing sample size beyond N = 1,000 did not increase the accuracy of MGRM parameter estimates.
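
    The three recovery criteria named in this abstract (correlation, bias, and root-mean-square error between true and estimated parameters) can be illustrated with a minimal Python sketch. The function name recovery_metrics and the parameter values below are hypothetical and are not taken from the study.

```python
import numpy as np


def recovery_metrics(true_params, est_params):
    """Summarize parameter recovery as correlation, bias, and RMSE.

    Both arguments are equal-length sequences of item parameters
    (for example, discrimination values). Hypothetical helper, not
    part of any IRT package.
    """
    true_params = np.asarray(true_params, dtype=float)
    est_params = np.asarray(est_params, dtype=float)
    error = est_params - true_params
    return {
        "correlation": float(np.corrcoef(true_params, est_params)[0, 1]),
        "bias": float(error.mean()),                   # mean signed error
        "rmse": float(np.sqrt((error ** 2).mean())),   # root-mean-square error
    }


# Made-up discrimination parameters, for illustration only.
true_a = [0.8, 1.2, 1.5, 2.0]
est_a = [0.9, 1.1, 1.6, 2.1]
metrics = recovery_metrics(true_a, est_a)
print(metrics)
```

    In a simulation study these summaries would typically be averaged over many replications per condition; the sketch covers a single replication.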

  12. Is a single-item visual analogue scale as valid, reliable and responsive as multi-item scales in measuring quality of life?

    NARCIS (Netherlands)

    Boer, A.G.E.M. de; Lanschot, J.J.B. van; Stalmeier, P.F.M.; Sandick, J.W. van; Hulscher, J.B.F.; Haes, J.C.J.M. de; Sprangers, M.A.G.

    2004-01-01

    PURPOSE: To compare the validity, reliability and responsiveness of a single, global quality of life question to multi-item scales. METHOD: Data were obtained from 83 consecutive patients with oesophageal adenocarcinoma undergoing either transhiatal or transthoracic oesophagectomy. Quality of life w

  13. An Evaluation of the Brief Symptom Inventory-18 Using Item Response Theory : Which Items Are Most Strongly Related to Psychological Distress?

    NARCIS (Netherlands)

    Meijer, Rob R.; de Vries, Rivka M.; van Bruggen, Vincent

    2011-01-01

    The psychometric structure of the Brief Symptom Inventory-18 (BSI-18; Derogatis, 2001) was investigated using Mokken scaling and parametric item response theory. Data of 487 outpatients, 266 students, and 207 prisoners were analyzed. Results of the Mokken analysis indicated that the BSI-18 formed a

  14. An evaluation of the Brief Symptom Inventory-18 using item response theory: which items are most strongly related to psychological distress?

    NARCIS (Netherlands)

    Meijer, Rob R.; Vries, de Rivka M.; Bruggen, van Vincent

    2011-01-01

    The psychometric structure of the Brief Symptom Inventory–18 (BSI-18; Derogatis, 2001) was investigated using Mokken scaling and parametric item response theory. Data of 487 outpatients, 266 students, and 207 prisoners were analyzed. Results of the Mokken analysis indicated that the BSI-18 formed a

  15. An Evaluation of the Brief Symptom Inventory-18 Using Item Response Theory: Which Items Are Most Strongly Related to Psychological Distress?

    Science.gov (United States)

    Meijer, Rob R.; de Vries, Rivka M.; van Bruggen, Vincent

    2011-01-01

    The psychometric structure of the Brief Symptom Inventory-18 (BSI-18; Derogatis, 2001) was investigated using Mokken scaling and parametric item response theory. Data of 487 outpatients, 266 students, and 207 prisoners were analyzed. Results of the Mokken analysis indicated that the BSI-18 formed a strong Mokken scale for outpatients and…

  16. Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses.

    NARCIS (Netherlands)

    Nispen, R.M.A. van; Knol, D.L.; Langelaan, M.; Rens, G.H.M.B. van

    2011-01-01

    Background: For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model

  17. Capturing Abnormal Personality With Normal Personality Inventories: An Item Response Theory Approach

    OpenAIRE

    2008-01-01

    Correlational and factor-analytic methods indicate that abnormal and normal personality constructs may be tapping the same underlying latent trait. However, they do not systematically demonstrate that measures of abnormal personality capture more extreme ranges of the latent trait than measures of normal range personality. Item Response Theory (IRT) methods, in contrast, do provide this information. In the present study, we use IRT methods to evaluate the range of the latent trait assessed wi...

  18. Reading ability and print exposure: item response theory analysis of the author recognition test.

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.
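
    The scoring rule mentioned in this abstract (names selected minus foils selected, optionally with a heavier penalty for foils) can be sketched in Python as follows. The function name art_score and the penalty value 2.0 are illustrative assumptions; the abstract does not report the penalty the authors used.

```python
def art_score(names_selected, foils_selected, foil_penalty=1.0):
    """Score an author recognition test.

    With foil_penalty=1.0 this is the standard rule described in the
    abstract: names selected minus foils selected. A larger
    foil_penalty penalizes guessing more heavily (value hypothetical).
    """
    return names_selected - foil_penalty * foils_selected


# A participant who marks 30 real authors and 4 foils.
standard = art_score(30, 4)                      # standard scoring
stricter = art_score(30, 4, foil_penalty=2.0)    # hypothetical heavier penalty
print(standard, stricter)
```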

  19. The diagnostic utility of separation anxiety disorder symptoms: an item response theory analysis.

    Science.gov (United States)

    Cooper-Vince, Christine E; Emmert-Aronson, Benjamin O; Pincus, Donna B; Comer, Jonathan S

    2014-01-01

    At present, it is not clear whether the current definition of separation anxiety disorder (SAD) is the optimal classification of developmentally inappropriate, severe, and interfering separation anxiety in youth. Much remains to be learned about the relative contributions of individual SAD symptoms for informing diagnosis. Two-parameter logistic Item Response Theory analyses were conducted on the eight core SAD symptoms in an outpatient anxiety sample of treatment-seeking children (N = 359, 59.3% female, M age = 11.2) and their parents to determine the diagnostic utility of each of these symptoms. Analyses considered values of item threshold, which characterizes the SAD severity level at which each symptom has a 50% chance of being endorsed, and item discrimination, which characterizes how well each symptom distinguishes individuals with higher and lower levels of SAD. Distress related to separation and fear of being alone without major attachment figures showed the strongest discrimination properties and the lowest thresholds for being endorsed. In contrast, worry about harm befalling attachment figures showed the poorest discrimination properties, and nightmares about separation showed the highest threshold for being endorsed. Distress related to separation demonstrated crossing differential item functioning associated with age: at lower separation anxiety levels, excessive fear at separation was more likely to be endorsed for children ≥9 years, whereas at higher levels this symptom was more likely to be endorsed by children <9 years. Implications are discussed for optimizing the taxonomy of SAD in youth.
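
    The roles of item threshold and item discrimination described in this abstract can be made concrete with the standard two-parameter logistic item response function. The Python sketch below uses hypothetical parameter values, not estimates from the study.

```python
import math


def p_endorse(theta, discrimination, threshold):
    """Two-parameter logistic probability of endorsing a symptom.

    theta: latent severity of the respondent
    discrimination: slope of the curve at the threshold
    threshold: severity at which endorsement probability is 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - threshold)))


# Hypothetical symptoms: one sharply discriminating, one weakly discriminating.
sharp = {"discrimination": 2.5, "threshold": 0.0}
weak = {"discrimination": 0.6, "threshold": 0.0}

# At the threshold, both symptoms have a 50% chance of endorsement.
at_threshold = p_endorse(0.0, **sharp)

# One unit above vs. one unit below the threshold: the sharper item
# separates higher- and lower-severity respondents more strongly.
gap_sharp = p_endorse(1.0, **sharp) - p_endorse(-1.0, **sharp)
gap_weak = p_endorse(1.0, **weak) - p_endorse(-1.0, **weak)
print(at_threshold, gap_sharp, gap_weak)
```

    A symptom with a high threshold is endorsed mainly at severe levels of the trait, while a symptom with high discrimination changes quickly from unlikely to likely around its threshold.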

  20. ltm: An R Package for Latent Variable Modeling and Item Response Analysis

    Directory of Open Access Journals (Sweden)

    Dimitris Rizopoulos

    2006-11-01

    The R package ltm has been developed for the analysis of multivariate dichotomous and polytomous data using latent variable models, under the Item Response Theory approach. For dichotomous data the Rasch, the Two-Parameter Logistic, and Birnbaum's Three-Parameter models have been implemented, whereas for polytomous data Samejima's Graded Response model is available. Parameter estimates are obtained under marginal maximum likelihood using the Gauss-Hermite quadrature rule. The capabilities and features of the package are illustrated using two real data examples.
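    As a rough illustration of what marginal likelihood with Gauss-Hermite quadrature means, the sketch below integrates a Rasch likelihood over a standard normal ability distribution using a three-point rule. This is not the ltm implementation: it is written in Python rather than R, the function names and item difficulties are hypothetical, and a real analysis would use many more quadrature points.

```python
import math

# Three-point Gauss-Hermite rule rescaled for a standard normal density:
# the integral of f(theta) * N(theta; 0, 1) is approximated by sum(w * f(node)).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def marginal_likelihood(responses, difficulties):
    """Marginal probability of one 0/1 response pattern."""
    total = 0.0
    for node, weight in zip(NODES, WEIGHTS):
        conditional = 1.0
        for x, b in zip(responses, difficulties):
            p = rasch_prob(node, b)
            conditional *= p if x == 1 else 1.0 - p
        total += weight * conditional
    return total

difficulties = [-1.0, 0.0, 1.0]  # hypothetical item difficulties
print(marginal_likelihood([1, 1, 0], difficulties))
```

    Marginal maximum likelihood would then search for the item parameters that maximize the product of these marginal probabilities over all observed response patterns.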

  1. Gender differences in posttraumatic stress symptoms among OEF/OIF veterans: an item response theory analysis.

    Science.gov (United States)

    King, Matthew W; Street, Amy E; Gradus, Jaimie L; Vogt, Dawne S; Resick, Patricia A

    2013-04-01

    Establishing whether men and women tend to express different symptoms of posttraumatic stress in reaction to trauma is important for both etiological research and the design of assessment instruments. Use of item response theory (IRT) can reveal how symptom reporting varies by gender and help determine if estimates of symptom severity for men and women are equally reliable. We analyzed responses to the PTSD Checklist (PCL) from 2,341 U.S. military veterans (51% female) who completed deployments in support of operations in Afghanistan and Iraq (Operation Enduring Freedom/Operation Iraqi Freedom [OEF/OIF]), and tested for differential item functioning by gender with an IRT-based approach. Among men and women with the same overall posttraumatic stress severity, women tended to report more frequent concentration difficulties and distress from reminders whereas men tended to report more frequent nightmares, emotional numbing, and hypervigilance. These item-level gender differences were small (on average d = 0.05), however, and had little impact on PCL measurement precision or expected total scores. For practical purposes, men's and women's severity estimates had similar reliability. This provides evidence that men and women veterans demonstrate largely similar profiles of posttraumatic stress symptoms following exposure to military-related stressors, and some theoretical perspectives suggest this may hold in other traumatized populations.

  2. A New Item Response Theory Model for Open-Ended Online Homework with Multiple Allowed Attempts

    CERN Document Server

    Gönülateş, Emre

    2015-01-01

    Item Response Theory (IRT) was originally developed in traditional exam settings, and it has been shown that the model does not readily transfer to formative assessment in the form of online homework. We investigate whether this is mostly due to learner traits that do not become apparent in exam settings, namely random guessing due to lack of diligence or dedication, and copying work from other students or resources. Both of these traits mask the true ability of the learner, which is the only trait considered in most mainstream unidimensional IRT models. We find that introducing these traits does allow a better assessment of the true ability of the learners, as well as a better gauge of the quality of assessment items. Correspondence of the model traits to self-reported behavior is investigated and confirmed. We find that, of these two traits, copying answers has a larger influence on initial homework attempts than random guessing.

  3. Limited information estimation of the diffusion-based item response theory model for responses and response times.

    Science.gov (United States)

    Ranger, Jochen; Kuhn, Jörg-Tobias; Szardenings, Carsten

    2016-05-01

    Psychological tests are usually analysed with item response models. Recently, some alternative measurement models have been proposed that were derived from cognitive process models developed in experimental psychology. These models consider the responses but also the response times of the test takers. Two such models are the Q-diffusion model and the D-diffusion model. Both models can be calibrated with the diffIRT package of the R statistical environment via marginal maximum likelihood (MML) estimation. In this manuscript, an alternative approach to model calibration is proposed. The approach is based on weighted least squares estimation and parallels the standard estimation approach in structural equation modelling. Estimates are determined by minimizing the discrepancy between the observed and the implied covariance matrix. The estimator is simple to implement, consistent, and asymptotically normally distributed. Least squares estimation also provides a test of model fit by comparing the observed and implied covariance matrix. The estimator and the test of model fit are evaluated in a simulation study. Although parameter recovery is good, the estimator is less efficient than the MML estimator.

  4. The Item Response Theory: possible contributions to marketing studies (A Teoria da Resposta ao Item: possíveis contribuições aos estudos em marketing)

    Directory of Open Access Journals (Sweden)

    Danielle Ramos de Miranda Pereira

    2011-01-01

    The widespread use of multidimensional scales by marketing researchers motivated this article, which discusses the application of Item Response Theory (IRT) and presents to the field a method that has proved very effective in the estimation of behavioral constructs. The article therefore presents a discussion of IRT, highlighting its advances over Classical Test Theory (CTT) and its traditional applications in psychometrics and educational assessment. To verify its applicability to marketing studies, IRT was applied in practice to a scale already widely used by researchers: the market orientation scale (MkTor scale) proposed by Narver and Slater (1990). The results showed that, although the proposed IRT model can be considered satisfactory for application in the context of market orientation, new studies face many challenges, such as constructing a scale with a practical interpretation that indicates what it means for a company to have a level of maturity associated with a given construct. The concluding remarks stress that the article's main contribution to marketing studies is the presentation of an alternative method for estimating constructs more accurately and for assessing the quality of scale items.

  5. Factor and item response theory analysis of the Protean and Boundaryless Career Attitude Scales

    Directory of Open Access Journals (Sweden)

    Gideon P. de Bruin

    2010-12-01

    Orientation: The concepts of the Protean Career and the Boundaryless Career show potential as frameworks for research and practice in the contemporary world of work. Briscoe, Hall and DeMuth (2006) developed the Protean and Boundaryless Career Attitude Scales, which consist of the Self-Directed Career Management, Values Driven, Boundaryless Mindset and Mobility Preference subscales. However, the standardisation and replication studies conducted by Briscoe et al. left some questions unanswered in terms of the psychometric properties of the subscales. Research purpose: This study examines the psychometric properties of the Protean and Boundaryless Career Attitude Scales with the aim of clarifying the structure of the scales, examining the quality of the items and evaluating the measurement precision of the scales. Research design, approach and method: Responses of adults to the items of the Protean and Boundaryless Career Attitude Scales were analysed with factor analytic and Rasch item response model techniques. Main findings: Factor and Rasch analyses revealed that three of the four postulated dimensions were replicated, but the Values Driven dimension split into two factors. Misfitting items were identified and sources of their misfit were uncovered. The Rasch analysis showed that three of the four subscales provide most of their psychometric information at the lower ends of their respective latent traits (where relatively few persons are located). Hence, the trait estimates of persons with low scores are more precise than those of persons with high scores. Practical/managerial implications: Overall, the quality of the Protean and Boundaryless Career Attitude Scales is satisfactory, but some aspects that may be improved are identified. Researchers may use at least three of the four subscales with confidence, but more work is possibly needed on the Values Driven subscale. Contribution/value-add: The study provides researchers with information on the

  6. Using Cochran's Z Statistic to Test the Kernel-Smoothed Item Response Function Differences between Focal and Reference Groups

    Science.gov (United States)

    Zheng, Yinggan; Gierl, Mark J.; Cui, Ying

    2010-01-01

    This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…
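    For readers unfamiliar with kernel-smoothed item response functions, the following is a minimal sketch using a Gaussian-kernel weighted average (a Nadaraya-Watson estimator). It is an assumption that this matches the smoothing procedure of the study, the ability values and responses are invented, and the Cochran's Z comparison between groups is not shown.

```python
import math

def kernel_smoothed_irf(eval_points, abilities, responses, bandwidth):
    """Nadaraya-Watson estimate of P(correct | ability) with a Gaussian kernel."""
    curve = []
    for point in eval_points:
        weights = [math.exp(-0.5 * ((point - t) / bandwidth) ** 2) for t in abilities]
        weighted_correct = sum(w * x for w, x in zip(weights, responses))
        curve.append(weighted_correct / sum(weights))
    return curve

# Invented ability proxies and 0/1 item responses for one group.
abilities = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
responses = [0, 0, 0, 1, 0, 1, 1, 1, 1]

curve = kernel_smoothed_irf([-1.0, 0.0, 1.0], abilities, responses, bandwidth=0.5)
print([round(p, 3) for p in curve])
```

    Computing one such curve for the reference group and one for the focal group at the same evaluation points gives the pair of functions whose difference a DIF statistic would test.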

  7. Estimation of a Ramsay-Curve Item Response Theory Model by the Metropolis-Hastings Robbins-Monro Algorithm

    Science.gov (United States)

    Monroe, Scott; Cai, Li

    2014-01-01

    In Ramsay curve item response theory (RC-IRT) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's EM algorithm, which yields maximum marginal likelihood estimates. This method, however, does not produce the…

  8. Psychometric Examination of an Inventory of Self-Efficacy for the Holland Vocational Themes Using Item Response Theory

    Science.gov (United States)

    Turner, Brandon M.; Betz, Nancy E.; Edwards, Michael C.; Borgen, Fred H.

    2010-01-01

    The psychometric properties of measures of self-efficacy for the six themes of Holland's theory were examined using item response theory. Item and scale quality were compared across levels of the trait continuum; all the scales were highly reliable but differentiated better at some levels of the continuum than others. Applications for adaptive…

  9. Why Japanese workers show low work engagement: An item response theory analysis of the Utrecht Work Engagement scale

    Directory of Open Access Journals (Sweden)

    Iwata Noboru

    2010-11-01

    With the globalization of occupational health psychology, more and more researchers are interested in applying employee well-being concepts such as work engagement (i.e., a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption) to diverse populations. Accurate measurement contributes to our further understanding and to the generalizability of the concept of work engagement across different cultures. The present study investigated the measurement accuracy of the Japanese and the original Dutch versions of the Utrecht Work Engagement Scale (9-item version, UWES-9) and the comparability of this scale between both countries. Item Response Theory (IRT) was applied to the data from Japan (N = 2,339) and the Netherlands (N = 13,406). Reliability of the scale was evaluated at various levels of the latent trait (i.e., work engagement) based on the test information function (TIF) and the standard error of measurement (SEM). The Japanese version had difficulty in differentiating respondents with extremely low work engagement, whereas the original Dutch version had difficulty in differentiating respondents with high work engagement. The measurement accuracy of the two versions was not similar. Suppression of positive affect among Japanese people and self-enhancement (the general sensitivity to positive self-relevant information) among Dutch people may have caused decreased measurement accuracy. Hence, we should be cautious when interpreting low engagement scores among Japanese as well as high engagement scores among western employees.
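    The link between the test information function and the standard error of measurement can be sketched as follows. The sketch assumes dichotomous two-parameter logistic items for simplicity, whereas the UWES-9 uses polytomous items; the function names and item parameters are hypothetical.

```python
import math

def item_information(theta, a, b):
    """Fisher information of one 2PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def scale_information(theta, items):
    """Test information function: the sum of the item informations."""
    return sum(item_information(theta, a, b) for a, b in items)

def standard_error(theta, items):
    """Standard error of measurement: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(scale_information(theta, items))

# Hypothetical (a, b) pairs with thresholds clustered above the mean.
items = [(1.5, 0.5), (1.8, 1.0), (1.2, 1.5), (2.0, 0.8)]

for theta in (-2.0, 0.0, 1.0):
    info = scale_information(theta, items)
    sem = standard_error(theta, items)
    print(f"theta={theta:+.1f}  TIF={info:.3f}  SEM={sem:.3f}")
```

    With thresholds concentrated at higher trait levels, information is low and the standard error is large at the low end, which is the kind of pattern the abstract reports for the Japanese version.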

  10. Using item response theory to explore the psychometric properties of extended matching questions examination in undergraduate medical education

    Directory of Open Access Journals (Sweden)

    Lawton Gemma

    2005-03-01

    Background: As assessment has been shown to direct learning, it is critical that the examinations developed to test clinical competence in medical undergraduates are valid and reliable. The use of extended matching questions (EMQ) has been advocated to overcome some of the criticisms of using multiple-choice questions to test factual and applied knowledge. Methods: We analysed the results from the Extended Matching Questions Examination taken by 4th year undergraduate medical students in the academic year 2001 to 2002. Rasch analysis was used to examine whether the set of questions used in the examination mapped on to a unidimensional scale, the degree of difficulty of questions within and between the various medical and surgical specialties and the pattern of responses within individual questions to assess the impact of the distractor options. Results: Analysis of a subset of items and of the full examination demonstrated internal construct validity and the absence of bias on the majority of questions. Three main patterns of response selection were identified. Conclusion: Modern psychometric methods based upon the work of Rasch provide a useful approach to the calibration and analysis of EMQ undergraduate medical assessments. The approach allows for a formal test of the unidimensionality of the questions and thus the validity of the summed score. Given the metric calibration which follows fit to the model, it also allows for the establishment of item banks to facilitate continuity and equity in exam standards.

  11. Social Support Scale (MOS-SSS): Analysis of the Psychometric Properties via Item Response Theory

    Directory of Open Access Journals (Sweden)

    Daniela Sacramento Zanini

    The study of social relationships that influence health, as well as the development of reliable measures to assess this construct, has been highlighted in the academic literature. The aim of this study was to gather new evidence of validity based on the internal structure and reliability of the MOS-SSS, and to estimate item and participant parameters using item response theory. The sample consisted of 998 people (age: M = 27.18, SD = 9.90; 65.1% women) from different sampling strata. Confirmatory factor analysis (CFA) revealed better goodness of fit for the four-factor model than for the factor structures shown in other Brazilian studies. The multigroup CFA demonstrated invariance of the factor model across the different sampling strata. The partial credit model indicated items of medium difficulty, appropriate fit indices (infit/outfit), and desirable reliability for the factors. The analysis of the maps indicated the tool's strengths and limitations in assessing the construct.

  12. Racial/ethnic differences in responses to the everyday discrimination scale: a differential item functioning analysis.

    Science.gov (United States)

    Lewis, Tené T; Yang, Frances M; Jacobs, Elizabeth A; Fitchett, George

    2012-03-01

    The authors examined the impact of race/ethnicity on responses to the Everyday Discrimination Scale, one of the most widely used discrimination scales in epidemiologic and public health research. Participants were 3,295 middle-aged US women (African-American, Caucasian, Chinese, Hispanic, and Japanese) from the Study of Women's Health Across the Nation (SWAN) baseline examination (1996-1997). Multiple-indicator, multiple-cause models were used to examine differential item functioning (DIF) on the Everyday Discrimination Scale by race/ethnicity. After adjustment for age, education, and language of interview, meaningful DIF was observed for 3 (out of 10) items: "receiving poorer service in restaurants or stores," "being treated as if you are dishonest," and "being treated with less courtesy than other people" (all P's discrimination differed slightly for women of different racial/ethnic groups, with certain "public" experiences appearing to have more salience for African-American and Chinese women and "dishonesty" having more salience for racial/ethnic minority women overall. "Courtesy" appeared to have more salience for Hispanic women only in comparison with African-American women. Findings suggest that the Everyday Discrimination Scale could potentially be used across racial/ethnic groups as originally intended. However, researchers should use caution with items that demonstrated DIF.

  13. Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

    Science.gov (United States)

    He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei

    2013-01-01

    Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

  14. Measuring Consumers’ Environmental Responsibility: A Synthesis of Constructs and Measurement Scale Items

    Directory of Open Access Journals (Sweden)

    K. M. R. Taufique

    2014-04-01

    Consumption is central to all production. Without proper management, production and consumption are likely to be the main sources of environmental problems. This reality calls for consumers to be environmentally responsible in their consumption behavior. The objective of this paper is to synthesize the possible factors and measurement scale items that can be used to assess consumers’ environmental responsibility. To build this synthesis, the major works in the field were thoroughly reviewed. The paper arrives at a total of six parameters: knowledge and awareness, attitude, green consumer value, emotional affinity toward nature, willingness to act, and environment-related past behavior. This tentative yet inclusive set of parameters is expected to be useful in guiding the design of large-scale empirical research aimed at developing a dependable, inclusive set of parameters for testing consumers’ environmental responsibility. A conceptual model and possible measurement items are proposed for further empirical research.

  15. Which person variables predict how people benefit from True-False over Constructed Response items?

    Directory of Open Access Journals (Sweden)

    Stella Bollmann

    2015-06-01

    The aim of this study was the investigation of the variable Benefit from TF, which we assumed to be additionally measured when using True-False instead of Constructed Response tests. Subjects who benefit from True-False have an advantage over other subjects in answering Multiple Choice or True-False exams. We expected it to be related to partial knowledge and examined its relation to other personal abilities and traits in a total of n = 106 psychology students. They completed a statistics exam in Constructed Response and True-False format and benefit items were defined as those to which the associated constructed response answer was not correct. Additionally, verbal intelligence and Big 5 measures were obtained. Results confirm the existence of the person variable Benefit from TF and its relation to partial knowledge. Furthermore, benefiters differed from others in conscientiousness and openness to experience variables. However, contrary to expectations, they did not differ in verbal IQ.

  16. Bayesian Modal Estimation of the Four-Parameter Item Response Model in Real, Realistic, and Idealized Data Sets.

    Science.gov (United States)

    Waller, Niels G; Feuerstahler, Leah

    2017-03-17

    In this study, we explored item and person parameter recovery of the four-parameter model (4PM) in over 24,000 real, realistic, and idealized data sets. In the first analyses, we fit the 4PM and three alternative models to data from three Minnesota Multiphasic Personality Inventory-Adolescent form factor scales using Bayesian modal estimation (BME). Our results indicated that the 4PM fits these scales better than simpler item response theory (IRT) models. Next, using the parameter estimates from these real data analyses, we estimated 4PM item parameters in 6,000 realistic data sets to establish minimum sample size requirements for accurate item and person parameter recovery. Using a factorial design that crossed discrete levels of item parameters, sample size, and test length, we also fit the 4PM to an additional 18,000 idealized data sets to extend our parameter recovery findings. Our combined results demonstrated that 4PM item parameters and parameter functions (e.g., item response functions) can be accurately estimated using BME in moderate to large samples (N ⩾ 5,000) and person parameters can be accurately estimated in smaller samples (N ⩾ 1,000). In the supplemental files, we report annotated [Formula: see text] code that shows how to estimate 4PM item and person parameters in [Formula: see text] (Chalmers, 2012).
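    The four-parameter model extends the two-parameter logistic model with a lower asymptote and an upper asymptote. The sketch below uses the standard form of that item response function; the function name and parameter values are hypothetical, and the sketch does not reproduce the Bayesian modal estimation used in the study.

```python
import math

def irf_4pl(theta, a, b, c, d):
    """Four-parameter logistic item response function.

    a: discrimination, b: location, c: lower asymptote, d: upper asymptote.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: even very high-trait respondents endorse it at most 90%
# of the time, and very low-trait respondents still endorse it 10% of the time.
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(irf_4pl(theta, a=1.5, b=0.0, c=0.1, d=0.9), 3))
```

    Setting c = 0 and d = 1 recovers the two-parameter logistic model, and setting only d = 1 recovers the three-parameter model.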

  17. Harmonization of Neuroticism and Extraversion phenotypes across inventories and cohorts in the Genetics of Personality Consortium: an application of Item Response Theory

    DEFF Research Database (Denmark)

    van den Berg, S. M.; de Moor, M. H. M.; McGue, Matt

    2014-01-01

    -analyses can be employed. Within the Genetics of Personality Consortium, we demonstrate for two clinically relevant personality traits, Neuroticism and Extraversion, how Item-Response Theory (IRT) can be applied to map item data from different inventories to the same underlying constructs. Personality item...... data were analyzed in > 160,000 individuals from 23 cohorts across Europe, USA and Australia in which Neuroticism and Extraversion were assessed by nine different personality inventories. Results showed that harmonization was very successful for most personality inventories and moderately successful...... for some. Neuroticism and Extraversion inventories were largely measurement invariant across cohorts, in particular when comparing cohorts from countries where the same language is spoken. The IRT-based scores for Neuroticism and Extraversion were heritable (48 and 49 %, respectively, based on a meta...

  18. An item response theory analysis of self-report measures of adult attachment.

    Science.gov (United States)

    Fraley, R C; Waller, N G; Brennan, K A

    2000-02-01

    Self-report measures of adult attachment are typically scored in ways (e.g., averaging or summing items) that can lead to erroneous inferences about important theoretical issues, such as the degree of continuity in attachment security and the differential stability of insecure attachment patterns. To determine whether existing attachment scales suffer from scaling problems, the authors conducted an item response theory (IRT) analysis of 4 commonly used self-report inventories: Experiences in Close Relationships scales (K. A. Brennan, C. L. Clark, & P. R. Shaver, 1998), Adult Attachment Scales (N. L. Collins & S. J. Read, 1990), Relationship Styles Questionnaire (D. W. Griffin & K. Bartholomew, 1994) and J. Simpson's (1990) attachment scales. Data from 1,085 individuals were analyzed using F. Samejima's (1969) graded response model. The authors' findings indicate that commonly used attachment scales can be improved in a number of important ways. Accordingly, the authors show how IRT techniques can be used to develop new attachment scales with desirable psychometric properties.
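    Samejima's graded response model, named in this abstract, defines the probability of each ordered response category as the difference between adjacent cumulative logistic curves. The sketch below shows that computation for a single item; the function name, discrimination, and thresholds are hypothetical and are not taken from the attachment scales analyzed.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one graded-response item.

    thresholds must be strictly increasing; K - 1 thresholds give K categories.
    """
    # Cumulative probabilities of responding in category k or higher.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

    Because the category probabilities depend on the trait level in a nonlinear way, equal differences in summed or averaged item scores need not correspond to equal differences on the latent trait, which is the scaling concern the abstract raises.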

  19. Incorporating Mobility in Growth Modeling for Multilevel and Longitudinal Item Response Data.

    Science.gov (United States)

    Choi, In-Hee; Wilson, Mark

    2016-01-01

    Multilevel data often cannot be represented by the strict form of hierarchy typically assumed in multilevel modeling. A common example is the case in which subjects change their group membership in longitudinal studies (e.g., students transfer schools; employees transition between different departments). In this study, cross-classified and multiple membership models for multilevel and longitudinal item response data (CCMM-MLIRD) are developed to incorporate such mobility, focusing on students' school change in large-scale longitudinal studies. Furthermore, we investigate the effect of incorrectly modeling school membership in the analysis of multilevel and longitudinal item response data. Two types of school mobility are described, and corresponding models are specified. Results of the simulation studies suggested that appropriate modeling of the two types of school mobility using the CCMM-MLIRD yielded good recovery of the parameters and improvement over models that did not incorporate mobility properly. In addition, the consequences of incorrectly modeling the school effects on the variance estimates of the random effects and the standard errors of the fixed effects depended upon mobility patterns and model specifications. Two sets of large-scale longitudinal data are analyzed to illustrate applications of the CCMM-MLIRD for each type of school mobility.

  20. Mathematical literacy examination items and student errors: An analysis of English Second Language students’ responses

    Directory of Open Access Journals (Sweden)

    Pamela Vale

    2013-04-01

    Mathematical literacy is a real-world practical attribute, yet students write a high-stakes examination in order to pass the subject Mathematical Literacy in the National Certificates (Vocational) (NC(V)). In these examinations, all sources of information are contextualised in language. It can be effortful for English second language students to decode text. The deliberate processing that is required saturates working memory and prevents these students from optimally engaging in problem solving. In this study, 15 items from an NC(V) Level 4 Mathematical Literacy examination are selected, as well as 15 student responses to each of these questions. From these responses, those which are incorrect are analysed to determine whether the error is due to insufficient mathematical literacy or a lack of English language proficiency. These results are used as an indication as to whether the examination is fair and valid for this group of students.

  1. Disaster Preparedness and Response: Applied Exposure Science

    Science.gov (United States)

    In 2007, the ISEA, predecessor to ISES, held a special roundtable to discuss lessons learned for exposure science during and following environmental disasters, especially the 9/11 attacks and Hurricane Katrina. Since then, environmental agencies have been involved in responses to...

  2. Capturing abnormal personality with normal personality inventories: an item response theory approach.

    Science.gov (United States)

    Walton, Kate E; Roberts, Brent W; Krueger, Robert F; Blonigen, Daniel M; Hicks, Brian M

    2008-12-01

    Correlational and factor-analytic methods indicate that abnormal and normal personality constructs may be tapping the same underlying latent trait. However, they do not systematically demonstrate that measures of abnormal personality capture more extreme ranges of the latent trait than measures of normal range personality. Item Response Theory (IRT) methods, in contrast, do provide this information. In the present study, we use IRT methods to evaluate the range of the latent trait assessed with a normal personality measure and a measure of psychopathy as one example of an abnormal personality construct. Contrary to the expectation that the measure of psychopathy would be more extreme than the measure of normal personality traits, the measures overlapped substantially in terms of the regions of the latent trait for which they provide information. Moreover, both types of inventories were limited in terms of measurement bandwidth, such that they did not provide information across the entire latent trait continuum. Implications and future directions are discussed.

  3. Construction of a memory battery for computerized administration, using item response theory.

    Science.gov (United States)

    Ferreira, Aristides I; Almeida, Leandro S; Prieto, Gerardo

    2012-10-01

    In accordance with Item Response Theory, a computer memory battery with six tests was constructed for use in the Portuguese adult population. A factor analysis was conducted to assess the internal structure of the tests (N = 547 undergraduate students). Following the literature, several confirmatory factor models were evaluated. Results showed better fit of a model with two independent latent variables corresponding to verbal and non-verbal factors, reproducing the initial battery organization. Internal consistency reliabilities for the six tests ranged from alpha = .72 to .89. IRT analyses (Rasch and partial credit models) yielded good Infit and Outfit measures and high precision for parameter estimation. The potential utility of these memory tasks for psychological research and practice will be discussed.

  4. Evaluation of Buss-Perry aggression Questionnaire with item response theory (IRT)

    Directory of Open Access Journals (Sweden)

    Dinić Bojana

    2012-01-01

    The aim of this research was to examine the psychometric properties of the Buss-Perry Aggression Questionnaire (AQ) in a Serbian sample, using the IRT model for graded responses. The AQ contains four subscales: Physical aggression, Verbal aggression, Hostility and Anger. The sample included 1272 participants of both genders, aged 18 to 68 years, with an average age of 31.39 years (SD = 12.63). Results of the IRT analysis suggested that the subscales were more informative in the range of above-average scores, namely for participants with higher levels of aggressiveness. The exception was the Hostility subscale, which was informative across a wider range of the trait. On the other hand, this subscale contains two items that violate the assumption of homogeneity. Implications for the measurement of aggressiveness are discussed.

  5. Quantifying diagnostic uncertainty using item response theory: the Posterior Probability of Diagnosis Index.

    Science.gov (United States)

    Lindhiem, Oliver; Kolko, David J; Yu, Lan

    2013-06-01

    Using traditional Diagnostic and Statistical Manual of Mental Disorders, fourth edition, text revision (American Psychiatric Association, 2000) diagnostic criteria, clinicians are forced to make categorical decisions (diagnosis vs. no diagnosis). This forced choice implies that mental and behavioral health disorders are categorical and does not fully characterize varying degrees of uncertainty associated with a particular diagnosis. Using an item response theory (latent trait model) framework, we describe the development of the Posterior Probability of Diagnosis (PPOD) Index, which answers the question: What is the likelihood that a patient meets or exceeds the latent trait threshold for a diagnosis? The PPOD Index is based on the posterior distribution of θ (latent trait score) for each patient's profile of symptoms. The PPOD Index allows clinicians to quantify and communicate the degree of uncertainty associated with each diagnosis in probabilistic terms. We illustrate the advantages of the PPOD Index in a clinical sample (N = 321) of children and adolescents with oppositional defiant disorder.
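    The idea behind a posterior probability of diagnosis can be sketched as a grid approximation: combine a prior on θ with the likelihood of a patient's symptom profile, then sum the posterior mass at or above the diagnostic cutoff. The sketch assumes a two-parameter logistic likelihood and a standard normal prior, and the function name, item parameters, and cutoff are hypothetical; it is not the authors' implementation of the PPOD Index.

```python
import math

def posterior_prob_of_diagnosis(responses, items, cutoff, grid_step=0.01):
    """P(theta >= cutoff | responses) under a 2PL likelihood and N(0, 1) prior."""
    above = 0.0
    total = 0.0
    n_points = int(round(8.0 / grid_step)) + 1   # grid spans [-4, 4]
    for i in range(n_points):
        theta = -4.0 + i * grid_step
        weight = math.exp(-0.5 * theta * theta)   # unnormalised prior density
        for x, (a, b) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            weight *= p if x == 1 else 1.0 - p
        total += weight
        if theta >= cutoff:
            above += weight
    return above / total

# Hypothetical (discrimination, threshold) pairs for four symptoms.
items = [(1.5, -0.5), (2.0, 0.0), (1.0, 0.5), (1.8, 1.0)]

ppod_all = posterior_prob_of_diagnosis([1, 1, 1, 1], items, cutoff=1.0)
ppod_none = posterior_prob_of_diagnosis([0, 0, 0, 0], items, cutoff=1.0)
print(round(ppod_all, 3), round(ppod_none, 3))
```

    A profile with every symptom endorsed yields a higher value than a profile with none endorsed, and intermediate profiles fall in between, which is what lets the index express diagnostic uncertainty as a probability rather than a yes/no decision.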

  6. Multi-Sensory Cognitive Learning as Facilitated in a Multimedia Tutorial for Item Response Theory

    Directory of Open Access Journals (Sweden)

    Chong Ho Yu

    2007-08-01

    The objective of this paper is to introduce an application of multi-sensory cognitive learning theory to the development of a multimedia tutorial for Item Response Theory. Cognitive multimedia theory suggests that visual and auditory material should be presented simultaneously to reinforce the retention of learned material. A computer-assisted module was designed based on this theory, and an experiment was conducted to examine the effect of audio type (human audio, computer audio, and no audio) on learner performance as measured by an objective test. It was found that while there is no significant performance gap between the human audio and the no audio groups, both groups substantively outperform the computer audio group. A plausible explanation is that unnatural audio requires additional cognitive effort to process, and this distraction affects performance.

  7. The Retrospect and Prospect of Non-parametric Item Response Theory

    Institute of Scientific and Technical Information of China (English)

    陈婧; 康春花; 钟晓玲

    2013-01-01

    Compared with parametric item response theory, non-parametric item response theory provides a theoretical framework that better matches practical settings. Current research on non-parametric item response theory focuses mainly on parameter estimation methods and their comparison and on the verification of data-model fit, while applied research concentrates on scale revision and on the analysis of personalized data and differential item functioning. Non-parametric cognitive diagnostic theory, developed on the basis of cognitive diagnostic theory, further highlights the advantages of the approach in application. Future studies should place more emphasis on the practical application of non-parametric item response theory, and research on non-parametric cognitive diagnostic theory also deserves attention, so that the advantages of non-parametric methods in practice can be fully realized.

  8. Item Response Theory Analyses of the Parent and Teacher Ratings of the DSM-IV ADHD Rating Scale

    Science.gov (United States)

    Gomez, Rapson

    2008-01-01

    The graded response model (GRM), which is based on item response theory (IRT), was used to evaluate the psychometric properties of the inattention and hyperactivity/impulsivity symptoms in an ADHD rating scale. To accomplish this, parents and teachers completed the DSM-IV ADHD Rating Scale (DARS; Gomez et al., "Journal of Child Psychology and…

  9. A psychometric analysis of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF) using item response theory.

    Science.gov (United States)

    Cooper, Andrew; Petrides, K V

    2010-09-01

    Trait emotional intelligence refers to a constellation of emotional self-perceptions located at the lower levels of personality hierarchies. In 2 studies, we sought to examine the psychometric properties of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF; Petrides, 2009) using item response theory (IRT). Study 1 (N= 1,119, 455 men) showed that most items had good discrimination and threshold parameters and high item information values. At the global level, the TEIQue-SF showed very good precision across most of the latent trait range. Study 2 (N= 866, 432 men) used similar IRT techniques in a new sample based on the latest version of the TEIQue-SF (version 1.50). Results replicated Study 1, with the instrument showing good psychometric properties at the item and global level. Overall, the 2 studies suggest the TEIQue-SF can be recommended when a rapid assessment of trait emotional intelligence is required.

  10. Analysis of Differential Item Functioning in the NAEP History Assessment.

    Science.gov (United States)

    Zwick, Rebecca; Ercikan, Kadriye

    The Mantel-Haenszel approach for investigating differential item functioning (DIF) was applied to U.S. history items that were administered as part of the National Assessment of Educational Progress (NAEP). DIF analyses were based on the responses of 7,743 students in grade 11. On some items, Blacks, Hispanics, and females performed more poorly…

  11. Use of Item Response Theory to Examine a Cardiovascular Health Knowledge Measure for Adolescents with Elevated Blood Pressure

    Directory of Open Access Journals (Sweden)

    Stephanie L. Fitzpatrick

    2012-10-01

    The purpose of this study was to assess the psychometric properties of a cardiovascular health knowledge measure for adolescents using item response theory. The measure was developed in the context of a cardiovascular lifestyle intervention for adolescents with elevated blood pressure. The sample consisted of 167 adolescents (mean age = 16.2 years) who completed the Cardiovascular Health Knowledge Assessment (CHKA), a 34-item multiple choice test, at baseline and post-intervention. The CHKA was unidimensional and internal consistency was .65 at pretest and .74 at posttest. Rasch analysis results indicated that at pretest the items targeted adolescents with variable levels of health knowledge. However, based on results at posttest, additional hard items are needed to account for the increase in level of cardiovascular health knowledge at post-intervention. Change in knowledge scores was examined using Rasch analysis. Findings indicated there was significant improvement in health knowledge over time [t(119) = -10.3, p < .0001]. In summary, the CHKA appears to contain items that are good approximations of the construct cardiovascular health knowledge and items that target adolescents with moderate levels of knowledge.  DOI: 10.2458/azu_jmmss.v3i1.16111

  12. Test of item-response bias in the CES-D scale. Experience from the New Haven EPESE study.

    Science.gov (United States)

    Cole, S R; Kawachi, I; Maller, S J; Berkman, L F

    2000-03-01

    We present results of item-response bias analyses of the exogenous variables age, gender, and race for all items from the Center for Epidemiologic Studies Depression (CES-D) scale using data (N = 2340) from the New Haven component of the Established Populations for Epidemiologic Studies of the Elderly (EPESE). The proportional odds of blacks responding higher on the CES-D items "people are unfriendly" and "people dislike me" were 2.29 (95% confidence interval: 1.74, 3.02) and 2.96 (95% confidence interval: 2.15, 4.07) times that of whites matched on overall depressive symptoms, respectively. In addition, the proportional odds of women responding higher on the CES-D item "crying spells" were 2.14 (95% confidence interval: 1.60, 2.82) times that of men matched on overall depressive symptoms. Our data indicate the CES-D would have greater validity among this diverse group of older men and women after removal of the crying item and two interpersonal items.

  13. Proposal of tool to assess the satisfaction of bank customers using the Item Response Theory

    Directory of Open Access Journals (Sweden)

    Alceu Balbim Junior

    2011-01-01

    This paper presents a measurement instrument for assessing the satisfaction of bank customers using Item Response Theory (IRT). Satisfying customers has been a constant pursuit of organizations seeking to remain competitive in the market. Studies have reported a relationship between the quality perceived by customers, satisfaction, and loyalty. Satisfaction can be assessed through the quality perceived by customers, and the development of assessment tools should address the specific features of the activity in question. Based on articles that assess the satisfaction of bank customers, an instrument consisting of 29 items is proposed. The items were administered to 240 customers to assess their satisfaction with the bank with which they do most of their business. Using Item Response Theory, the item parameters and the information curve were identified. The analysis of the items' degree of discrimination indicated that all are appropriate. The information curve obtained showed the interval in which the instrument provides the best estimates of satisfaction levels. The study reported the mean satisfaction level of the sample and the concentration of customers at the different satisfaction levels of the scale.

  14. Item-response-theory analysis of two scales for self-efficacy for exercise behavior in people with arthritis.

    Science.gov (United States)

    Mielenz, Thelma J; Edwards, Michael C; Callahan, Leigh F

    2011-07-01

    Benefits of physical activity for those with arthritis are clear, yet physical activity is difficult to initiate and maintain. Self-efficacy is a key modifiable psychosocial determinant of physical activity. This study examined two scales for self-efficacy for exercise behavior (SEEB) to identify their strengths and weaknesses using item response theory (IRT) from community-based randomized controlled trials of physical activity programs in adults with arthritis. The 2 SEEB scales included the 9-item scale by Resnick developed with older adults and the 5-item scale by Marcus developed with employed adults. All IRT analyses were conducted using the graded-response model. IRT assumptions were assessed using both exploratory and confirmatory factor analysis. The IRT analyses indicated that these scales are precise and reliable measures for identifying people with arthritis and low SEEB. The Resnick SEEB scale is slightly more precise at lower levels of self-efficacy in older adults with arthritis.
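
    For readers unfamiliar with the graded-response model named above, the following is a minimal sketch of how category probabilities are obtained for one ordered-response item in the logistic form of the model. The function name and all parameter values are hypothetical and do not come from the SEEB analyses.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the logistic graded-response model.

    `a` is the item discrimination and `thresholds` are the ordered
    between-category thresholds b_1 < b_2 < ... < b_m (illustrative values).
    Returns m + 1 probabilities, one per response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = m + 1)
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b_k))) for b_k in thresholds]
    cum.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical 4-category item evaluated at theta = 0.0
probs = grm_category_probs(0.0, a=1.8, thresholds=[-1.0, 0.0, 1.2])
print(probs, sum(probs))
```

    Because the thresholds are ordered, the cumulative curves do not cross and every category probability is non-negative.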

  15. Bifactor and Item Response Theory Analyses of Interviewer Report Scales of Cognitive Impairment in Schizophrenia

    Science.gov (United States)

    Reise, Steven P.; Ventura, Joseph; Keefe, Richard S. E.; Baade, Lyle E.; Gold, James M.; Green, Michael F.; Kern, Robert S.; Mesholam-Gately, Raquelle; Nuechterlein, Keith H.; Seidman, Larry J.; Bilder, Robert

    2011-01-01

    A psychometric analysis of 2 interview-based measures of cognitive deficits was conducted: the 21-item Clinical Global Impression of Cognition in Schizophrenia (CGI-CogS; Ventura et al., 2008), and the 20-item Schizophrenia Cognition Rating Scale (SCoRS; Keefe et al., 2006), which were administered on 2 occasions to a sample of people with…

  16. Interpreting gains and losses in conceptual test using Item Response Theory

    CERN Document Server

    Lamine, Brahim

    2015-01-01

    Conceptual tests are widely used by physics instructors to assess students' conceptual understanding and compare teaching methods. It is common to look at students' changes in their answers between a pre-test and a post-test to quantify a transition in students' conceptions. This is often done by looking at the proportion of incorrect answers in the pre-test that changes to correct answers in the post-test (the gain) and the proportion of correct answers that changes to incorrect answers (the loss). By comparing theoretical predictions to experimental data on the Force Concept Inventory, we show that Item Response Theory (IRT) is able to predict the observed gains and losses fairly well. We then use IRT to quantify the students' changes in a test-retest situation when no learning occurs and show that (i) up to 25% of total answers can change due to the non-deterministic nature of students' answers and that (ii) gains and losses can go from 0% to 100%. Still using IRT, we highlight the conditions tha...

  17. A note on monotone likelihood ratio of the total score variable in unidimensional item response theory.

    Science.gov (United States)

    Unlü, Ali

    2008-05-01

    This note provides a direct, elementary proof of the fundamental result on monotone likelihood ratio of the total score variable in unidimensional item response theory (IRT). This result is very important for practical measurement in IRT, because it justifies the use of the total score variable to order participants on the latent trait. The proof relies on a basic inequality for elementary symmetric functions which is proved by means of few purely algebraic, straightforward transformations. In particular, flaws in a proof of this result by Huynh [(1994). A new proof for monotone likelihood ratio for the sum of independent Bernoulli random variables. Psychometrika, 59, 77-79] are pointed out and corrected, and a natural generalization of the fundamental result to non-linear (quasi-ordered) latent trait spaces is presented. This may be useful for multidimensional IRT or knowledge space theory, in which the latent 'ability' spaces are partially ordered with respect to, for instance, coordinate-wise vector-ordering or set-inclusion, respectively.
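
    The monotone likelihood ratio property described above can be checked numerically. The sketch below assumes locally independent binary items with two-parameter logistic response functions; it builds the total-score distribution at two trait values and tests whether the ratio of the two distributions is nondecreasing in the score. The function names and item parameters are hypothetical, and the check illustrates the property rather than reproducing the proof in the note.

```python
import math


def irf(theta, a, b):
    """2PL item response function (an assumed, monotone increasing form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def total_score_dist(probs):
    """Distribution of the sum of independent Bernoulli variables."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # item answered incorrectly
            new[k + 1] += mass * p       # item answered correctly
        dist = new
    return dist


def has_mlr(items, theta_lo, theta_hi):
    """True if P(score = k | theta_hi) / P(score = k | theta_lo) is nondecreasing in k."""
    d_lo = total_score_dist([irf(theta_lo, a, b) for a, b in items])
    d_hi = total_score_dist([irf(theta_hi, a, b) for a, b in items])
    ratios = [h / l for h, l in zip(d_hi, d_lo)]
    return all(r2 >= r1 * (1.0 - 1e-9) for r1, r2 in zip(ratios, ratios[1:]))


# Hypothetical (discrimination, difficulty) pairs
items = [(1.0, -1.0), (1.5, 0.0), (0.7, 1.2)]
print(has_mlr(items, theta_lo=-0.5, theta_hi=1.0))
```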

  18. A multidimensional item response model : Constrained latent class analysis using the Gibbs sampler and posterior predictive checks

    NARCIS (Netherlands)

    Hoijtink, H; Molenaar, IW

    1997-01-01

    In this paper it will be shown that a certain class of constrained latent class models may be interpreted as a special case of nonparametric multidimensional item response models. The parameters of this latent class model will be estimated using an application of the Gibbs sampler. It will be illust

  19. Negative affectivity and social inhibition in cardiovascular disease: evaluating type-D personality and its assessment using item response theory

    NARCIS (Netherlands)

    Emons, Wilco H.M.; Meijer, Rob R.; Denollet, Johan

    2007-01-01

    Objective: Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI)—referred to as type-D personality—are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The obje

  20. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    Science.gov (United States)

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  1. Computer adaptive practice of Maths ability using a new item response model for on the fly ability and difficulty estimation

    NARCIS (Netherlands)

    Klinkenberg, S.; Straatemeier, M.; van der Maas, H.L.J.

    2011-01-01

    In this paper we present a model for computerized adaptive practice and monitoring. This model is used in the Maths Garden, a web-based monitoring system, which includes a challenging web environment for children to practice arithmetic. Using a new item response model based on the Elo (1978) rating

  2. Development of new physical activity and sedentary behavior change self-efficacy questionnaires using item response modeling

    Science.gov (United States)

    Theoretically, increased levels of physical activity self-efficacy (PASE) should lead to increased physical activity, but few studies have reported this effect among youth. This failure may be at least partially attributable to measurement limitations. In this study, Item Response Modeling (IRM) was...

  3. Development of an Abbreviated Social Phobia and Anxiety Inventory (SPAI) Using Item Response Theory: The SPAI-23

    Science.gov (United States)

    Roberson-Nay, Roxann; Strong, David R.; Nay, William T.; Beidel, Deborah C.; Turner, Samuel M.

    2007-01-01

    An abbreviated version of the Social Phobia and Anxiety Inventory (SPAI) was developed using methods based in nonparametric item response theory. Participants included a nonclinical sample of 1,482 undergraduates (52% female, mean age = 19.4 years) as well as a clinical sample of 105 individuals (56% female, mean age = 36.4 years) diagnosed with…

  4. An Exploratory Study of the Applicability of Item Response Theory Methods to the Graduate Management Admission Test.

    Science.gov (United States)

    Kingston, Neal; And Others

    A necessary prerequisite to the operational use of item response theory (IRT) in any testing program is the investigation of the feasibility of such an approach. This report presents the results of such research for the Graduate Management Admission Test (GMAT). Despite the fact that GMAT data appear to violate a basic assumption of the…

  5. Increasing the Number of Replications in Item Response Theory Simulations: Automation through SAS and Disk Operating System

    Science.gov (United States)

    Gagne, Phill; Furlow, Carolyn; Ross, Terris

    2009-01-01

    In item response theory (IRT) simulation research, it is often necessary to use one software package for data generation and a second software package to conduct the IRT analysis. Because this can substantially slow down the simulation process, it is sometimes offered as a justification for using very few replications. This article provides…

  6. Modeling Unproductive Behavior in Online Homework in Terms of Latent Student Traits: An Approach Based on Item Response Theory

    Science.gov (United States)

    Gönülateş, Emre; Kortemeyer, Gerd

    2016-10-01

    Homework is an important component of most physics courses. One of the functions it serves is to provide meaningful formative assessment in preparation for examinations. However, correlations between homework and examination scores tend to be low, likely due to unproductive student behavior such as copying and random guessing of answers. In this study, we attempt to model these two counterproductive learner behaviors within the framework of Item Response Theory in order to provide an ability measurement that strongly correlates with examination scores. We find that introducing additional item parameters leads to worse predictions of examination grades, while introducing additional learner traits is a more promising approach.

  7. The nature of phonological awareness throughout the elementary grades: An item response theory perspective

    NARCIS (Netherlands)

    Vloedgraven, J.M.T.; Verhoeven, L.T.W.

    2009-01-01

    In the present study, the nature of Dutch children's phonological awareness was examined throughout the elementary school grades. Phonological awareness was assessed using five different sets of items that measured rhyming, phoneme identification, phoneme blending, phoneme segmentation, and phoneme

  8. The Effect of Using Different Weights for Multiple-Choice and Free-Response Item Sections

    Science.gov (United States)

    Hendrickson, Amy; Patterson, Brian; Melican, Gerald

    2008-01-01

    Presented at the Annual National Council on Measurement in Education (NCME) in New York in March 2008. This presentation explores how different item weighting can affect the effective weights, validity coefficients and test reliability of composite scores among test takers.

  9. Evaluating and Refining the Construct of Sexual Quality With Item Response Theory: Development of the Quality of Sex Inventory.

    Science.gov (United States)

    Shaw, Amanda M; Rogge, Ronald D

    2016-02-01

    This study took a critical look at the construct of sexual quality. The 65 items of four well-validated self-report measures of sexual satisfaction (the Index of Sexual Satisfaction [ISS], Hudson, Harrison, & Crosscup, 1981; the Global Measure of Sexual Satisfaction [GMSEX], Lawrance & Byers, 1995; the Pinney Sexual Satisfaction Inventory [PSSI], Pinney, Gerrard, & Denney, 1987; the Young Sexual Satisfaction Scale [YSSS], Young, Denny, Luquis, & Young, 1998) and an additional 74 potential sexual quality items were given to 3060 online participants. Using Item Response Theory (IRT), we demonstrated that the ISS, YSSS, and PSSI scales provided suboptimal levels of precision in assessing sexual quality, particularly given the length of those scales. Exploratory factor analyses, IRT, differential item functioning analyses, and longitudinal responsiveness analyses were used to develop and evaluate the Quality of Sex Inventory. Results suggested that, in comparison to existing scales, the QSI (1) offers investigators and clinicians more theoretically focused scales, (2) distinguishes sexual satisfaction from sexual dissatisfaction, and (3) offers greater precision and power for detecting differences with (4) comparably high levels of responsiveness for detecting change over time despite being notably shorter than most of the existing scales. The QSI-satisfaction subscales demonstrated strong convergent validity with other measures of sexual satisfaction and excellent construct validity with anchor scales from the nomological net surrounding that construct, suggesting that they continue to assess the same theoretical construct as prior scales. Implications for research are discussed.

  10. Applying the Nominal Response Model within a Longitudinal Framework to Construct the Positive Family Relationships Scale

    Science.gov (United States)

    Preston, Kathleen Suzanne Johnson; Parral, Skye N.; Gottfried, Allen W.; Oliver, Pamella H.; Gottfried, Adele Eskeles; Ibrahim, Sirena M.; Delany, Danielle

    2015-01-01

    A psychometric analysis was conducted using the nominal response model under the item response theory framework to construct the Positive Family Relationships scale. Using data from the Fullerton Longitudinal Study, this scale was constructed within a long-term longitudinal framework spanning middle childhood through adolescence. Items tapping…

  11. Reversed item bias: an integrative model.

    Science.gov (United States)

    Weijters, Bert; Baumgartner, Hans; Schillewaert, Niels

    2013-09-01

    In the recent methodological literature, various models have been proposed to account for the phenomenon that reversed items (defined as items for which respondents' scores have to be recoded in order to make the direction of keying consistent across all items) tend to lead to problematic responses. In this article we propose an integrative conceptualization of three important sources of reversed item method bias (acquiescence, careless responding, and confirmation bias) and specify a multisample confirmatory factor analysis model with 2 method factors to empirically test the hypothesized mechanisms, using explicit measures of acquiescence and carelessness and experimentally manipulated versions of a questionnaire that varies 3 item arrangements and the keying direction of the first item measuring the focal construct. We explain the mechanisms, review prior attempts to model reversed item bias, present our new model, and apply it to responses to a 4-item self-esteem scale (N = 306) and the 6-item Revised Life Orientation Test (N = 595). Based on the literature review and the empirical results, we formulate recommendations on how to use reversed items in questionnaires.

  12. The Nature of Phonological Awareness throughout the Elementary Grades: An Item Response Theory Perspective

    Science.gov (United States)

    Vloedgraven, Judith; Verhoeven, Ludo

    2009-01-01

    In the present study, the nature of Dutch children's phonological awareness was examined throughout the elementary school grades. Phonological awareness was assessed using five different sets of items that measured rhyming, phoneme identification, phoneme blending, phoneme segmentation, and phoneme deletion. A sample of 1405 children from…

  13. A Substantive Process Analysis of Responses to Items from the Multistate Bar Examination

    Science.gov (United States)

    Bonner, Sarah M.; D'Agostino, Jerome V.

    2012-01-01

    We investigated examinees' cognitive processes while they solved selected items from the Multistate Bar Exam (MBE), a high-stakes professional certification examination. We focused on ascertaining those mental processes most frequently used by examinees, and the most common types of errors in their thinking. We compared the relationships between…

  14. Partially Compensatory Multidimensional Item Response Theory Models: Two Alternate Model Forms

    Science.gov (United States)

    DeMars, Christine E.

    2016-01-01

    Partially compensatory models may capture the cognitive skills needed to answer test items more realistically than compensatory models, but estimating the model parameters may be a challenge. Data were simulated to follow two different partially compensatory models, a model with an interaction term and a product model. The model parameters were…

  15. World Health Organization Quality-of-Life Scale (WHOQOL-BREF): Analyses Of Their Item Response Theory Properties Based On The Graded Responses Model

    Directory of Open Access Journals (Sweden)

    Shahrum Vahedi

    2010-11-01

    Objective: This study has used Item Response Theory (IRT) to examine the psychometric properties of Health-Related Quality-of-Life. Method: This investigation is a descriptive-analytic study. Subjects were 370 undergraduate students of nursing and midwifery who were selected from Tabriz University of Medical Sciences. All participants were asked to complete the Farsi version of WHOQOL-BREF. Samejima's graded response model was used for the analyses. Results: The results revealed that the discrimination parameters for all items in the four scales were low to moderate. The threshold parameters showed adequate representation of the relevant traits from low to the mean trait level. With the exception of items 15, 18, 24 and 26, all other items showed low item information function values, and thus relatively high reliability from low trait levels to moderate levels. Conclusions: The results of this study indicate that although there was general support for the psychometric properties of the WHOQOL-BREF from an IRT perspective, this measure can be further improved. IRT analyses provided useful measurement information and proved to be a better methodological approach for enhancing our knowledge of the functionality of WHOQOL-BREF.

  16. World Health Organization Quality-of-Life Scale (WHOQOL-BREF): Analyses of Their Item Response Theory Properties Based on the Graded Responses Model

    Science.gov (United States)

    2010-01-01

    Objective: This study has used Item Response Theory (IRT) to examine the psychometric properties of Health-Related Quality-of-Life. Method: This investigation is a descriptive-analytic study. Subjects were 370 undergraduate students of nursing and midwifery who were selected from Tabriz University of Medical Sciences. All participants were asked to complete the Farsi version of WHOQOL-BREF. Samejima's graded response model was used for the analyses. Results: The results revealed that the discrimination parameters for all items in the four scales were low to moderate. The threshold parameters showed adequate representation of the relevant traits from low to the mean trait level. With the exception of items 15, 18, 24 and 26, all other items showed low item information function values, and thus relatively high reliability from low trait levels to moderate levels. Conclusions: The results of this study indicate that although there was general support for the psychometric properties of the WHOQOL-BREF from an IRT perspective, this measure can be further improved. IRT analyses provided useful measurement information and proved to be a better methodological approach for enhancing our knowledge of the functionality of WHOQOL-BREF. PMID:22952508

  17. A responsible agenda for applied linguistics: Confessions of a philosopher

    Directory of Open Access Journals (Sweden)

    Albert Weideman

    2011-08-01

    When we undertake academic, disciplinary work, we rely on philosophical starting points. Several straightforward illustrations of this can be found in the history of applied linguistics. It is evident from the history of our field that various historically influential approaches to our discipline base themselves upon different academic confessions. This paper examines the effects of basing our applied linguistic work on the idea that applied linguistics is a discipline concerned with design. Such a characterisation does justice to both modernist and postmodernist emphases in applied linguistics. Conceptualisations of applied linguistics that came with the proposals for communicative language teaching (CLT) some thirty to forty years ago propelled the discipline squarely into postmodern times. To account for this, we need to develop a theory of applied linguistics which shows what constitutive and regulative conditions exist for doing applied linguistic designs. A responsible agenda for applied linguistics today has as its first responsibility to free the users of its designs from toil and drudgery, as well as from becoming victims of fashion, ideology or theory. Secondly, it should design solutions to language problems in such a way that the technical imagination of the designer is not restricted but supported by theory and empirical investigation, and that the productive pedagogical fantasy of the implementers of such plans is set free. Thirdly, it must seek to become accountable by designing theoretically and socially defensible solutions to language problems, solutions that relieve some of the suffering, pain, poverty and injustice in our world.

  18. Assessing Understanding of the Concept of Function: A Study Comparing Prospective Secondary Mathematics Teachers' Responses to Multiple-Choice and Constructed-Response Items

    Science.gov (United States)

    Feeley, Susan Jane

    2013-01-01

    The purpose of this study was to determine whether multiple-choice and constructed-response items assessed prospective secondary mathematics teachers' understanding of the concept of function. The conceptual framework for the study was the Dreyfus and Eisenberg (1982) Function Block. The theoretical framework was Sierpinska's (1992, 1994)…

  19. Development of new physical activity and sedentary behavior change self-efficacy questionnaires using item response modeling

    Directory of Open Access Journals (Sweden)

    Venditti Elizabeth

    2009-03-01

    Background: Theoretically, increased levels of physical activity self-efficacy (PASE) should lead to increased physical activity, but few studies have reported this effect among youth. This failure may be at least partially attributable to measurement limitations. In this study, Item Response Modeling (IRM) was used to develop new physical activity and sedentary behavior change self-efficacy scales. The validity of the new scales was compared with accelerometer assessments of physical activity and sedentary behavior. Methods: New PASE and sedentary behavior change (TV viewing, computer video game use, and telephone use) self-efficacy items were developed. The scales were completed by 714 6th-grade students in seven US cities. A limited number of participants (83) also wore an accelerometer for five days and provided at least 3 full days of complete data. The new scales were analyzed using Classical Test Theory (CTT) and IRM; a reduced set of items was produced with IRM and correlated with accelerometer counts per minute and minutes of sedentary, light and moderate to vigorous activity per day after school. Results: The PASE items discriminated between high and low levels of PASE. Full and reduced scales were weakly correlated (r = 0.18) with accelerometer counts per minute after school for boys, with comparable associations for girls. Weaker correlations were observed between PASE and minutes of moderate to vigorous activity (r = 0.09 to 0.11). The uni-dimensionality of the sedentary scales was established by both exploratory factor analysis and the fit of items to the underlying variable, and reliability was assessed across the length of the underlying variable with some limitations. The reduced sedentary behavior scales had poor reliability. The full scales were moderately correlated with light intensity physical activity after school (r = 0.17 to 0.33) and sedentary behavior (r = -0.29 to -0.12) among the boys, but not for girls. Conclusion: New

  20. Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

    Science.gov (United States)

    Sueiro, Manuel J.; Abad, Francisco J.

    2011-01-01

    The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

  1. Recovery of Item Parameters in the Nominal Response Model: A Comparison of Marginal Maximum Likelihood Estimation and Markov Chain Monte Carlo Estimation.

    Science.gov (United States)

    Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun

    2002-01-01

    Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)

  2. Estimation of a Ramsay-Curve Item Response Theory Model by the Metropolis-Hastings Robbins-Monro Algorithm. CRESST Report 834

    Science.gov (United States)

    Monroe, Scott; Cai, Li

    2013-01-01

    In Ramsay curve item response theory (RC-IRT, Woods & Thissen, 2006) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's (1981) EM algorithm, which yields maximum marginal likelihood estimates. This method, however,…

  3. Testing whether the DSM-5 personality disorder trait model can be measured with a reduced set of items: An item response theory investigation of the Personality Inventory for DSM-5.

    Science.gov (United States)

    Maples, Jessica L; Carter, Nathan T; Few, Lauren R; Crego, Cristina; Gore, Whitney L; Samuel, Douglas B; Williamson, Rachel L; Lynam, Donald R; Widiger, Thomas A; Markon, Kristian E; Krueger, Robert F; Miller, Joshua D

    2015-12-01

    The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) includes an alternative model of personality disorders (PDs) in Section III, consisting in part of a pathological personality trait model. To date, the 220-item Personality Inventory for DSM-5 (PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2012) is the only extant self-report instrument explicitly developed to measure this pathological trait model. The present study used item response theory-based analyses in a large sample (n = 1,417) to investigate whether a reduced set of 100 items could be identified from the PID-5 that could measure the 25 traits and 5 domains. This reduced set of PID-5 items was then tested in a community sample of adults currently receiving psychological treatment (n = 109). Across a wide range of criterion variables including NEO PI-R domains and facets, DSM-5 Section II PD scores, and externalizing and internalizing outcomes, the correlational profiles of the original and reduced versions of the PID-5 were nearly identical (rICC = .995). These results provide strong support for the hypothesis that an abbreviated set of PID-5 items can be used to reliably, validly, and efficiently assess these personality disorder traits. The ability to assess the DSM-5 Section III traits using only 100 items has important implications in that it suggests these traits could still be measured in settings in which assessment-related resources (e.g., time, compensation) are limited.

  4. Item Response Theory Analysis of Two Questionnaire Measures of Arthritis-Related Self-Efficacy Beliefs from Community-Based US Samples

    Directory of Open Access Journals (Sweden)

    Thelma J. Mielenz

    2010-01-01

    Using item response theory (IRT), we examined the Rheumatoid Arthritis Self-efficacy scale (RASE) collected from a People with Arthritis Can Exercise RCT (346 participants) and 2 subscales of the Arthritis Self-efficacy scale (ASE) collected from an Active Living Every Day (ALED) RCT (354 participants) to determine which one better identifies low arthritis self-efficacy in community-based adults with arthritis. The item parameters were estimated in Multilog using the graded response model. The 2 ASE subscales are adequately explained by one factor. There was evidence for 2 locally dependent item pairs; two items from these pairs were removed when we reran the model. The exploratory factor analysis results for RASE showed a multifactor solution which led to a 9-factor solution. In order to perform IRT analysis, one item from each of the 9 subfactors was selected. Both scales were effective at measuring a range of arthritis SE.

  5. Development of the Knee Quality of Life (KQoL-26) 26-item questionnaire: data quality, reliability, validity and responsiveness

    Directory of Open Access Journals (Sweden)

    Atwell Chris

    2008-07-01

    Background: This article describes the development and validation of a self-reported questionnaire, the KQoL-26, that is based on the views of patients with a suspected ligamentous or meniscal injury of the knee and that assesses the impact of their knee problem on the quality of their lives. Methods: Patient interviews and focus groups were used to derive questionnaire content. The instrument was assessed for data quality, reliability, validity, and responsiveness using data from a randomised trial and patient survey about general practitioners' use of Magnetic Resonance Imaging for patients with a suspected ligamentous or meniscal injury. Results: Interview and focus group data produced a 40-item questionnaire designed for self-completion. 559 trial patients and 323 survey patients responded to the questionnaire. Following principal components analysis and Rasch analysis, 26 items were found to contribute to three scales of knee-related quality of life: physical functioning, activity limitations, and emotional functioning. Item-total correlations ranged from 0.60–0.82. Cronbach's alpha and test-retest reliability estimates were 0.91–0.94 and 0.80–0.93 respectively. Hypothesised correlations with the Lysholm Knee Scale, EQ-5D, SF-36 and knee symptom questions were evidence for construct validity. The instrument produced highly significant change scores for 65 trial patients indicating that their knee was a little or somewhat better at six months. The new instrument had higher effect sizes (range 0.86–1.13) and responsiveness statistics (range 1.50–2.13) than the EQ-5D and SF-36. Conclusion: The KQoL-26 has good evidence for internal reliability, test-retest reliability, validity and responsiveness, and is recommended for use in randomised trials and other evaluative studies of patients with a suspected ligamentous or meniscal injury.

  6. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  7. What range of trait levels can the Autism-Spectrum Quotient (AQ) measure reliably? An item response theory analysis.

    Science.gov (United States)

    Murray, Aja Louise; Booth, Tom; McKenzie, Karen; Kuenssberg, Renate

    2016-06-01

    It has previously been noted that inventories measuring traits that originated in a psychopathological paradigm can often reliably measure only a very narrow range of trait levels that are near and above clinical cutoffs. Much recent work has, however, suggested that autism spectrum disorder traits are on a continuum of severity that extends well into the nonclinical range. This implies a need for inventories that can capture individual differences in autistic traits from very high levels all the way to the opposite end of the continuum. The Autism-Spectrum Quotient (AQ) was developed based on a closely related rationale, but there has, to date, been no direct test of the range of trait levels that the AQ can reliably measure. To assess this, we fit a bifactor item response theory model to the AQ. Results suggested that AQ measures moderately low to moderately high levels of a general autistic trait with good measurement precision. The reliable range of measurement was significantly improved by scoring the instrument using its 4-point response scale, rather than dichotomizing responses. These results support the use of the AQ in nonclinical samples, but suggest that items measuring very low and very high levels of autistic traits would be beneficial additions to the inventory.

  8. Factors affecting study efficiency and item non-response in health surveys in developing countries: the Jamaica national healthy lifestyle survey

    Directory of Open Access Journals (Sweden)

    Bennett Franklyn

    2007-02-01

    Background: Health surveys provide important information on the burden and secular trends of risk factors and disease. Several factors including survey and item non-response can affect data quality. There are few reports on efficiency, validity and the impact of item non-response from developing countries. This report examines factors associated with item non-response and study efficiency in a national health survey in a developing Caribbean island. Methods: A national sample of participants aged 15–74 years was selected in a multi-stage sampling design accounting for 4 health regions and 14 parishes using enumeration districts as primary sampling units. Means and proportions of the variables of interest were compared between various categories. Non-response was defined as failure to provide an analyzable response. Linear and logistic regression models accounting for sample design and post-stratification weighting were used to identify independent correlates of recruitment efficiency and item non-response. Results: We recruited 2012 15–74 year-olds (66.2% females) at a response rate of 87.6% with significant variation between regions (80.9% to 97.6%; p […] Conclusion: Informative health surveys are possible in developing countries. While survey response rates may be satisfactory, item non-response was high in respect of income and sexual practice. In contrast to developed countries, non-response to questions on income is higher and has different correlates. These findings can inform future surveys.

  9. Differential sensitivity theory applied to movement of maxima responses. [LMFBR]

    Energy Technology Data Exchange (ETDEWEB)

    Maudlin, P.J.; Parks, C.V.; Cacuci, D.G.

    1981-01-01

    Differential sensitivity theory (DST) is a recently developed methodology to evaluate response derivatives dR/dα by using adjoint functions which correspond to the differentiated (with respect to an arbitrary parameter α) linear or nonlinear physical system of equations. However, for many problems, where responses of importance are local maxima such as peak temperature, power, or heat flux, changes in the phase space location of the peak itself are of interest. This summary will present the DST procedure for predicting phase space shifts of maxima responses as applied to the MELT-III fast reactor safety code. An FFTF protected transient involving a $.23/s ramp reactivity insertion with scram on high power was selected for investigation.

  10. The Effect of Response Format on the Psychometric Properties of the Narcissistic Personality Inventory: Consequences for Item Meaning and Factor Structure.

    Science.gov (United States)

    Ackerman, Robert A; Donnellan, M Brent; Roberts, Brent W; Fraley, R Chris

    2016-04-01

    The Narcissistic Personality Inventory (NPI) is currently the most widely used measure of narcissism in social/personality psychology. It is also relatively unique because it uses a forced-choice response format. We investigate the consequences of changing the NPI's response format for item meaning and factor structure. Participants were randomly assigned to one of three conditions: 40 forced-choice items (n = 2,754), 80 single-stimulus dichotomous items (i.e., separate true/false responses for each item; n = 2,275), or 80 single-stimulus rating scale items (i.e., 5-point Likert-type response scales for each item; n = 2,156). Analyses suggested that the "narcissistic" and "nonnarcissistic" response options from the Entitlement and Superiority subscales refer to independent personality dimensions rather than high and low levels of the same attribute. In addition, factor analyses revealed that although the Leadership dimension was evident across formats, dimensions with entitlement and superiority were not as robust. Implications for continued use of the NPI are discussed.

  11. Magnetic response to applied electrostatic field in external magnetic field

    Energy Technology Data Exchange (ETDEWEB)

    Adorno, T.C. [Universidade de Sao Paulo, Instituto de Fisica, Caixa Postal 66318, Sao Paulo, SP (Brazil); University of Florida, Department of Physics, Gainesville, FL (United States); Gitman, D.M. [Universidade de Sao Paulo, Instituto de Fisica, Caixa Postal 66318, Sao Paulo, SP (Brazil); Tomsk State University, Department of Physics, Tomsk (Russian Federation); Shabad, A.E. [P. N. Lebedev Physics Institute, Moscow (Russian Federation)

    2014-04-15

    We show, within QED and other possible nonlinear theories, that a static charge localized in a finite domain of space becomes a magnetic dipole, if it is placed in an external (constant and homogeneous) magnetic field in the vacuum. The magnetic moment is quadratic in the charge, depends on its size and is parallel to the external field, provided the charge distribution is at least cylindrically symmetric. This magneto-electric effect is a nonlinear response of the magnetized vacuum to an applied electrostatic field. Referring to the simple example of a spherically symmetric applied field, the nonlinearly induced current and its magnetic field are found explicitly throughout the space; the pattern of the lines of force is depicted, both inside and outside the charge, which resembles that of a standard solenoid of classical magnetostatics. (orig.)

  12. Magnetic response to applied electrostatic field in external magnetic field

    CERN Document Server

    Adorno, T C; Shabad, A E

    2014-01-01

    We show, within QED and other possible nonlinear theories, that a static charge localized in a finite domain of space becomes a magnetic dipole, if it is placed in an external (constant and homogeneous) magnetic field in the vacuum. The magnetic moment is quadratic in the charge, depends on its size and is parallel to the external field, provided the charge distribution is at least cylindrically symmetric. This magneto-electric effect is a nonlinear response of the magnetized vacuum to an applied electrostatic field. Referring to a simple example of a spherically-symmetric applied field, the nonlinearly induced current and its magnetic field are found explicitly throughout the space, the pattern of lines of force is depicted, both inside and outside the charge, which resembles that of a standard solenoid of classical magnetostatics.

  13. Multilevel Modeling of Item Position Effects

    Science.gov (United States)

    Albano, Anthony D.

    2013-01-01

    In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…

  14. An autoregressive growth model for longitudinal item analysis.

    Science.gov (United States)

    Jeon, Minjeong; Rabe-Hesketh, Sophia

    2016-09-01

    A first-order autoregressive growth model is proposed for longitudinal binary item analysis where responses to the same items are conditionally dependent across time given the latent traits. Specifically, the item response probability for a given item at a given time depends on the latent trait as well as the response to the same item at the previous time, or the lagged response. An initial conditions problem arises because there is no lagged response at the initial time period. We handle this problem by adapting solutions proposed for dynamic models in panel data econometrics. Asymptotic and finite sample power for the autoregressive parameters are investigated. The consequences of ignoring local dependence and the initial conditions problem are also examined for data simulated from a first-order autoregressive growth model. The proposed methods are applied to longitudinal data on Korean students' self-esteem.
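
    The model structure described in this abstract can be illustrated with a rough sketch in which the response probability depends on the latent trait and on the lagged response. The logistic link, the function name, and the parameter names (`theta`, `beta`, `gamma`) and values are assumptions for illustration only; the abstract does not state the authors' parameterization or how they handle the initial time period.

```python
import math

def lagged_response_prob(theta, beta, gamma, prev_response):
    """P(y_t = 1) under a hypothetical logistic model with a lagged response.

    theta: latent trait at the current time (assumed scale)
    beta: item difficulty (assumed)
    gamma: autoregressive effect of the previous response (assumed)
    prev_response: response to the same item at the previous time (0 or 1)
    """
    logit = theta - beta + gamma * prev_response
    return 1.0 / (1.0 + math.exp(-logit))

# Illustrative values only: same person and item, differing in the lagged response.
p_after_incorrect = lagged_response_prob(0.0, 0.0, 0.8, 0)
p_after_correct = lagged_response_prob(0.0, 0.0, 0.8, 1)
```

    With a positive `gamma`, a correct lagged response raises the probability of a correct current response, which is the kind of conditional dependence across time that the abstract refers to.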

  15. A new look at the psychometrics of the parenting scale through the lens of item response theory.

    Science.gov (United States)

    Lorber, Michael F; Xu, Shu; Slep, Amy M Smith; Bulling, Lisanne; O'Leary, Susan G

    2014-01-01

    The psychometrics of the Parenting Scale's Overreactivity and Laxness subscales were evaluated using item response theory (IRT) techniques. The IRT analyses were based on 2 community samples of cohabiting parents of 3- to 8-year-old children, combined to yield a total sample size of 852 families. The results supported the utility of the Overreactivity and Laxness subscales, particularly in discriminating among parents in the mid to upper reaches of each construct. The original versions of the Overreactivity and Laxness subscales were more reliable than alternative, shorter versions identified in replicated factor analyses from previously published research and in IRT analyses in the present research. Moreover, in several cases, the original versions of these subscales, in comparison with the shortened versions, exhibited greater 6-month stabilities and correlations with child externalizing behavior and couple relationship satisfaction. Reliability was greater for the Laxness than for the Overreactivity subscale. Item performance on each subscale was highly variable. Together, the present findings are generally supportive of the psychometrics of the Parenting Scale, particularly for clinical research and practice. They also suggest areas for further development.

  16. Performance of Accounting students on the Enade/2012 test: an application of the Item-Response Theory

    Directory of Open Access Journals (Sweden)

    Raphael Vinicius Weigert Camargo

    2016-08-01

    The objective in this study was to measure Accounting students’ performance (proficiency) on the Enade test using the Item Response Theory (IRT). The students’ performance was measured using the three-parameter logistic model (3PL), based on data related to the Enade test/2012, taken from the website of the National Institute for Educational Studies and Research Anísio Teixeira (Inep), concerning 47,098 students. Through the scale, three levels of student performance could be distinguished. Level 1 students master the reading and interpretation of texts and quantitative reasoning. In addition, Level 2 students should present logical reasoning and systemic and holistic perspective. Furthermore, at Level 3, students should present interdisciplinary knowledge, covering accounting contents, critical-analytic skills and practical application of the content mastered. The results also indicated that the items of the Enade test were very difficult for the group that took the test. Independently of the student characteristics analyzed, overall, the proficiency scores were very low. This result suggests that the HEIs need to take action and that public policies are needed that can contribute to improving the students’ performance.
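
    For readers unfamiliar with the 3PL model named in this abstract, the sketch below evaluates its standard item characteristic curve, P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), where a is discrimination, b is difficulty and c is the pseudo-guessing parameter. The function name and the parameter values are illustrative assumptions, not estimates from the Enade data.

```python
import math

def three_pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Illustrative (assumed) item: fairly difficult, with a guessing floor of 0.2.
p_at_difficulty = three_pl(theta=1.0, a=1.5, b=1.0, c=0.2)
p_low_ability = three_pl(theta=-50.0, a=1.5, b=1.0, c=0.2)
```

    When proficiency equals the item difficulty, the probability sits halfway between the guessing floor and 1; for very low proficiency it approaches the guessing floor rather than zero.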

  17. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  18. An overview of the Normal Ogive Harmonic Analysis Robust Method (NOHARM) approach to item response theory

    Directory of Open Access Journals (Sweden)

    Lee, J. J.

    2016-01-01

    Here we provide a description of the IRT estimation method known as Normal Ogive Harmonic Analysis Robust Method (NOHARM). Although in some ways this method has been superseded by new computer programs that also adopt a specifically factor-analytic approach, its fundamental principles remain useful in certain applications, which include calculating the residual covariance matrix and rescaling the distribution of the common factor (latent trait). These principles can be applied to parameter estimates obtained by any method.

  19. The Elements of Item Response Theory and its Framework in Analyzing Introductory Astronomy College Student Misconceptions. I. Galaxies

    CERN Document Server

    Favia, Andrej; Thorpe, Geoffrey L

    2013-01-01

    This is the first in a series of papers that analyze college student beliefs in realms where common astronomy misconceptions are prevalent. Data was collected through administration of an inventory distributed at the end of an introductory college astronomy course. In this paper, we present the basic mathematics of item response theory (IRT), and then we use it to explore concepts related to galaxies. We show how IRT determines the difficulty of each galaxy topic under consideration. We find that the concept of galaxy spatial distribution presents the greatest challenge to students of all the galaxy topics. We also find and present the most logical sequence to teach galaxy topics as a function of the audience's age.

  20. The Children's Behavior Questionnaire very short scale: psychometric properties and development of a one-item temperament scale.

    Science.gov (United States)

    Sleddens, Ester F C; Hughes, Sheryl O; O'Connor, Teresia M; Beltran, Alicia; Baranowski, Janice C; Nicklas, Theresa A; Baranowski, Tom

    2012-02-01

    Little research has been conducted on the psychometrics of the very short scale (36 items) of the Children's Behavior Questionnaire, and no one-item temperament scale has been tested for use in applied work. In this study, 237 United States caregivers completed a survey to define their child's behavioral patterns (i.e., Surgency, Negative Affectivity, Effortful Control) using both scales. Psychometrics of the 36-item Children's Behavior Questionnaire were examined using classical test theory, principal factor analysis, and item response modeling. Classical test theory analysis demonstrated adequate internal consistency and factor analysis confirmed a three-factor structure. Potential improvements to the measure were identified using item response modeling. A one-item (three response categories) temperament scale was validated against the three temperament factors of the 36-item scale. The temperament response categories correlated with the temperament factors of the 36-item scale, as expected. The one-item temperament scale may be applicable for clinical use.

  1. Maximizing measurement efficiency of behavior rating scales using Item Response Theory: An example with the Social Skills Improvement System - Teacher Rating Scale.

    Science.gov (United States)

    Anthony, Christopher J; DiPerna, James C; Lei, Pui-Wa

    2016-04-01

    Measurement efficiency is an important consideration when developing behavior rating scales for use in research and practice. Although most published scales have been developed within a Classical Test Theory (CTT) framework, Item Response Theory (IRT) offers several advantages for developing scales that maximize measurement efficiency. The current study provides an example of using IRT to maximize rating scale efficiency with the Social Skills Improvement System - Teacher Rating Scale (SSIS - TRS), a measure of student social skills frequently used in practice and research. Based on IRT analyses, 27 items from the Social Skills subscales and 14 items from the Problem Behavior subscales of the SSIS - TRS were identified as maximally efficient. In addition to maintaining similar content coverage to the published version, these sets of maximally efficient items demonstrated similar psychometric properties to the published SSIS - TRS.

  2. RATING CREATION FOR PROFESSIONAL EDUCATIONAL ORGANIZATIONS BASED ON THE ITEM RESPONSE THEORY

    Directory of Open Access Journals (Sweden)

    N. E. Erganova

    2016-01-01

    The aim of the investigation is to theoretically justify and describe the testing of a method for measuring the level of provision of educational services, the quality of education, and the rating of vocational educational organizations. Methods. The methodology of the research is based on the systems approach, on research into the schematization and modeling of pedagogical objects, and on the theory of measurement of latent variables. The main research methods are analysis, synthesis, comparative analysis, and statistical processing of the research results. Results. The paper gives a short comparative analysis of the potential of the qualitative approach and the strengths of the theory of latent variables in evaluating the quality of education and ratings of the investigated object. A technique for measuring the level of provision of educational services when creating a rating of professional educational organizations is presented. Scientific novelty. The pedagogical possibilities of the theory of measurement of latent variables are investigated; the principles of creating ratings of professional educational organizations are outlined. Practical significance. An operational construct of the latent variable «quality of education» for secondary professional education (SPE), tested in the Perm Territory, has been developed; it can serve as a basis for forming similar constructs for rating professional educational organizations in other regions.

  3. An item response theory analysis of DSM-IV diagnostic criteria for personality disorders: findings from the national epidemiologic survey on alcohol and related conditions.

    Science.gov (United States)

    Harford, Thomas C; Chen, Chiung M; Saha, Tulshi D; Smith, Sharon M; Hasin, Deborah S; Grant, Bridget F

    2013-01-01

    The purpose of this study was to evaluate the psychometric properties of DSM-IV symptom criteria for assessing personality disorders (PDs) in a national population and to compare variations in proposed symptom coding for social and/or occupational dysfunction. Data were obtained from a total sample of 34,653 respondents from Waves 1 and 2 of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). For each personality disorder, confirmatory factor analysis (CFA) established a 1-factor latent factor structure for the respective symptom criteria. A 2-parameter item response theory (IRT) model was applied to the symptom criteria for each PD to assess the probabilities of symptom item endorsements across different values of the underlying trait (latent factor). Findings were compared with a separate IRT model using an alternative coding of symptom criteria that requires distress/impairment to be related to each criterion. The CFAs yielded a good fit for a single underlying latent dimension for each PD. Findings from the IRT indicated that DSM-IV PD symptom criteria are clustered in the moderate to severe range of the underlying latent dimension for each PD and are peaked, indicating high measurement precision only within a narrow range of the underlying trait and lower measurement precision at lower and higher levels of severity. Compared with the NESARC symptom coding, the IRT results for the alternative symptom coding are shifted toward the more severe range of the latent trait but generally have lower measurement precision for each PD. The IRT findings provide support for a reliable assessment of each PD for both NESARC and alternative coding for distress/impairment. The use of symptom dysfunction for each criterion, however, raises a number of issues and implications for the DSM-5 revision currently proposed for Axis II disorders (American Psychiatric Association, 2010).
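
    The "peaked" measurement precision described in this abstract can be illustrated with the standard item information function of the two-parameter logistic model, I(θ) = a² P(θ)(1 − P(θ)). The sketch below uses assumed, illustrative parameter values (not NESARC estimates) for a highly discriminating criterion located in the severe range.

```python
import math

def two_pl(theta, a, b):
    """Endorsement probability under the two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = two_pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Assumed values: discrimination a = 2.0, severity (location) b = 1.0.
info_at_location = item_information(1.0, 2.0, 1.0)
info_far_above = item_information(3.0, 2.0, 1.0)
info_far_below = item_information(-1.0, 2.0, 1.0)
```

    Information is highest where the trait level equals the item location (a²/4) and drops off quickly on both sides, so criteria clustered in one region of the trait yield precise measurement only within that narrow range.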

  4. Developmental changes in reading do not alter the development of visual processing skills: An application of explanatory item response models in grades K-2

    Directory of Open Access Journals (Sweden)

    Kristi L Santi

    2015-02-01

    Visual processing has been widely studied in regard to its impact on a student’s ability to read. A less researched area is the role of reading in the development of visual processing skills. A cohort-sequential, accelerated-longitudinal design was utilized with 932 kindergarten, first, and second grade students to examine the impact of reading acquisition on the processing of various types of visual discrimination and visual motor test items. Students were assessed four times per year on a variety of reading measures and reading precursors and two popular measures of visual processing over a three-year period. Explanatory item response models were used to examine the roles of person and item characteristics on changes in visual processing abilities and changes in item difficulties over time. Results showed different developmental patterns for five types of visual processing test items, but most importantly failed to show consistent effects of learning to read on changes in item difficulty. Thus, the present study failed to find support for the hypothesis that learning to read alters performance on measures of visual processing. Rather, visual processing and reading ability improved together over time with no evidence to suggest cross-domain influences from reading to visual processing. Results are discussed in the context of developmental theories of visual processing and brain-based research on the role of visual skills in learning to read.

  5. Immediate List Recall as a Measure of Short-Term Episodic Memory: Insights from the Serial Position Effect and Item Response Theory

    Science.gov (United States)

    Gavett, Brandon E.; Horwitz, Julie E.

    2012-01-01

    The serial position effect shows that two interrelated cognitive processes underlie immediate recall of a supraspan word list. The current study used item response theory (IRT) methods to determine whether the serial position effect poses a threat to the construct validity of immediate list recall as a measure of verbal episodic memory. Archival data were obtained from a national sample of 4,212 volunteers aged 28–84 in the Midlife Development in the United States study. Telephone assessment yielded item-level data for a single immediate recall trial of the Rey Auditory Verbal Learning Test (RAVLT). Two parameter logistic IRT procedures were used to estimate item parameters and the Q1 statistic was used to evaluate item fit. A two-dimensional model better fit the data than a unidimensional model, supporting the notion that list recall is influenced by two underlying cognitive processes. IRT analyses revealed that 4 of the 15 RAVLT items (1, 12, 14, and 15) were misfit (p < .05). Item characteristic curves for items 14 and 15 decreased monotonically, implying an inverse relationship between the ability level and the probability of recall. Elimination of the four misfit items provided better fit to the data and met necessary IRT assumptions. Performance on a supraspan list learning test is influenced by multiple cognitive abilities; failure to account for the serial position of words decreases the construct validity of the test as a measure of episodic memory and may provide misleading results. IRT methods can ameliorate these problems and improve construct validity. PMID:22138320

  6. Desenvolvimento de uma escala para medir o potencial empreendedor utilizando a Teoria da Resposta ao Item (TRI) / Development of a scale to measure the entrepreneurial potential using the Item Response Theory (IRT)

    Directory of Open Access Journals (Sweden)

    Luciano Ricardo Rath Alves

    2011-01-01

    Several variables are related to the development of entrepreneurial activity; an important one among them is the entrepreneurial agent. Among the studies that contribute to understanding the entrepreneurial agent, this one follows the line of thought holding that the entrepreneur has characteristics and personality traits that stand out from the general population and that favor entrepreneurial success. The study aims to develop a scale for measuring entrepreneurial potential using Item Response Theory (IRT). The two-parameter logistic IRT model was used. Parameter estimates were obtained from a sample of 764 people who responded to an instrument composed of 103 items. The test information and standard error curves, together with a qualitative interpretation of the scale levels, made it possible to determine the most appropriate range for using the instrument. The results showed that the scale is best suited to assessing individuals with low to moderately high entrepreneurial potential. It is therefore suggested that new items be added to the instrument in order to measure and interpret even higher levels. IRT allows new items to be calibrated to measure entrepreneurs with high entrepreneurial potential while making use of the data already obtained. The items were generated by Santos (2008) based on a theoretical model…

  7. A Practical Guide to Check the Consistency of Item Response Patterns in Clinical Research Through Person-Fit Statistics: Examples and a Computer Program.

    Science.gov (United States)

    Meijer, Rob R; Niessen, A Susan M; Tendeiro, Jorge N

    2016-02-01

    Although there are many studies devoted to person-fit statistics to detect inconsistent item score patterns, most studies are difficult to understand for nonspecialists. The aim of this tutorial is to explain the principles of these statistics for researchers and clinicians who are interested in applying these statistics. In particular, we first explain how invalid test scores can be detected using person-fit statistics; second, we provide the reader practical examples of existing studies that used person-fit statistics to detect and to interpret inconsistent item score patterns; and third, we discuss a new R-package that can be used to identify and interpret inconsistent score patterns.
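
    To make the idea of a person-fit statistic concrete, the sketch below computes the standardized log-likelihood statistic (often written l_z) for dichotomous responses. The abstract does not say which statistics the tutorial covers, so this choice is an assumption, and the response patterns and probabilities are hypothetical.

```python
import math

def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic for 0/1 responses.

    probs[i] is the model probability of a 1 on item i at the person's
    estimated trait level (hypothetical values in this sketch).
    """
    loglik = expected = variance = 0.0
    for u, p in zip(responses, probs):
        q = 1.0 - p
        loglik += u * math.log(p) + (1 - u) * math.log(q)    # observed log-likelihood
        expected += p * math.log(p) + q * math.log(q)        # its expectation
        variance += p * q * math.log(p / q) ** 2             # its variance
    return (loglik - expected) / math.sqrt(variance)

probs = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]   # items ordered from easy to hard

# Passes the easy items and fails the hard ones: consistent with the model.
consistent = lz_statistic([1, 1, 1, 0, 0, 0], probs)

# Fails the easy items and passes the hard ones: an inconsistent pattern.
aberrant = lz_statistic([0, 0, 0, 1, 1, 1], probs)
```

    Large negative values flag response patterns that are unlikely under the model, such as the second pattern here.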

  8. Understanding and quantifying cognitive complexity level in mathematical problem solving items

    Directory of Open Access Journals (Sweden)

    SUSAN E. EMBRETSON

    2008-09-01

    The linear logistic test model (LLTM; Fischer, 1973) has been applied to a wide variety of new tests. When the LLTM application involves item complexity variables that are both theoretically interesting and empirically supported, several advantages can result. These advantages include elaborating construct validity at the item level, defining variables for test design, predicting parameters of new items, item banking by sources of complexity and providing a basis for item design and item generation. However, despite the many advantages of applying LLTM to test items, it has been applied less often to understand the sources of complexity for large-scale operational test items. Instead, previously calibrated item parameters are modeled using regression techniques because raw item response data often cannot be made available. In the current study, both LLTM and regression modeling are applied to mathematical problem solving items from a widely used test. The findings from the two methods are compared and contrasted for their implications for continued development of ability and achievement tests based on mathematical problem solving items.
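
    The core idea of the LLTM, that an item's difficulty is a weighted sum of its complexity features, can be sketched in a few lines. The feature names, weights, and intercept below are hypothetical and are not taken from the study.

```python
import math

def lltm_difficulty(q_row, eta, intercept=0.0):
    """Item difficulty as a weighted sum of complexity features (LLTM)."""
    return intercept + sum(q * e for q, e in zip(q_row, eta))

def rasch_probability(theta, difficulty):
    """Rasch probability of a correct response given ability and difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Hypothetical features per item: [number of equations, needs unit conversion]
eta = [0.5, 0.8]                      # hypothetical feature weights
q_matrix = [[1, 0], [2, 0], [2, 1]]   # one row of feature scores per item

difficulties = [lltm_difficulty(row, eta, intercept=-1.0) for row in q_matrix]
p_correct = [rasch_probability(0.0, d) for d in difficulties]
```

    Once the weights are estimated, the difficulty of a new item can be predicted from its feature row alone, which is what makes the model useful for item design and item generation.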

  9. An Item Response Theory-Based, Computerized Adaptive Testing Version of the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI:WS)

    Science.gov (United States)

    Makransky, Guido; Dale, Philip S.; Havmose, Philip; Bleses, Dorthe

    2016-01-01

    Purpose: This study investigated the feasibility and potential validity of an item response theory (IRT)-based computerized adaptive testing (CAT) version of the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI:WS; Fenson et al., 2007) vocabulary checklist, with the objective of reducing length while maintaining…

  10. An item response theory analysis of Harter's Self-Perception Profile for Children or why strong clinical scales should be distrusted

    NARCIS (Netherlands)

    Egberink, Iris J. L.; Meijer, Rob R.

    2011-01-01

    The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343) w

  11. Investigating the Population Sensitivity Assumption of Item Response Theory True-Score Equating across Two Subgroups of Examinees and Two Test Formats

    Science.gov (United States)

    von Davier, Alina A.; Wilson, Christine

    2008-01-01

    Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…

  12. Improving the Reliability of Student Scores from Speeded Assessments: An Illustration of Conditional Item Response Theory Using a Computer-Administered Measure of Vocabulary

    Science.gov (United States)

    Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R.

    2015-01-01

    A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is…

  13. Item Response Theory. Research Report. ETS RR-13-28. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-05

    Science.gov (United States)

    Carlson, James E.; von Davier, Matthias

    2013-01-01

    Few would doubt that ETS researchers have contributed more to the general topic of item response theory (IRT) than individuals from any other institution. In this report, we briefly review most of those contributions, dividing them into sections by decades of publication, beginning with early work by Fred Lord and Bert Green in the 1950s and…

  14. Innovative application of a multidimensional item response model in assessing the influence of social desirability on the pseudo-relationship between self-efficacy and behavior

    Science.gov (United States)

    This study examined multidimensional item response theory (MIRT) modeling to assess social desirability (SocD) influences on self-reported physical activity self-efficacy (PASE) and fruit and vegetable self-efficacy (FVSE). The observed sample included 473 Houston-area adolescent males (10–14 years)...

  15. A General Program for Item-Response Analysis That Employs the Stabilized Newton-Raphson Algorithm. Research Report. ETS RR-13-32

    Science.gov (United States)

    Haberman, Shelby J.

    2013-01-01

    A general program for item-response analysis is described that uses the stabilized Newton-Raphson algorithm. This program is written to be compliant with Fortran 2003 standards and is sufficiently general to handle independent variables, multidimensional ability parameters, and matrix sampling. The ability variables may be either polytomous or…

  16. How to compare scores from different depression scales: equating the Patient Health Questionnaire (PHQ) and the ICD-10-Symptom Rating (ISR) using Item Response Theory.

    Science.gov (United States)

    Fischer, H Felix; Tritt, Karin; Klapp, Burghard F; Fliege, Herbert

    2011-12-01

    A wide range of questionnaires for measuring depression are available. Item Response Theory models can help to evaluate the questionnaires exceeding the boundaries of Classical Test Theory and provide an opportunity to equate the questionnaires. In this study after checking for unidimensionality, a General Partial Credit Model was applied to data from two different depression scales [Patient Health Questionnaire (PHQ-9) and ICD-10-Symptom Rating (ISR)] obtained in clinical settings from a consecutive sample, including 4517 observations from a total of 2999 inpatients and outpatients of a psychosomatic clinic. The precision of each questionnaire was compared and the model was used to transform scores based on the assumed underlying latent trait. Both instruments were constructed to measure the same construct and their estimates of depression severity are highly correlated. Our analysis showed that the predicted scores provided by the conversion tables are similar to the observed scores in a validation sample. The PHQ-9 and ISR depression scales measure depression severity across a broad range with similar precision. While the PHQ-9 shows advantages in measuring low or high depression severity, the ISR is more parsimonious and also suitable for clinical purposes. Furthermore, the equation tables derived in this study enhance the comparability of studies using either one of the instruments, but due to substantial statistical spread the comparison of individual scores is imprecise.

  17. Multiple-choice versus open-ended response formats of reading test items: A two-dimensional IRT analysis

    Directory of Open Access Journals (Sweden)

    Dominique P. Rauch

    2010-12-01

    The dimensionality of a reading comprehension assessment with non-stem-equivalent multiple-choice (MC) items and open-ended (OE) items was analyzed with German test data from 8523 9th-graders. We found that a two-dimensional IRT model with within-item multidimensionality, where MC and OE items load on a general latent dimension and OE items additionally load on a nested latent dimension, had a superior fit compared to a unidimensional model (p ≤ .05). Correlations between general cognitive abilities, orthography and vocabulary and the general latent dimension were significantly higher than with the nested latent dimension (p ≤ .05). Drawing on experimental studies on the effect of item format on reading processes, we suppose that the general latent dimension measures abilities necessary to master basic reading processes and the nested latent dimension captures abilities necessary to master higher reading processes. Including gender, language spoken at home, and school track as predictors in latent regression models showed that the well-known advantage of girls and mother-tongue students is found only for the nested latent dimension.

  18. Dynamic and Comprehensive Item Selection Strategies for Computerized Adaptive Testing Based on Graded Response Model

    Institute of Scientific and Technical Information of China (English)

    罗芬; 丁树良; 王晓庆

    2012-01-01

    Item selection strategy (ISS) is a core component of Computerized Adaptive Testing (CAT). Polytomous items can provide more information about an examinee than dichotomous items, and adopting polytomously scored items in tests is a research direction of CAT. The most widely used ISS is the maximum Fisher information (MFI) criterion, which raises concerns about the cost-efficiency of pool utilization and poses security risks for CAT programs. Chang & Ying (1999) and Chang, Qian, & Ying (2001) proposed two alternative item selection procedures based on dichotomous models, the a-stratified method (a-STR) and the a-stratified with b blocking method (b-STR), with the goal of remedying the item overexposure and underexposure produced by MFI. However, a-STR and b-STR are static, because the items are stratified according to the information given at the beginning of the test. Based on the graded response model (GRM), a technique for reducing the dimensionality of the difficulty (or step) parameters was recently employed to construct some ISSs. The limitation of this dimension reduction technique is that it loses a lot of information. Thus, in order to improve on MFI, two new item selection methods based on the GRM are proposed: (1) the dimension reduction technique for the difficulty (or step) parameters is modified by integrating interval estimation; (2) dynamic a-STR and dynamic b-STR methods are implemented in the testing process. On one hand, these new ISSs can avoid and remedy the limitations of MFI and make good use of the advantages of the Fisher information function (FIF); the FIF compresses all item parameters and ability parameters, so it is by nature a comprehensive tool for all parameters. On the other hand, the new ISSs employ the property that the FIF represents the inverse of the variance of the ability estimate: let £ be the square root of the reciprocal of the Fisher information, d be the absolute deviation between the
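
    The maximum Fisher information criterion that this abstract takes as its baseline can be sketched as below. For brevity the sketch assumes a dichotomous two-parameter logistic model rather than the graded response model used in the study, and the item pool is hypothetical.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_mfi(theta, pool, used):
    """Index of the unused item with maximum Fisher information at theta."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))

# Hypothetical pool of (a, b) pairs.
pool = [(0.6, 0.0), (1.8, 0.1), (1.0, -0.2), (1.7, 1.5)]

first = select_mfi(0.0, pool, used=set())
second = select_mfi(0.0, pool, used={first})
```

    Because information grows with the square of the discrimination `a`, the criterion keeps picking highly discriminating items located near the current ability estimate. This is the overexposure problem that stratifying the pool by `a` is meant to reduce.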

  19. "Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests": Correction to Fox, Berry, and Freeman (2014).

    Science.gov (United States)

    2016-08-01

    Reports an error in "Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests" by Mark C. Fox, Jane M. Berry and Sara P. Freeman (Psychology and Aging, 2014[Dec], Vol 29[4], 925-938). In the article, unneeded zeros were inadvertently included at the beginnings of some numbers in Tables 1–4. In addition, the right column in Table 4 includes three unnecessary zeros after asterisks. (The following abstract of the original article appeared in record 2014-49140-001.) Relatively high vocabulary scores of older adults are generally interpreted as evidence that older adults possess more of a common ability than younger adults. Yet, this interpretation rests on empirical assumptions about the uniformity of item-response functions between groups. In this article, we test item response models of differential responding against datasets containing younger-, middle-aged-, and older-adult responses to three popular vocabulary tests (the Shipley, Ekstrom, and WAIS–R) to determine whether members of different age groups who achieve the same scores have the same probability of responding in the same categories (e.g., correct vs. incorrect) under the same conditions. Contrary to the null hypothesis of measurement invariance, datasets for all three tests exhibit substantial differential responding. Members of different age groups who achieve the same overall scores exhibit differing response probabilities in relation to the same items (differential item functioning) and appear to approach the tests in qualitatively different ways that generalize across items. Specifically, younger adults are more likely than older adults to leave items unanswered for partial credit on the Ekstrom, and to produce 2-point definitions on the WAIS–R. Yet, older adults score higher than younger adults, consistent with most reports of vocabulary outcomes in the cognitive aging literature. In light of these findings, the most generalizable

  20. An Item Analysis of the EPQ Based on Item Response Theory

    Institute of Scientific and Technical Information of China (English)

    杨建原; 何壮; 赵守盈

    2012-01-01

    It has been 30 years since Eysenck's Personality Questionnaire (EPQ) was first introduced to China, and the latest norm was published in 2000. Because the questionnaire has been heavily exposed over the past 10 years, its applicability needs to be tested empirically again with present samples. The aim of this paper is to analyze the properties of the EPQ items under IRT, focusing on the measurement accuracy of the items. The program MULTILOG 7.03 was used for parameter estimation, with marginal maximum likelihood estimation and the two-parameter logistic model, applied to EPQ data from the 2011 freshman cohort of a university. The difficulty and discrimination parameters and the information curves were analyzed in detail for each item and each subscale. The results indicated that the data accorded with the basic assumptions of IRT: unidimensionality, monotonic increase, invariance of parameter estimates, etc. The difficulty and discrimination of most EPQ items met the theoretical requirements, which demonstrates that the revision of the questionnaire was quite successful. Nevertheless, the E, P, and N subscales provide limited information at the cut-off scores, which makes it difficult to discriminate well among respondents, and the total information of each of the three subscales falls short of the theoretical requirement; many erroneous judgments may therefore arise if the questionnaire is used to assess or intervene with a subject. As an essential complement to CTT for item analysis, IRT will be widely used in psychological and educational testing studies in the future.

  1. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Directory of Open Access Journals (Sweden)

    James A Wiley

    We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.

  2. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Science.gov (United States)

    Wiley, James A; Martin, John Levi; Herschkorn, Stephen J; Bond, Jason

    2015-01-01

    We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.

  3. Unidimensional Interpretations for Multidimensional Test Items

    Science.gov (United States)

    Kahraman, Nilufer

    2013-01-01

    This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item-level…

  4. The Next Big Steps for Munitions Response. Classification Applied to Munitions Response - Development

    Science.gov (United States)

    2011-11-01

    Performing organization: U.S. Army Engineering and Support Center, Huntsville (USAESCH), 4820 University Square, Huntsville, AL 35816 ... Munitions Response, Mr. Andrew Schwartz, U.S. Army Engineering and Support Center, Huntsville (USAESCH) ... remaining anomalies; map entire MRS; blind seed ISO; apply adjustments to procedures or decision boundaries, as appropriate, to improve performance

  5. Utilização da Teoria da Resposta ao Item (TRI) para a organização de um banco de itens destinados a avaliação do raciocínio verbal / Using the Item Response Theory (IRT) in the construction of an item bank for the evaluation of verbal reasoning

    Directory of Open Access Journals (Sweden)

    Wagner Bandeira Andriola

    1998-01-01

    The purpose of this research was to organize an item bank for the evaluation of verbal reasoning using Item Response Theory (IRT). From the responses of 730 high school students (mean age 17.7 years, SD = 3.12) to a set of 51 items in verbal-analogy format, item difficulty and discrimination were estimated using the two-parameter logistic model. The item characteristic curves (ICCs) were also determined.

  6. Developing a Numerical Ability Test for Students of Education in Jordan: An Application of Item Response Theory

    Science.gov (United States)

    Abed, Eman Rasmi; Al-Absi, Mohammad Mustafa; Abu shindi, Yousef Abdelqader

    2016-01-01

    The purpose of the present study is to develop a test that measures the numerical ability of students of education. The sample consisted of 504 students from 8 universities in Jordan. The final draft of the test contains 45 items distributed among 5 dimensions. The results revealed acceptable psychometric properties of the test;…

  7. Item response theory was used to shorten EORTC QLQ-C30 scales for use in palliative care

    NARCIS (Netherlands)

    M.A. Petersen; M. Groenvold; N. Aaronson; J. Blazeby; Y. Brandberg; A. de Graeff; P. Fayers; E. Hammerlid; M. Sprangers; G. Velikova; J.B. Bjorner

    2006-01-01

    Background and Objective: The goal was to develop a shortened version of the EORTC QLQ-C30 for use in palliative care. We wanted to keep as few items as possible in each scale while still being able to compare results with studies using the original scales. We examined the possibilities of shortenin

  8. The D-Optimality Item Selection Criterion in the Early Stage of CAT: A Study with the Graded Response Model

    Science.gov (United States)

    Passos, Valeria Lima; Berger, Martijn P. F.; Tan, Frans E. S.

    2008-01-01

    During the early stage of computerized adaptive testing (CAT), item selection criteria based on Fisher's information often produce less stable latent trait estimates than the Kullback-Leibler global information criterion. Robustness against early stage instability has been reported for the D-optimality criterion in a polytomous CAT with the…

  9. Detection of Differential Item Functioning Using the Lasso Approach

    Science.gov (United States)

    Magis, David; Tuerlinckx, Francis; De Boeck, Paul

    2015-01-01

    This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…

  10. Structural dynamic responses analysis applying differential quadrature method

    Institute of Scientific and Technical Information of China (English)

    PU Jun-ping; ZHENG Jian-jun

    2006-01-01

    Unconditionally stable, higher-order accurate time-step integration algorithms based on the differential quadrature method (DQM) were applied to second-order initial value problems, and the quadrature rules of the DQM, the computation of the weighting coefficients, and the choice of sampling grid points were discussed. Numerical examples were computed for a heat transfer problem, the second-order differential equations of forced vibration of linear single-degree-of-freedom and two-degree-of-freedom systems, a nonlinear differential equation of motion, and a beam under a changing load. The results indicated that the algorithm can produce highly accurate solutions with minimal time consumption, and that the total energy of the system remains conserved in the numerical computation.
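
    The weighting coefficients and grid points mentioned in this abstract can be illustrated with a short sketch. It uses the standard Lagrange-polynomial formula for first-derivative weights and Chebyshev-Gauss-Lobatto points; whether the paper uses exactly these choices is an assumption, and the function names are illustrative.

```python
import math

def dq_first_derivative_weights(x):
    """First-order differential quadrature weights for grid points x.

    Row i approximates f'(x[i]) as sum_j w[i][j] * f(x[j]).
    """
    n = len(x)
    # m[i] = product over k != i of (x[i] - x[k])
    m = [math.prod(x[i] - x[k] for k in range(n) if k != i) for i in range(n)]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = m[i] / ((x[i] - x[j]) * m[j])
        # Each row sums to zero, so the derivative of a constant is zero.
        w[i][i] = -sum(w[i][j] for j in range(n) if j != i)
    return w

def chebyshev_gauss_lobatto(n, a=0.0, b=1.0):
    """n grid points on [a, b], clustered near both ends."""
    return [a + (b - a) * 0.5 * (1.0 - math.cos(math.pi * k / (n - 1)))
            for k in range(n)]

x = chebyshev_gauss_lobatto(5)
w = dq_first_derivative_weights(x)
f = [t ** 3 for t in x]    # sample function f(t) = t^3
df = [sum(w[i][j] * f[j] for j in range(5)) for i in range(5)]
```

    With five grid points the weights differentiate polynomials up to degree four exactly, up to rounding, so `df` matches `3 * t**2` at every grid point.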

  11. Applying Bayesian belief networks in rapid response situations

    Energy Technology Data Exchange (ETDEWEB)

    Gibson, William L [Los Alamos National Laboratory]; Leishman, Deborah A. [Los Alamos National Laboratory]; Van Eeckhout, Edward [Los Alamos National Laboratory]

    2008-01-01

    The authors have developed an enhanced Bayesian analysis tool called the Integrated Knowledge Engine (IKE) for monitoring and surveillance. The enhancements are suited for Rapid Response Situations where decisions must be made based on uncertain and incomplete evidence from many diverse and heterogeneous sources. The enhancements extend the probabilistic results of the traditional Bayesian analysis by (1) better quantifying uncertainty arising from model parameter uncertainty and uncertain evidence, (2) optimizing the collection of evidence to reach conclusions more quickly, and (3) allowing the analyst to determine the influence of the remaining evidence that cannot be obtained in the time allowed. These extended features give the analyst and decision maker a better comprehension of the adequacy of the acquired evidence and hence the quality of the hurried decisions. They also describe two example systems where the above features are highlighted.

  12. Natural History of Dependency in the Elderly: A 24-Year Population-Based Study Using a Longitudinal Item Response Theory Model.

    Science.gov (United States)

    Edjolo, Arlette; Proust-Lima, Cécile; Delva, Fleur; Dartigues, Jean-François; Pérès, Karine

    2016-02-15

    We aimed to describe the hierarchical structure of Instrumental Activities of Daily Living (IADL) and basic Activities of Daily Living (ADL) and trajectories of dependency before death in an elderly population using item response theory methodology. Data were obtained from a population-based French cohort study, the Personnes Agées QUID (PAQUID) Study, of persons aged ≥65 years at baseline in 1988 who were recruited from 75 randomly selected areas in Gironde and Dordogne. We evaluated IADL and ADL data collected at home every 2-3 years over a 24-year period (1988-2012) for 3,238 deceased participants (43.9% men). We used a longitudinal item response theory model to investigate the item sequence of 11 IADL and ADL combined into a single scale and functional trajectories adjusted for education, sex, and age at death. The findings confirmed the earliest losses in IADL (shopping, transporting, finances) at the partial limitation level, and then an overlapping of concomitant IADL and ADL, with bathing and dressing being the earliest ADL losses, and finally total losses for toileting, continence, eating, and transferring. Functional trajectories were sex-specific, with a benefit of high education that persisted until death in men but was only transient in women. An in-depth understanding of this sequence provides an early warning of functional decline for better adaptation of medical and social care in the elderly.

  13. Teoria de Resposta ao Item na análise de uma prova de estatística em universitários / Item Response Theory to analyze a statistics test in university students

    Directory of Open Access Journals (Sweden)

    Claudette Maria Medeiros Vendramini

    2005-12-01

    This study applied Item Response Theory to analyze the 15 multiple-choice questions of a statistics test whose items were presented as statistical graphs or tables. Participants were 413 university students, selected by convenience from two private higher-education institutions, predominantly from the Psychology program (91.5%). Of the students, 80% were female and 69.8% attended daytime classes; ages ranged from 16 to 53 years (mean 24.4, standard deviation 7.4). The test is predominantly unidimensional, and the items are best fitted by the three-parameter logistic model. The discrimination, difficulty, and biserial correlation indices showed acceptable values. The results show the difficulties that university students have with mathematical and statistical concepts, difficulties also observed in other studies from elementary education onward. It is suggested that these concepts be treated in greater depth in higher education.

  14. Making Life Easier with Effort: Basic Findings and Applied Research on Response Effort.

    Science.gov (United States)

    Friman, Patrick C.; Poling, Alan

    1995-01-01

    This paper summarizes basic research on response effort in diverse applied areas including deceleration of aberrant behavior, attention deficit-hyperactivity disorder, oral habits, littering, and problem solving. The paper concludes that response effort as an independent variable has potent effects, and research exploring the applied benefits of…

  15. Item calibration in incomplete testing designs

    NARCIS (Netherlands)

    Eggen, Theo J.H.M.; Verhelst, Norman D.

    2011-01-01

    This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multis

  16. Intelligent charging of additional fees when test items are requested electronically at the doctor workstation

    Institute of Scientific and Technical Information of China (English)

    范久波; 刘海菊; 刘晓东

    2011-01-01

    Objective: To explore the implementation and practical value of automatically charging additional fees when test items are requested electronically at the doctor workstation. Methods: Single tests and test panels are assigned to groups. Within each group, the blood collection fee and one materials fee are first added according to the specimen type; the specimen type of each further item is then compared with the specimen types already added in the group, and if it differs, the materials fee is added again. When there are multiple groups, one materials fee is first added for each group, and each group then goes through the same comparison. Special handling is configured for special circumstances. Results: When clinicians select test items, the blood collection fee and materials fee required for sampling are added to the medical order automatically. When an item is removed, the clinician only needs to delete the item itself; the charging program determines which additional fees must be retained for the remaining items and deletes the unnecessary ones automatically. Conclusion: Automatic charging of additional fees at the doctor workstation indicates that, while hospital information system (HIS) construction emphasizes large and complete functionality, more attention should also be paid to small details in order to facilitate daily work.
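
    The charging rule described in the Methods can be sketched roughly as follows. This is one possible reading, not the authors' program: it assumes the blood collection fee is charged once per group and the materials fee once per distinct specimen type within a group. The function name, group labels, and fee amounts are hypothetical.

```python
from collections import defaultdict

def additional_fees(items, collection_fee, material_fee):
    """Total additional fees for an order.

    items: iterable of (group, specimen_type) pairs, one per ordered test.
    Assumption: one collection fee per group, one materials fee per
    distinct specimen type inside each group.
    """
    types_by_group = defaultdict(set)
    for group, specimen_type in items:
        types_by_group[group].add(specimen_type)

    total = 0
    for specimen_types in types_by_group.values():
        total += collection_fee                      # once per group
        total += material_fee * len(specimen_types)  # once per distinct type
    return total

# Hypothetical order with illustrative fee amounts.
order = [
    ("biochemistry", "serum"),
    ("biochemistry", "serum"),
    ("biochemistry", "plasma"),
    ("hematology", "whole blood"),
]
print(additional_fees(order, collection_fee=3, material_fee=2))

# Removing an item: recompute from the remaining items, so fees that are
# no longer needed drop out without manual deletion.
remaining = [item for item in order if item != ("biochemistry", "plasma")]
print(additional_fees(remaining, collection_fee=3, material_fee=2))
```

    Recomputing the fees from the current item list, rather than adjusting them incrementally, is a design choice made here for simplicity; it mirrors the reported behavior that surplus fees disappear when an item is deleted.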

  17. How we know it hurts: item analysis of written narratives reveals distinct neural responses to others' physical pain and emotional suffering.

    Directory of Open Access Journals (Sweden)

    Emile Bruneau

    People are often called upon to witness, and to empathize with, the pain and suffering of others. In the current study, we directly compared neural responses to others' physical pain and emotional suffering by presenting participants (n = 41) with 96 verbal stories, each describing a protagonist's physical and/or emotional experience, ranging from neutral to extremely negative. A separate group of participants rated "how much physical pain" and "how much emotional suffering" the protagonist experienced in each story, as well as how "vivid and movie-like" the story was. Although ratings of Pain, Suffering and Vividness were positively correlated with each other across stories, item analyses revealed that each scale was correlated with activity in distinct brain regions. Even within regions of the "Shared Pain network" identified using a separate data set, responses to others' physical pain and emotional suffering were distinct. More broadly, item analyses with continuous predictors provided a high-powered method for identifying brain regions associated with specific aspects of complex stimuli, such as verbal descriptions of physical and emotional events.

  18. The development and discussion of computerized visual perception assessment tool for Chinese characters structures - Concurrent estimation of the overall ability and the domain ability in item response theory approach.

    Science.gov (United States)

    Wu, Huey-Min; Lin, Chin-Kai; Yang, Yu-Mao; Kuo, Bor-Chen

    2014-11-12

    Visual perception is the fundamental skill required for a child to recognize words, and to read and write. In Taiwan, no visual perception assessment tool based on Chinese characters had been developed for preschool children. The purposes of this study were to develop a computerized visual perception assessment tool for Chinese character structures and to explore the psychometric characteristics of the tool. This study adopted purposive sampling. The study evaluated 551 kindergarten-age children (293 boys, 258 girls) ranging from 46 to 81 months of age. The test instrument consisted of three subtests and 58 items, including tests of basic strokes, single-component characters, and compound characters. Based on the results of model fit analysis, a higher-order item response theory model was used to estimate performance in visual perception, basic strokes, single-component characters, and compound characters simultaneously. Analyses of variance were used to detect significant differences among age groups and between gender groups. The difficulty of the items in the visual perception test ranged from -2 to 1. The visual perception ability of 4- to 6-year-old children ranged from -1.66 to 2.19. Gender did not have significant effects on performance. However, there were significant differences among the age groups. The performance of 6-year-olds was better than that of 5-year-olds, which was better than that of 4-year-olds. This study obtained detailed diagnostic scores by using a higher-order item response theory model to understand the visual perception of basic strokes, single-component characters, and compound characters. Further statistical analysis showed that, for basic strokes and compound characters, girls performed better than boys; there also were differences within each age group. For single-component characters, there was no difference in performance between boys and girls. However, again the performance of 6-year-olds was better than

  19. Revising the parent and peer attachment scale with item response theory

    Institute of Scientific and Technical Information of China (English)

    臧运洪; 赵守盈; 陈维; 潘运; 张禹

    2012-01-01

    Item discrimination, difficulty, and the peak of the information function from item response theory were used to revise the parent and peer attachment scale of Armsden and Greenberg (1991), so that the revised scale measures the attachment status of Chinese junior middle school students more accurately. SPSS 15.0 was used to manage the data, MULTILOG 7.03 to estimate the parameters, and AMOS 4.0 to verify the revised scale. The results are as follows: (1) The parent and peer attachment scale is unidimensional and can therefore be revised with item response theory. (2) The item discrimination (a) and difficulty (b) values of the new scale fall within a reasonable range. (3) The peak of the test information function of the new scale is smaller, and the new scale has higher reliability. The new father and peer attachment scales each contain two factors: trust and communication. The new mother attachment scale contains the same factors as the original scale: trust, communication, and alienation. Formal administration showed that the revised scale can effectively measure the attachment status of Chinese Miao junior middle school students.

  20. Generalized Full-Information Item Bifactor Analysis

    Science.gov (United States)

    Cai, Li; Yang, Ji Seung; Hansen, Mark

    2011-01-01

    Full-information item bifactor analysis is an important statistical method in psychological and educational measurement. Current methods are limited to single-group analysis and inflexible in the types of item response models supported. We propose a flexible multiple-group item bifactor analysis framework that supports a variety of…

  1. Item Analysis of Short Tests in Educational Testing: A Comparative Study of Parametric and Nonparametric Item Response Theory

    Institute of Scientific and Technical Information of China (English)

    何壮; 袁淑莉; 赵守盈

    2012-01-01

    Topical tests and short tests are a major form of test construction in educational testing. Such tests can be analyzed with both parametric and nonparametric item response theory. This study used the Rasch model and the Mokken model to analyze and compare the geography section of a comprehensive liberal-arts paper taken by senior-three (final-year high school) students. Winsteps and Xcalibre were used for the Rasch analysis, yielding difficulty, information, and differential item functioning parameters; MSP was used for the Mokken analysis, yielding the proportion of correct answers and homogeneity coefficients. Comparing the two sets of results led to four conclusions: (1) ordering the items by proportion correct under nonparametric item response theory agrees with ordering them by difficulty under parametric item response theory; (2) a few items that fail the parametric criteria still contribute to test quality and should not be deleted; (3) for dimensionality testing and item selection, the nonparametric criteria are stricter than the parametric criteria; (4) the two approaches give consistent results for differential item functioning.

  2. Principles and procedures of considering item sequence effects in the development of calibrated item pools: Conceptual analysis and empirical illustration

    Directory of Open Access Journals (Sweden)

    Safir Yousfi

    2012-12-01

    Item responses can be context-sensitive. Consequently, composing test forms flexibly from a calibrated item pool requires considering potential context effects. This paper focuses on context effects that are related to the item sequence. It is argued that sequence effects are not necessarily a violation of item response theory but that item response theory offers a powerful tool to analyze them. If sequence effects are substantial, test forms cannot be composed flexibly on the basis of a calibrated item pool, which precludes applications like computerized adaptive testing. In contrast, minor sequence effects do not thwart applications of calibrated item pools. Strategies to minimize the detrimental impact of sequence effects on item parameters are discussed and integrated into a nomenclature that addresses the major features of item calibration designs. An example of an item calibration design demonstrates how this nomenclature can guide the process of developing a calibrated item pool.

  3. DSM-5 alternative personality disorder model traits as maladaptive extreme variants of the five-factor model: An item-response theory analysis.

    Science.gov (United States)

    Suzuki, Takakuni; Samuel, Douglas B; Pahlen, Shandell; Krueger, Robert F

    2015-05-01

    Over the past two decades, evidence has suggested that personality disorders (PDs) can be conceptualized as extreme, maladaptive variants of general personality dimensions, rather than discrete categorical entities. Recognizing this literature, the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) alternative PD model in Section III defines PDs partially through 25 maladaptive traits that fall within 5 domains. Empirical evidence based on the self-report measure of these traits, the Personality Inventory for DSM-5 (PID-5), suggests that these five higher-order domains share a structure and correlate in meaningful ways with the five-factor model (FFM) of general personality. In the current study, item response theory was used to compare the DSM-5 alternative PD model traits to those from a normative FFM inventory (the International Personality Item Pool-NEO [IPIP-NEO]) in terms of their measurement precision along the latent dimensions. Within a combined sample of 3,517 participants, results strongly supported the conclusion that the DSM-5 alternative PD model traits and IPIP-NEO traits are complementary measures of 4 of the 5 FFM domains (with perhaps the exception of openness to experience vs. psychoticism). Importantly, the two measures yield largely overlapping information curves on these four domains. Differences that did emerge suggested that the PID-5 scales generally have higher thresholds and provide more information at the upper levels, whereas the IPIP-NEO generally had an advantage at the lower levels. These results support the general conceptualization that 4 domains of the DSM-5 alternative PD model traits are maladaptive, extreme versions of the FFM.

  4. Losing Items in the Psychogeriatric Nursing Home

    Directory of Open Access Journals (Sweden)

    J. van Hoof PhD

    2016-09-01

    Introduction: Losing items is a time-consuming occurrence in nursing homes that is poorly described. An explorative study was conducted to investigate which items are lost by nursing home residents, and how this affects the residents and family caregivers. Method: Semi-structured interviews and card sorting tasks were conducted with 12 residents with early-stage dementia and 12 family caregivers. Thematic analysis was applied to the outcomes of the sessions. Results: The participants stated that numerous personal items and assistive devices get lost in the nursing home environment, which had various emotional, practical, and financial implications. Significant amounts of time are spent trying to find items, varying from 1 hr up to a couple of weeks. Numerous potential solutions were identified by the interviewees. Discussion: Losing items often goes together with limitations to the participation of residents. Many family caregivers are reluctant to replace lost items, as these items may get lost again.

  5. A Response to "A Description of Merger Applied to the Montana State University Context."

    Science.gov (United States)

    Sexton, Ronald P.; And Others

    1996-01-01

    Contains three responses to Stephen L. Coffman's article appearing in the same issue, "A Description of Merger Applied to the Montana State University Context": one from the chancellor of Montana State University-Billings, one from the president of Montana State University-Bozeman, and one from the commissioner of the Montana State University…

  6. To open or to close: species-specific stomatal responses to simultaneously applied opposing environmental factors.

    Science.gov (United States)

    Merilo, Ebe; Jõesaar, Indrek; Brosché, Mikael; Kollist, Hannes

    2014-04-01

    Plant stomatal responses to single environmental factors are well studied; however, responses to a change in two (or more) factors - a common situation in nature - have been less frequently addressed. We studied the stomatal responses to a simultaneous application of opposing environmental factors in six evolutionarily distant mono- and dicotyledonous herbs representing different life strategies (ruderals, competitors and stress-tolerators) to clarify whether the crosstalk between opening- and closure-inducing pathways leading to stomatal response is universal or species-specific. Custom-made gas exchange devices were used to study the stomatal responses to a simultaneous application of two opposing factors: decreased/increased CO2 concentration and light availability or reduced air humidity. The studied species responded similarly to changes in single environmental factors, but showed species-specific and nonadditive responses to two simultaneously applied opposing factors. The stomata of the ruderals Arabidopsis thaliana and Thellungiella salsuginea (previously Thellungiella halophila) always opened, whereas those of competitor-ruderals either closed in all two-factor combinations (Triticum aestivum), remained relatively unchanged (Nicotiana tabacum) or showed a response dominated by reduced air humidity (Hordeum vulgare). Our results, indicating that in changing environmental conditions species-specific stomatal responses are evident that cannot be predicted from studying one factor at a time, might be interesting for stomatal modellers, too.

  7. Response of self-assembly for magnetite nanocrystal in magnetic fluid under an applied magnetic field

    Institute of Scientific and Technical Information of China (English)

    Yun Zou; Yiyou Nie; Ziyun Di; Dongchen Zhang; Minghuang Sang; Xianfeng Chen

    2008-01-01

    The response time and transmittivity of the magnetic fluid (MF) for different concentrations at room temperature were investigated in this letter. The volume fraction of the investigated sample ranged from 0.44% to 6.47%. It was found that the transmittivity decreased with increasing concentration under a given magnetic field, and the evolution time was changed with different concentrations. Moreover, the light intensity decreased rapidly at the beginning and then became stable when the magnetic field was applied.

  8. Response of vetch, lentil, chickpea and red pea to pre- or post-emergence applied herbicides

    Directory of Open Access Journals (Sweden)

    I. Vasilakoglou

    2013-09-01

    Broad-leaved weeds constitute a serious problem in the production of winter legumes, but few selective herbicides controlling these weeds have been registered in Europe. Four field experiments were conducted in 2009/10 and repeated in 2010/11 in Greece to study the response of common vetch (Vicia sativa L.), lentil (Lens culinaris Medik.), chickpea (Cicer arietinum L.) and red pea (Lathyrus cicera L.) to several rates of the herbicides pendimethalin, S-metolachlor, S-metolachlor plus terbuthylazine and flumioxazin applied pre-emergence, as well as imazamox applied post-emergence. Phytotoxicity, crop height, total weight and seed yield were evaluated during the experiments. The results of this study suggest that common vetch, lentil, chickpea and red pea differed in their responses to the herbicides tested. Pendimethalin at 1.30 kg ha⁻¹, S-metolachlor at 0.96 kg ha⁻¹ and flumioxazin at 0.11 kg ha⁻¹ applied pre-emergence caused the least phytotoxicity to the legumes. Pendimethalin at 1.98 kg ha⁻¹ and both rates of S-metolachlor plus terbuthylazine provided the greatest control of common lambsquarters (Chenopodium album L.). Imazamox at 0.03 to 0.04 kg ha⁻¹ could also be used as an early post-emergence herbicide in common vetch and red pea without any significant detrimental effect.

  9. The Professional Context as a Predictor for Response Distortion in the Adaption-Innovation Inventory--An Investigation Using Mixture Distribution Item Response Theory Models

    Science.gov (United States)

    Fischer, Sebastian; Freund, Philipp Alexander

    2014-01-01

    The Adaption-Innovation Inventory (AII), originally developed by Kirton (1976), is a widely used self-report instrument for measuring problem-solving styles at work. The present study investigates how scores on the AII are affected by different response styles. Data are collected from a combined sample (N = 738) of students, employees, and…

  10. Use of the item response theory on a measure for reading comprehension assessment

    Directory of Open Access Journals (Sweden)

    Lucas de Francisco Carvalho

    2013-01-01

    This study aimed to examine item and person parameters, using Item Response Theory (IRT), in a measure of reading comprehension, including quantitative and qualitative analyses of the item map, and to investigate the presence of differential item functioning (DIF). The sample consisted of 518 children from the 3rd, 4th, and 5th grades of elementary school, aged 6 to 16, from private and public schools in Belo Horizonte. The instrument was a text prepared according to the Cloze technique. The unidimensionality of the instrument was confirmed; mean theta was higher than the mean item difficulty; and DIF was observed in some items according to school grade. These results provide validity evidence for the instrument and are discussed in the paper.

  11. Optimization of applied non-axisymmetric magnetic perturbations using multimodal plasma response on DIII-D

    Science.gov (United States)

    Weisberg, D. B.; Paz-Soldan, C.; Lanctot, M. J.; Strait, E. J.; Evans, T. E.

    2016-10-01

    The plasma response to proposed 3D coil geometries in the DIII-D tokamak is investigated using the linear MHD plasma response code MARS-F. An extensive examination of low- and high-field side coil arrangements shows the potential to optimize the coupling between imposed non-axisymmetric magnetic perturbations and the total plasma response by varying the toroidal and poloidal spectral content of the applied field. Previous work has shown that n=2 and n=3 perturbations can suppress edge-localized modes (ELMs) in cases where the applied field's coupling to resonant surfaces is enhanced by amplifying marginally-stable kink modes. This research is extended to higher n-number configurations of 2 to 3 rows with up to 12 coils each in order to advance the physical understanding and optimization of both the resonant and non-resonant responses. Both in- and ex-vessel configurations are considered. The plasma braking torque is also analyzed, and coil geometries with favorable plasma coupling characteristics are discussed. Work supported by GA internal funds.

  12. Single-cell E. coli response to an instantaneously applied chemotactic signal.

    Science.gov (United States)

    Sagawa, Takashi; Kikuchi, Yu; Inoue, Yuichi; Takahashi, Hiroto; Muraoka, Takahiro; Kinbara, Kazushi; Ishijima, Akihiko; Fukuoka, Hajime

    2014-08-05

    In response to an attractant or repellant, an Escherichia coli cell controls the rotational direction of its flagellar motor by a chemotaxis system. When an E. coli cell senses an attractant, a reduction in the intracellular concentration of a chemotaxis protein, phosphorylated CheY (CheY-P), induces counterclockwise (CCW) rotation of the flagellar motor, and this cellular response is thought to occur in several hundred milliseconds. Here, to measure the signaling process occurring inside a single E. coli cell, including the recognition of an attractant by a receptor cluster, the inactivation of histidine kinase CheA, and the diffusion of CheY and CheY-P molecules, we applied a serine stimulus by instantaneous photorelease from a caged compound and examined the cellular response at a temporal resolution of several hundred microseconds. We quantified the clockwise (CW) and CCW durations immediately after the photorelease of serine as the response time and the duration of the response, respectively. The results showed that the response time depended on the distance between the receptor and motor, indicating that the decreased CheY-P concentration induced by serine propagates through the cytoplasm from the receptor-kinase cluster toward the motor with a timing that is explained by the diffusion of CheY and CheY-P molecules. The response time included 240 ms for enzymatic reactions in addition to the time required for diffusion of the signaling molecule. The measured response time and duration of the response also revealed that the E. coli cell senses a similar serine concentration regardless of whether the serine concentration is increasing or decreasing. These detailed quantitative findings increase our understanding of the signal transduction process that occurs inside cells during bacterial chemotaxis.

  13. Item Calibration in Incomplete Testing Designs

    Science.gov (United States)

    Eggen, Theo J. H. M.; Verhelst, Norman D.

    2011-01-01

    This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs.…

  14. Item response theory for measurement validity

    Institute of Scientific and Technical Information of China (English)

    Yang FM; Kao ST

    2014-01-01

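    Item response theory (IRT) is an important method for assessing the validity of measurement scales, but it remains underused in psychiatry. IRT describes the relationship between a latent trait (for example, the construct that the scale is intended to assess), the properties of the items in the scale, and respondents' answers to the individual items. This paper introduces the basic premise, assumptions, and methods of IRT. To help explain these concepts, we constructed a hypothetical scale from three items with dichotomous yes/no responses taken from a revised version of the Center for Epidemiologic Studies Depression Scale, which had been administered to 19,399 respondents. We first used factor analysis to confirm the unidimensionality of the three items and then used Mplus software to fit a two-parameter logistic (2-PL) IRT model, a method that estimates both item discrimination and item difficulty. The clinical implications of the results and their use in scale construction are discussed.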
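
    To make the two parameters of the 2-PL model concrete, here is a minimal sketch of the model's response probability and item information function in their standard textbook forms. The function names and the parameter values are illustrative assumptions; they are not the Mplus estimates from the paper.

```python
import math

def p_endorse_2pl(theta, a, b):
    """Probability of a 'yes' response under the 2-PL model.

    theta: latent trait level; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information_2pl(theta, a, b):
    """Fisher information of one 2-PL item at trait level theta."""
    p = p_endorse_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item parameters, chosen only for illustration.
a, b = 1.5, 0.8
for theta in (b - 1.0, b, b + 1.0):
    print(round(theta, 2), round(item_information_2pl(theta, a, b), 4))
```

    In this parameterization the information is largest at theta equal to b, where it equals a squared divided by 4, so a more discriminating item measures more precisely near its own difficulty.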

  15. In vitro percutaneous penetration of topically applied capsaicin in relation to in vivo sensation responses.

    Science.gov (United States)

    Magnusson, B M; Koskinen, L D

    2000-02-15

    Capsaicin, the primary pungent element in several spices, elicits a variety of physiological effects which are due to neurogenic responses. The aim of the study was to explore the in vivo sensation responses to capsaicin and to compare the results with the in vitro percutaneous absorption of the substance. The overall objective was to determine an in vitro-in vivo correlation for capsaicin. Capsaicin was applied in a chamber on the volar forearm of twelve volunteers and in a flow-through diffusion chamber on excised human epidermal membranes. Topical administration of capsaicin produced a complex cutaneous sensation that changed in intensity and quality as a function of time and was characterized by sting, prick, burn and pain. Percutaneous steady-state penetrations of capsaicin with a receptor fluid consisting either of 4% bovine serum albumin in phosphate buffered saline or 50% ethanol in water were 28.2 ± 2.7 and 29.6 ± 2.9 µg/cm² per h, respectively. The corresponding cumulative penetrated amounts of capsaicin after 30 min were 14.7 ± 1.7 and 19.2 ± 2.1 µg/cm², respectively. The present investigation indicates that there is a good correlation between in vivo physiological responses and in vitro percutaneous penetration of topically applied capsaicin.

  16. Item Response Theory Analysis of a brief instrument for assessing antisocial behaviors

    Directory of Open Access Journals (Sweden)

    Hauck Filho, Nelson

    2014-01-01

    Antisocial behaviors are common to several psychopathological conditions, including personality disorders (e.g., antisocial and narcissistic) and mood disorders (e.g., bipolar disorder). Until now, however, there has been an important gap in the Brazilian context regarding the brief assessment of antisocial behaviors in adults from non-prison settings. The present study therefore aimed to construct a brief instrument for research and screening in the general adult population and to analyze it with Item Response Theory. Analyses of the responses of 204 university students (mean age = 23.56 years; SD = 7.70; 60.6% women) to a set of items allowed 13 items with excellent psychometric properties to be retained. These items assess a general factor of antisociality, interpretable as a propensity toward antagonism, non-cooperation, and aggression in a variety of social contexts. Limitations of the study are discussed at the end.

  17. Applying Total Physical Response (TPR) Theory to Teaching Chinese Children English

    Institute of Scientific and Technical Information of China (English)

    张院院

    2015-01-01

    It has become fashionable in our society for young learners aged 6 or even younger to take part in foreign language learning. According to Second Language Acquisition theories, learning a foreign language from childhood can facilitate the learning. Children need a teaching method that conforms to their psychological and physical characteristics. The American psychologist James Asher developed Total Physical Response, which advocates learning through physical actions. He believes that children should learn a foreign language happily and confidently, just as they acquire their mother tongue. However, Total Physical Response cannot be applied effectively in the teaching process because of children's instincts and characteristics. If there were a way or strategy that takes advantage of children's characteristics and controls their behavior in class, the teaching results would be more satisfying.

  18. Applying Total Physical Response (TPR) Theory to Teaching Chinese Children English

    Institute of Scientific and Technical Information of China (English)

    张院院

    2015-01-01

    It has become fashionable in our society for young learners aged 6 or even younger to take part in foreign language learning. According to Second Language Acquisition theories, learning a foreign language from childhood can facilitate the learning. Children need a teaching method that conforms to their psychological and physical characteristics. The American psychologist James Asher developed Total Physical Response, which advocates learning through physical actions. He believes that children should learn a foreign language happily and confidently, just as they acquire their mother tongue. However, Total Physical Response cannot be applied effectively in the teaching process because of children's instincts and characteristics. If there were a way or strategy that takes advantage of children's characteristics and controls their behavior in class, the teaching results would be more satisfying.

  19. Academic freedom and the professional responsibilities of applied ethicists: a comment on Minerva.

    Science.gov (United States)

    Dawson, Angus; Herington, Jonathan

    2014-05-01

    Academic freedom is an important good, but it comes with several responsibilities. In this commentary we seek to do two things. First, we argue against Francesca Minerva's view of academic freedom as presented in her article 'New threats to academic freedom' on a number of grounds. We reject the nature of the absolutist moral claim to free speech for academics implicit in the article; we reject the elitist role for academics as truth-seekers explicit in her view; and we reject a possible more moderate re-construction of her view based on the harm/offence distinction. Second, we identify some of the responsibilities of applied ethicists, and illustrate how they recommend against allowing for anonymous publication of research. Such a proposal points to the wider perils of a public discourse which eschews the calm and careful discussion of ideas.

  20. Evaluation of energy response of neutron rem monitor applied to high-energy accelerator facilities

    Energy Technology Data Exchange (ETDEWEB)

    Nakane, Yoshihiro; Harada, Yasunori; Sakamoto, Yukio [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment] [and others]

    2003-03-01

    A neutron rem monitor was newly developed for use at the high-intensity proton accelerator facility (J-PARC), which is under construction as a joint project between the Japan Atomic Energy Research Institute and the High Energy Accelerator Research Organization. To measure the dose rate accurately over a wide energy range of neutrons, from the thermal to the high-energy region, the neutron rem monitor was fabricated by adding a lead breeder layer to a conventional neutron rem monitor. The energy response of the monitor was evaluated using neutron transport calculations for the energy range from thermal to 150 MeV. To verify the results, the response was measured in neutron fields for the energy range from thermal to 65 MeV. Comparisons between the energy response and dose conversion coefficients show that the newly developed neutron rem monitor performs well in energy response up to 150 MeV, suggesting that a rem monitor applicable to the high-intensity proton accelerator facility can be fabricated in practice. (author)

  1. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    OBJECTIVE: To improve measurement precision, the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is developing an item bank for computerized adaptive testing (CAT) of emotional functioning (EF). The item bank will be within the conceptual framework...... of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...

  2. Choice of optimal item response model for analysis of self-report questionnaire

    Institute of Scientific and Technical Information of China (English)

    周晶; 郭庆科

    2005-01-01

    log-likelihood with the fewest items whose mean square residual error was greater than 2, and the volume of test information was greater than that provided by the one-parameter model and no less than that of the three-parameter model. Therefore, the two-parameter logistic model was the best for the 2-point scoring model. But the measurement precision of the two-parameter logistic model was lower than that of the multi-grade response model. CONCLUSION: When 2-point items are adopted in a self-report questionnaire, the 2-parameter logistic model can be applied but not the 1- or 3-parameter logistic models. But when the questionnaire uses items that have more than 2 response grades, the measurement precision can be better than that of 2-point data. Merging the options for the items may lower measurement precision.

  3. The effect of item and person misfit on selection decisions : An empirical study

    NARCIS (Netherlands)

    Meijer, Rob R.; Tendeiro, Jorge N.

    2015-01-01

    Item response theory (IRT) is a mathematical model that is often applied in the development and analysis of educational and psychological assessments. Various IRT models exist, and practitioners must choose the model that is most appropriate for their particular assessment. Even when the most approp

  4. A Study of the Effects of Variation of Short-Term Memory Load, Reading Response Length, and Processing Hierarchy on TOEFL Listening Comprehension Item Performance. Report 33.

    Science.gov (United States)

    Henning, Grant

    Criticisms of the Test of English as a Foreign Language (TOEFL) have included speculation that the listening test places too much burden on short-term memory as compared with comprehension, that a knowledge of reading is required to respond successfully, and that many items appear to require mere recall and matching rather than higher-order…

  5. A Practical Guide to Check the Consistency of Item Response Patterns in Clinical Research Through Person-Fit Statistics : Examples and a Computer Program

    NARCIS (Netherlands)

    Meijer, Rob R.; Niessen, A. Susan M.; Tendeiro, Jorge N.

    2016-01-01

    Although there are many studies devoted to person-fit statistics to detect inconsistent item score patterns, most studies are difficult to understand for nonspecialists. The aim of this tutorial is to explain the principles of these statistics for researchers and clinicians who are interested in app

  6. The Effect of the Probability of Correct Response on the Variability of Measures of Differential Item Functioning. Program Statistics Research Technical Report No. 94-4.

    Science.gov (United States)

    Zwick, Rebecca

    The Mantel-Haenszel (MH; 1959) approach of Holland and Thayer (1988) is a well-established method for assessing differential item functioning (DIF). The formula for the variance of the MH DIF statistic is based on work by Phillips and Holland (1987) and Robins, Breslow, and Greenland (1986). Recent simulation studies showed that the MH variances…
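
    For orientation, the MH DIF statistic mentioned in this record is commonly computed from one 2×2 table per matched score level, as in the minimal sketch below. The function names and the tuple layout are assumptions made for this example; the variance formula discussed in the report is not reproduced here.

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score levels.

    Each stratum is a tuple (A, B, C, D):
      A = reference group correct,  B = reference group incorrect,
      C = focal group correct,      D = focal group incorrect.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        if total == 0:
            continue  # an empty score level carries no information
        num += a * d / total
        den += b * c / total
    return num / den

def mh_d_dif(strata):
    """MH D-DIF on the delta scale: -2.35 * ln(common odds ratio).

    Negative values indicate an item that is harder for the focal group
    than for matched members of the reference group.
    """
    return -2.35 * math.log(mh_odds_ratio(strata))

if __name__ == "__main__":
    # Hypothetical counts for two score levels, for illustration only.
    tables = [(20, 10, 10, 20), (30, 10, 15, 20)]
    print(mh_odds_ratio(tables), mh_d_dif(tables))
```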

  7. Psychometric properties of three single-item pain scales in patients with rheumatoid arthritis seen during routine clinical care: a comparative perspective on construct validity, reproducibility and internal responsiveness

    OpenAIRE

    Sendlbeck, Melanie; Araujo, Elizabeth G; Schett, Georg; Englbrecht, Matthias

    2015-01-01

    Objective: To investigate the construct validity, reproducibility (ie, retest reliability) and internal responsiveness to treatment change of common single-item scales measuring overall pain in patients with rheumatoid arthritis (RA) and to investigate the corresponding effect of common pain-related comorbidities and medical consultation on these outcomes. Methods: 236 patients with RA completed a set of questionnaires including a visual analogue scale (VAS), a numerical rating scale (NRS) and ...

  8. Rethinking applied ELT: Life-responsive teaching in ESP classes and learners’ satisfaction with life

    Directory of Open Access Journals (Sweden)

    Ketabi, Saeed

    2013-01-01

    Many philosophers of education as well as researchers have highlighted the importance of life skills training in education. Recently, the idea of life-wise instruction has been imported into the field of English language teaching after the introduction of notions such as applied ELT (Pishghadam, 2011) and life syllabus (Pishghadam & Zabihi, 2012). This study was conducted to analyze L2 learners’ level of life satisfaction and its relationship with their ESP teachers’ life-responsive language teaching perceptions. For this purpose, two instruments, i.e. the Life-Responsive Language Teaching Beliefs Scale (LLTBS) and the Satisfaction with Life Scale (SWLS), were administered to Iranian ESP teachers (N = 164) and a sizeable sample of their learners (N = 800), respectively. Analysis of the questionnaire results showed low levels of life-wise language teaching perceptions on the part of ESP teachers and low levels of satisfaction with life among learners. The results also showed that language learners’ scores on the satisfaction with life scale were significantly correlated with ESP teachers’ life-responsive teaching beliefs. It was thus concluded that through the integration of life skills in ESP classes, materials developers, syllabus designers and ESP practitioners may become empowered to enhance learners’ quality of life.

  9. Evaluation of item candidates: the PROMIS qualitative item review.

    Science.gov (United States)

    DeWalt, Darren A; Rothrock, Nan; Yount, Susan; Stone, Arthur A

    2007-05-01

    One of the PROMIS (Patient-Reported Outcome Measurement Information System) network's primary goals is the development of a comprehensive item bank for patient-reported outcomes of chronic diseases. For its first set of item banks, PROMIS chose to focus on pain, fatigue, emotional distress, physical function, and social function. An essential step for the development of an item pool is the identification, evaluation, and revision of extant questionnaire items for the core item pool. In this work, we also describe the systematic process wherein items are classified for subsequent statistical processing by the PROMIS investigators. Six phases of item development are documented: identification of extant items, item classification and selection, item review and revision, focus group input on domain coverage, cognitive interviews with individual items, and final revision before field testing. Identification of items refers to the systematic search for existing items in currently available scales. Expert item review and revision was conducted by trained professionals who reviewed the wording of each item and revised as appropriate for conventions adopted by the PROMIS network. Focus groups were used to confirm domain definitions and to identify new areas of item development for future PROMIS item banks. Cognitive interviews were used to examine individual items. Items successfully screened through this process were sent to field testing and will be subjected to innovative scale construction procedures.

  10. Responses of mink to auditory stimuli: Prerequisites for applying the ‘cognitive bias’ approach

    DEFF Research Database (Denmark)

    Svendsen, Pernille Maj; Malmkvist, Jens; Halekoh, Ulrich

    2012-01-01

    The aim of the study was to determine and validate prerequisites for applying a cognitive (judgement) bias approach to assessing welfare in farmed mink (Neovison vison). We investigated discrimination ability and associative learning ability using auditory cues. The mink (n = 15 females) were...... mink only showed habituation in experiment 2. Regardless of the frequency used (2 and 18 kHz), cues predicting the danger situation initially elicited slower responses compared to those predicting the safe situation but quickly became faster. Using auditory cues as discrimination stimuli for female...... farmed mink in a judgement bias approach would thus appear to be feasible. However several specific issues are to be considered in order to successfully adapt a cognitive bias approach to mink, and these are discussed....

  11. Finite strain response of crimped fibers under uniaxial traction: An analytical approach applied to collagen

    Science.gov (United States)

    Marino, Michele; Wriggers, Peter

    2017-01-01

    Composite materials reinforced by crimped fibers intervene in a number of advanced structural applications. Accordingly, constitutive equations describing their anisotropic behavior and explicitly accounting for fiber properties are needed for modeling and design purposes. To this aim, the finite strain response of crimped beams under uniaxial traction is herein addressed by obtaining analytical relationships based on the Principle of Virtual Works. The model is applied to collagen fibers in soft biological tissues, coupling geometric nonlinearities related to fiber crimp with material nonlinearities due to nanoscale mechanisms. Several numerical applications are presented, addressing the influence of geometric and material features. Available experimental data for tendons are reproduced, integrating the proposed approach within an optimization procedure for data fitting. The obtained results highlight the effectiveness of the proposed approach in correlating fibers structure with composite material mechanics.

  12. A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift

    Science.gov (United States)

    Guo, Rui; Zheng, Yi; Chang, Hua-Hua

    2015-01-01

    An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…

  13. Responses of Greenhouse Tomato and Pepper Yields and Nitrogen Dynamics to Applied Compound Fertilizers

    Institute of Scientific and Technical Information of China (English)

    ZHU Jian-Hua; LI Xiao-Lin; ZHANG Fu-Suo; LI Jun-Liang; P.CHRISTIE

    2004-01-01

    Yield and N uptake of tomato (Lycopersicum esculentum Mill.) and pepper (Capsicum annuum L.) crops in five successive rotations receiving two compound fertilizers (12-12-17 and 21-8-11 N-P2O5-K2O) were studied to determine 1) crop responses, 2) dynamics of NO3-N and NH4-N in different soil layers, 3) N balance and 4) system-level N efficiencies. Five treatments (2 fertilizers, 2 fertilizer rates and a control), each with three replicates, were arranged in the study. The higher N fertilizer rate, 300 kg N ha-1 (versus 150 kg N ha-1), returned higher vegetable fruit yields and total aboveground N uptake, with the largest crop responses occurring for the low-N fertilizer (12-12-17) applied at 300 kg N ha-1 rather than with the high-N fertilizer (21-8-11). Ammonium-N in the top 90 cm of the soil profile declined during the experiment, while nitrate-N remained at a similar level throughout the experiment with the lower rate of fertilizer N. At the higher rate of N fertilizer there was a continuous NO3-N accumulation of over 800 kg N ha-1. About 200 kg N ha-1 was applied with irrigation to each crop using NO3-contaminated groundwater. In general, about 50% of the total N input was recovered from all treatments. Pepper, relative to tomato, used N more efficiently with smaller N losses, but the crops utilized less than 29% of the fertilizer N over the two-and-a-half-year period. Local agricultural practices maintained high residual soil nutrient status. Thus, optimization of irrigation is required to minimize nitrate leaching and maximize crop N recovery.

  14. Was Kiobel Detrimental to Corporate Social Responsibility? Applying Lessons Learnt From American Exceptionalism

    Directory of Open Access Journals (Sweden)

    Benjamin Thompson

    2014-02-01

    The recent decision in the US Supreme Court Kiobel case applied the presumption against extraterritoriality towards the Alien Tort Statute, decreasing the potential scope of tort actions that can be made against corporations for severe human rights violations. In light of the growing influence of multinational corporations and the lack of any international law regime to regulate corporate wrongdoing, this decision might be seen as a blow against one of the few potential avenues for justice for those victims of corporate human rights violations. The Alien Tort Statute is not a jurisdictional statute that allows for claims under international law but is rather a uniquely American cause of action unconnected to international law. The question remains whether an extension of American law to provide remedies for severe corporate human rights abuses can be justified in the absence of any such remedies existent in international law. This article will attempt to answer this question applying criteria developed by leading scholars in response to American exceptionalism. It will argue that the Kiobel decision, rather than being detrimental to holding corporations accountable, actually addresses many of the negative aspects of extraterritorial litigation whilst preserving some possibility of remedy for victims of severe human rights violations by corporations.

  15. Curriculum, Translation, and Differential Functioning of Measurement and Geometry Items

    Science.gov (United States)

    Emenogu, Barnabas C.; Childs, Ruth A.

    2005-01-01

    A test item exhibits differential item functioning (DIF) if students with the same ability find it differentially difficult. When the item is administered in French and English, differences in language difficulty and meaning are the most likely explanations. However, curriculum differences may also contribute to DIF. The responses of Ontario…

  16. Differential immunomodulatory responses to nine polycyclic aromatic hydrocarbons applied by passive dosing.

    Science.gov (United States)

    Oostingh, Gertie J; Smith, Kilian E C; Tischler, Ulrike; Radauer-Preiml, Isabella; Mayer, Philipp

    2015-03-01

    Studying the effects of hydrophobic chemicals using in vitro cell based methods is hindered by the difficulty in bringing and keeping these chemicals in solution. Their effective concentrations are often lower than their nominal concentrations. Passive dosing is one approach that provides defined and stable dissolved concentrations during in vitro testing, and was applied to control and maintain freely dissolved concentrations of polycyclic aromatic hydrocarbons (PAHs) at levels up to their aqueous solubility limit. The immunomodulatory effects of 9 different PAHs at aqueous solubility on human bronchial epithelial cells were determined by analysing the cytokine promoter expression of 4 different inflammatory cytokines using stably transfected recombinant A549 cell lines. Diverse immunomodulatory responses were found, with the highest induction observed for the most hydrophobic PAHs chrysene, benzo(a)anthracene and benzo(a)pyrene. Cytokine promoter expression was then studied in dose response experiments with acenaphthene, phenanthrene and benzo(a)anthracene. The strongest induction was observed for benzo(a)anthracene. Cell viability analysis was performed and showed that none of the PAHs induced cytotoxicity at any of the concentrations tested. Overall, this study shows that (1) immunomodulatory effects of PAHs can be studied in vitro at controlled freely dissolved concentrations, (2) the most hydrophobic PAHs were the strongest inducers and (3) induction was often higher at lower exposure levels and then decreased with concentration despite the apparent absence of cytotoxicity.

  17. Applying procedural justice theory to law enforcement's response to persons with mental illness.

    Science.gov (United States)

    Watson, Amy C; Angell, Beth

    2007-06-01

    Procedural justice provides a framework for considering how persons with mental illness experience interactions with the police and how officer behaviors may shape cooperation or resistance. The procedural justice perspective holds that the fairness with which people are treated in an encounter with authority figures (such as the police) influences whether they cooperate or resist authority. Key components of a procedural justice framework include participation (having a voice), which involves having the opportunity to present one's own side of the dispute and be heard by the decision maker; dignity, which includes being treated with respect and politeness and having one's rights acknowledged; and trust that the authority is concerned with one's welfare. Procedural justice has its greatest impact early in the encounter, suggesting that how officers initially approach someone is extremely important. Persons with mental illness may be particularly attentive to how they are treated by police. According to this framework, people who are uncertain about their status (such as members of stigmatized groups) will respond most strongly to the fairness by which police exercise their authority. This article reviews the literature on police response to persons with mental illness. Procedural justice theory as it has been applied to mental health and justice system contexts is examined. Its application to encounters between police and persons with mental illness is discussed. Implications and cautions for efforts to improve police response to persons with mental illness and future research also are examined.

  18. Hormonal and neuromuscular responses to mechanical vibration applied to upper extremity muscles.

    Directory of Open Access Journals (Sweden)

    Riccardo Di Giminiani

    OBJECTIVE: To investigate the acute residual hormonal and neuromuscular responses exhibited following a single session of mechanical vibration applied to the upper extremities among different acceleration loads. METHODS: Thirty male students were randomly assigned to a high vibration group (HVG), a low vibration group (LVG), or a control group (CG). A randomized double-blind, controlled-parallel study design was employed. The measurements and interventions were performed at the Laboratory of Biomechanics of the University of L'Aquila. The HVG and LVG participants were exposed to a series of 20 trials × 10 s of synchronous whole-body vibration (WBV) with a 10-s pause between each trial and a 4-min pause after the first 10 trials. The CG participants assumed an isometric push-up position without WBV. The outcome measures were growth hormone (GH), testosterone, maximal voluntary isometric contraction during bench-press, maximal voluntary isometric contraction during handgrip, and electromyography root-mean-square (EMGrms) muscle activity (pectoralis major [PM], triceps brachii [TB], anterior deltoid [DE], and flexor carpi radialis [FCR]). RESULTS: The GH increased significantly over time only in the HVG (P = 0.003). Additionally, the testosterone levels changed significantly over time in the LVG (P = 0.011) and the HVG (P = 0.001). MVC during bench press decreased significantly in the LVG (P = 0.001) and the HVG (P = 0.002). In the HVG, the EMGrms decreased significantly in the TB (P = 0.006) muscle. In the LVG, the EMGrms decreased significantly in the DE (P = 0.009) and FCR (P = 0.006) muscles. CONCLUSION: Synchronous WBV acutely increased GH and testosterone serum concentrations and decreased the MVC and their respective maximal EMGrms activities, which indicated a possible central fatigue effect. Interestingly, only the GH response was dependent on the acceleration with respect to the subjects' responsiveness.

  19. A Comparison of Anchor-Item Designs for the Concurrent Calibration of Large Banks of Likert-Type Items

    Science.gov (United States)

    Garcia-Perez, Miguel A.; Alcala-Quintana, Rocio; Garcia-Cueto, Eduardo

    2010-01-01

    Current interest in measuring quality of life is generating interest in the construction of computerized adaptive tests (CATs) with Likert-type items. Calibration of an item bank for use in CAT requires collecting responses to a large number of candidate items. However, the number is usually too large to administer to each subject in the…

  20. Quality of life in the Danish general population--normative data and validity of WHOQOL-BREF using Rasch and item response theory models

    DEFF Research Database (Denmark)

    Noerholm, V; Groenvold, M; Watt, T

    2004-01-01

    , the objective of the study was to estimate the reference data for the quality of life questionnaire WHOQOL-BREF in the general Danish population and in subgroups defined by age, gender, and education. METHODS: Mail-out-mail-back questionnaires were sent to a randomly selected sample of the Danish general......-BREF domains is a more adequate expression of quality of life than the total score of all 26 items. Although none of the subscales are statistically sufficient measures of their domains, the profile scores seem to be adequate approximations to the optimal score....

  1. Dermatomyositis Leading to Necrotizing Vasculitis: A Perfect Response to Applied Therapy.

    Science.gov (United States)

    Akbaryan, Mahmood; Darabi, Farideh; Soltani, Zahra

    2016-12-01

    Dermatomyositis is an idiopathic inflammatory myopathy that causes skin and muscle complications. The etiology is not yet well understood. Released cytokines, including interferon and interleukins, are suggested to produce inflammatory responses in the skin or muscle. Muscle weakness and skin lesions, including heliotrope rash, shawl sign and Gottron's papules, are the most common symptoms. A biopsy (muscle or skin) is always the most reliable method for diagnosis. Corticosteroids in association with immunosuppressive agents are used as standard treatment. The patient was a 30-year-old woman who had had dermatomyositis for 10 years. She had been treated with methotrexate, prednisolone and azathioprine until she came to us suffering from progressive skin lesions. Tests and examinations were normal except for the lesions and detected lipoatrophy. Necrotizing vasculitis was diagnosed on the basis of immune cell infiltration and clinical observations. After three months of high-dose prednisolone and intravenous cyclophosphamide therapy, the lesions receded remarkably. Accurate and prompt diagnosis gives physicians the chance not only to choose the best treatment but also to have adequate time to apply it. However, shortening the therapy and reducing the morbidity of the disease need more investigation and effort.

  2. Corporate Social Responsibility Applied for Rural Development: An Empirical Analysis of Firms from the American Continent

    Directory of Open Access Journals (Sweden)

    Miguel Arato

    2016-01-01

    Corporate Social Responsibility has been recognized by policymakers and development specialists as a feasible driver for rural development. The present paper explores both theoretically and empirically how firms involved in CSR provide development opportunities to rural communities. The research first evaluates the applied literature on the implementation of CSR by private firms and policymakers as means to foster sustainable rural development. The empirical research analyses the CSR activities of 100 firms from a variety of industries, sizes, and countries to determine the type of companies who are involved in rural development and the kind of activities they deployed. Results from the empirical research show that although rural development initiatives are not relevant for all types of companies, a significant number of firms from a variety of industries have engaged in CSR programs supporting rural communities. Firms appear to be interested in stimulating rural development and seem to benefit from it. This paper also includes an exploration of the main challenges and constraints that firms encounter when encouraging rural development initiatives.

  3. Consumer satisfaction and item response theory: creating a measurement scale (Evaluation of the satisfaction level of students at a higher education institution: an application of item response theory)

    Directory of Open Access Journals (Sweden)

    Silvana Ligia Vincenzi Bortolotti

    2012-01-01

    Today, people have increasingly demanded more from the state and enterprises. Consumer satisfaction is not an organizational option, but rather a matter of survival for any institution. The quest for measurement of consumer satisfaction has been ongoing in many areas of research, and researchers have concentrated efforts to demonstrate the psychometric quality of their measurements. However, the techniques employed by these commitments have not kept pace with the advances in psychometric theory and methods. Item Response Theory (IRT) is an approach used for assessing latent traits. It is commonly used in educational and psychological tests and provides additional information beyond that obtained from classic psychometric techniques. This article presents a model of cumulative application of item response theory to measure the extent of students' satisfaction with their courses by creating a measurement scale. The Graded Response Model was used. The results demonstrate the effectiveness of this theory in measuring satisfaction since it places both items and individuals on the same scale. This theory may be valuable in the evaluation of customer satisfaction and many other organizational phenomena. The findings may help the decision maker of an enterprise with the correction of flows, processes, and procedures, and, consequently, it may help generate increased efficiency and effectiveness in daily tasks and in event management business. Finally, the information obtained from the analysis can play a role in the development and/or evaluation of institutional planning. The subject of this work is the use of Item Response Theory (IRT) as a tool for evaluating specific organizational aspects. The objective is to apply a cumulative IRT model to create a measure of students' satisfaction with their courses, also evaluating satisfaction with teaching and creating a measurement scale. Widely used in the educational and psychol

  4. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  5. Cotton response to poultry litter applied by subsurface banding relative to surface broadcasting

    Science.gov (United States)

    Dry poultry litter is typically land-applied by surface broadcasting, a practice that exposes certain litter nutrients to volatilization loss. Applying litter with a new, experimental implement that places the litter in narrow bands below the soil surface may reduce or eliminate such losses but has...

  6. Using module analysis for multiple choice responses: A new method applied to Force Concept Inventory data

    Science.gov (United States)

    Brewe, Eric; Bruun, Jesper; Bearden, Ian G.

    2016-12-01

    We describe Module Analysis for Multiple Choice Responses (MAMCR), a new methodology for carrying out network analysis on responses to multiple choice assessments. This method is used to identify modules of non-normative responses which can then be interpreted as an alternative to factor analysis. MAMCR allows us to identify conceptual modules that are present in student responses that are more specific than the broad categorization of questions that is possible with factor analysis and to incorporate non-normative responses. Thus, this method may prove to have greater utility in helping to modify instruction. In MAMCR the responses to a multiple choice assessment are first treated as a bipartite, student X response, network which is then projected into a response X response network. We then use data reduction and community detection techniques to identify modules of non-normative responses. To illustrate the utility of the method we have analyzed one cohort of postinstruction Force Concept Inventory (FCI) responses. From this analysis, we find nine modules which we then interpret. The first three modules include the following: Impetus Force, More Force Yields More Results, and Force as Competition or Undistinguished Velocity and Acceleration. This method has a variety of potential uses particularly to help classroom instructors in using multiple choice assessments as diagnostic instruments beyond the Force Concept Inventory.
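
    The projection step described in this record, from a bipartite student × response network to a response × response network, can be sketched as follows. This is only an illustration under stated assumptions: the function name, the response labels such as "1A", and the choice of co-selection counts as edge weights are hypothetical, and the data reduction and community detection stages of MAMCR are not shown.

```python
from collections import Counter
from itertools import combinations

def project_responses(selections):
    """Project a student x response bipartite network onto responses.

    selections maps each student to the set of response labels that
    student chose (for example "4A" for option A of question 4).
    Returns a Counter keyed by sorted (response, response) pairs whose
    value is the number of students who selected both responses.
    """
    weights = Counter()
    for chosen in selections.values():
        for pair in combinations(sorted(set(chosen)), 2):
            weights[pair] += 1
    return weights

if __name__ == "__main__":
    # Hypothetical answers from three students, for illustration only.
    answers = {
        "s1": {"1A", "2C"},
        "s2": {"1A", "2C", "3B"},
        "s3": {"2C", "3B"},
    }
    for pair, weight in sorted(project_responses(answers).items()):
        print(pair, weight)
```

    In a sketch like this, modules would then be sought among the heavily weighted pairs of non-normative responses.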

  7. Effects of Aging and IQ on Item and Associative Memory

    Science.gov (United States)

    Ratcliff, Roger; Thapar, Anjali; McKoon, Gail

    2011-01-01

    The effects of aging and IQ on performance were examined in 4 memory tasks: item recognition, associative recognition, cued recall, and free recall. For item and associative recognition, accuracy and the response time (RT) distributions for correct and error responses were explained by Ratcliff's (1978) diffusion model at the level of individual…

  8. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  9. Applying the principles of adult learning to the teaching of psychopharmacology: audience response systems.

    Science.gov (United States)

    Stahl, Stephen M; Davis, Richard L

    2009-08-01

    Medical presentations can be enhanced by systematically collecting audience feedback. This is readily accomplished with polling systems, called audience response systems. Several systems are now available that are small, inexpensive, and can be readily integrated into standard powerpoint presentations without the need for a technician. Use of audience response systems has several advantages. These include improving attentiveness, increasing learning, polling anonymously, tracking individual and group responses, gauging audience understanding, adding interactivity and fun, and evaluating both participant learning and instructor teaching. Tips for how to write questions for audience response systems are also included.

  10. IRT Item Parameter Scaling for Developing New Item Pools

    Science.gov (United States)

    Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua

    2017-01-01

    Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. The three scaling procedures are considered: (a) concurrent…

  11. Faculty development on item writing substantially improves item quality.

    NARCIS (Netherlands)

    Naeem, N.; Vleuten, C.P.M. van der; Alfaris, E.A.

    2012-01-01

    The quality of items written for in-house examinations in medical schools remains a cause of concern. Several faculty development programs are aimed at improving faculty's item writing skills. The purpose of this study was to evaluate the effectiveness of a faculty development program in item develo

  12. Physics Items and Student's Performance at Enem

    CERN Document Server

    Gonçalves, Wanderley P

    2013-01-01

    The Brazilian National Assessment of Secondary Education (ENEM, Exame Nacional do Ensino Médio) changed in 2009 from a self-assessment of competences at the end of high school to an assessment that allows access to college and student financing. Instead of a single general exam, there are now tests in four areas: Mathematics, Language, Natural Sciences and Social Sciences. A new Reference Matrix was built with components such as cognitive domains, competencies, skills and knowledge objects; the methodological framework has also changed and now uses Item Response Theory to provide scores, allowing longitudinal comparison of results from different years and providing conditions for monitoring high school quality in Brazil. We present a study on the issues discussed in the Natural Science Test of ENEM over the years 2009, 2010 and 2011. Qualitative variables are proposed to characterize the items, and data from students' responses in Physics items were analysed. The qualitative analysis reveals the characteristics of the ...

  13. Identifying Unbiased Items for Screening Preschoolers for Disruptive Behavior Problems.

    Science.gov (United States)

    Studts, Christina R; Polaha, Jodi; van Zyl, Michiel A

    2016-10-25

    OBJECTIVE: Efficient identification and referral to behavioral services are crucial in addressing early-onset disruptive behavior problems. Existing screening instruments for preschoolers are not ideal for pediatric primary care settings serving diverse populations. Eighteen candidate items for a new brief screening instrument were examined to identify those exhibiting measurement bias (i.e., differential item functioning, DIF) by child characteristics. METHOD: Parents/guardians of preschool-aged children (N = 900) from four primary care settings completed two full-length behavioral rating scales. Items measuring disruptive behavior problems were tested for DIF by child race, sex, and socioeconomic status using two approaches: item response theory-based likelihood ratio tests and ordinal logistic regression. RESULTS: Of 18 items, eight were identified with statistically significant DIF by at least one method. CONCLUSIONS: The bias observed in 8 of 18 items made them undesirable for screening diverse populations of children. These items were excluded from the new brief screening tool.

  14. Emerging Opportunities for School Psychologists to Enhance our Remediation Procedure Evidence Base as We Apply Response to Intervention

    Science.gov (United States)

    Skinner, Christopher H.; McCleary, Daniel F.; Skolits, Gary L.; Poncy, Brian C.; Cates, Gary L.

    2013-01-01

    The success of Response-to-Intervention (RTI) and similar models of service delivery is dependent on educators being able to apply effective and efficient remedial procedures. In the process of implementing problem-solving RTI models, school psychologists have an opportunity to contribute to and enhance the quality of our remedial-procedure…

  15. Unreliable item or inconsistent person? A study of variation in health beliefs and belief-anchors to biomedical models.

    Science.gov (United States)

    Ip, Edward H; Saldana, Santiago; Chen, Shyh-Huei; Kirk, Julienne K; Bell, Ronny A; Nguyen, Ha; Grzywacz, Joseph G; Arcury, Thomas A; Quandt, Sara A

    2015-08-01

    The reliability of an item designed to measure health belief is often confounded with response consistency at the person level. The study applied contemporary measurement methods to an inventory of common sense beliefs about diabetes and used a sample of N = 563 adults with diabetes to test the hypothesis that individuals whose beliefs are congruent with a biomedical model are more consistent in their responses. Item-level analysis revealed that the domains of Causes and Medical Management were the least reliable. Person-level analysis showed that respondents who held views congruent with the biomedical model were more consistent than people who did not.

  16. The Determination of Hierarchies among TOEFL Vocabulary and Reading Comprehension Items.

    Science.gov (United States)

    Perkins, Kyle; And Others

    A study was undertaken to identify the prerequisite relations (or hierarchies among the items) existing in the item responses of a sample of 86 foreign students who took the Test of English as a Foreign Language (TOEFL) vocabulary and reading comprehension test, Form 3JTF1. The form contains 30 vocabulary items and 30 reading comprehension items.…

  17. Detection of person misfit in computerized adaptive tests with polytomous items

    NARCIS (Netherlands)

    Krimpen-Stoop, van Edith M.L.A.; Meijer, Rob R.

    2002-01-01

    Item scores that do not fit an assumed item response theory model may cause the latent trait value to be inaccurately estimated. For a computerized adaptive test (CAT) using dichotomous items, several person-fit statistics for detecting misfitting item score patterns have been proposed. Both for pape

  18. A Comparison of Mantel-Haenszel Differential Item Functioning Parameters. LSAC Research Report Series.

    Science.gov (United States)

    Schnipke, Deborah L.; Roussos, Louis A.; Pashley, Peter J.

    Differential item functioning (DIF) analyses are conducted to investigate how items function in various subgroups. The Mantel-Haenszel (MH) DIF statistic is used at the Law School Admission Council and other testing companies. When item functioning can be well-described in terms of a one- or two-parameter logistic item response theory (IRT) model…

  19. Polytomous latent scales for the investigation of the ordering of items

    NARCIS (Netherlands)

    Ligtvoet, R.; van der Ark, L.A.; Bergsma, W. P.; Sijtsma, K.

    2011-01-01

    We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering prope

  20. A Simulation Study of Methods for Assessing Differential Item Functioning in Computer-Adaptive Tests.

    Science.gov (United States)

    Zwick, Rebecca; And Others

    Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel and standardization methods of differential item functioning (DIF) analysis in computer-adaptive tests (CATs). Each "examinee" received 25 items out of a 75-item pool. A three-parameter logistic item response model was assumed, and…

  1. New technologies for item monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, J.A. [EG & G Energy Measurements, Albuquerque, NM (United States)]; Waddoups, I.G. [Sandia National Labs., Albuquerque, NM (United States)]

    1993-12-01

    This report responds to the Department of Energy's request that Sandia National Laboratories compare existing technologies against several advanced technologies as they apply to DOE needs to monitor the movement of material, weapons, or personnel for safety and security programs. The authors describe several material control systems, discuss their technologies, suggest possible applications, discuss assets and limitations, and project costs for each system. The following systems are described: WATCH system (Wireless Alarm Transmission of Container Handling); Tag system (an electrostatic proximity sensor); PANTRAK system (Personnel And Material Tracking); VRIS (Vault Remote Inventory System); VSIS (Vault Safety and Inventory System); AIMS (Authenticated Item Monitoring System); EIVS (Experimental Inventory Verification System); Metrox system (canister monitoring system); TCATS (Target Cueing And Tracking System); LGVSS (Light Grid Vault Surveillance System); CSS (Container Safeguards System); SAMMS (Security Alarm and Material Monitoring System); FOIDS (Fiber Optic Intelligence & Detection System); GRADS (Graded Radiation Detection System); and PINPAL (Physical Inventory Pallet).

  2. Response surface method applied to optimization of estradiol permeation in chitosan membranes

    Indian Academy of Sciences (India)

    Luciano Mengatto; María I Cabrera; Julio A Luna

    2012-06-01

    The present work deals with the study of estradiol permeation in chitosan membranes. A fractional factorial design was built for the determination of the main factors affecting estradiol permeation. The independent factors analysed were: concentration of chitosan, concentration of cross-linking agent, cross-linking time and thermal treatment. It was found that concentration of chitosan and cross-linking time significantly affected the response. The effects of thermal treatment and concentration of cross-linking agent were not significant. An optimization process based on response surface methodology was carried out in order to develop a statistical model which describes the relationship between active independent variables and estradiol flux. This model can be used to find out a combination of factor levels during response optimization. Possible options for response optimization are to maximize, minimize or move towards a target value.

  3. Classical item and test analysis with graphics: the ViSta-CITA program.

    Science.gov (United States)

    Ledesma, Rubén Daniel; Molina, J Gabriel

    2009-11-01

    Current advances in test development theory have mostly been influenced by item response theory. Notwithstanding this, classical test theory still plays a major part in the development of tests for applied educational and behavioral research. This article describes ViSta-CITA, a computer program that implements a set of classical item and test analysis methods that incorporate innovative graphics whose aim is to provide deeper insight into analysis results. Such an aim is achieved through the SpreadPlot, a graphical method designed to display multiple, simultaneous, interactive views of the analysis results. It behaves on a dynamic basis, so that users' changes (e.g., selecting a subset of items) are automatically updated in the graphical windows showing the analysis results. Moreover, ViSta-CITA is freely available, and its code is open to modifications or additions by the user. Features such as these constitute useful tools for research and teaching purposes related to test development.

  4. Peeking into personality test answers: inter- and intraindividual variety in item interpretations.

    Science.gov (United States)

    Arro, Grete

    2013-03-01

    Present-day personality research mostly applies inventories in which neither the items nor the responses are unambiguously interpretable. The substantive process of generating the test answer is rarely investigated, and thus the possible field of meanings out of which the answer is created remains hidden. In order to investigate the possible array of spontaneous answers to personality test items, a situative open-ended personality inventory was developed to determine individuals' ways of interpreting personality test items and relevant personality descriptions for individuals. The children's sample (N = 704 of 10-13 year olds) answered five free-response contextualized personality test questions, each related to one of the Five Factor Model personality dimensions. It was revealed that there is no universal interpretation of an item. First, different children's answers to the same question described different personality dimensions; a substantial number of the respondents' answers did not reflect the personality domain assumed in an item. So there are several ways to interpret test questions; answers may refer to different personality dimensions and not necessarily the one assumed by the researcher. Second, a number of children mentioned more than one personality trait for one item, indicating that even within one person there may be several relevant interpretations of the same item. Considering personality traits as occurring one by one and mutually exclusively during personality test answering may be artificial; in reality, trait combinations may reflect the actual reaction. In sum, the results suggest there is no single predictable interpretational trajectory in the meaning construction process if semiotically mediated constructs, e.g., personality reflection, are assessed.

  5. Item Overexposure in Computerized Classification Tests Using Sequential Item Selection

    Directory of Open Access Journals (Sweden)

    Alan Huebner

    2012-06-01

    Computerized classification tests (CCTs) often use sequential item selection, which administers items according to maximizing psychometric information at a cut point demarcating passing and failing scores. This paper illustrates why this method of item selection leads to the overexposure of a significant number of items, and the performances of three different methods for controlling maximum item exposure rates in CCTs are compared. Specifically, the Sympson-Hetter, restricted, and item eligibility methods are examined in two studies realistically simulating different types of CCTs and are evaluated based upon criteria including classification accuracy, the number of items exceeding the desired maximum exposure rate, and test overlap. The pros and cons of each method are discussed from a practical perspective.

  6. Applied Research on the Construction of the Textbook-based Test Item Bank for the Course of English in Vocational Colleges

    Institute of Scientific and Technical Information of China (English)

    邹园艳; 李志萍

    2012-01-01

    Scientific testing is an essential guarantee for improving the quality of teaching, so establishing a scientific and effective test system is necessary. Based on an analysis of the necessity of constructing a textbook-based test item bank for the course of English in vocational colleges, this essay examines its test design strategies. It also describes the practical application of the item bank at Chongqing College of Electronic Engineering and points out problems that deserve attention in its construction.

  7. Conscientiousness in the workplace : Applying mixture IRT to investigate scalability and predictive validity

    NARCIS (Netherlands)

    Egberink, I.J.L.; Meijer, R.R.; Veldkamp, B.P.

    2010-01-01

    Mixture item response theory (IRT) models have been used to assess multidimensionality of the construct being measured and to detect different response styles for different groups. In this study a mixture version of the graded response model was applied to investigate scalability and predictive vali

  8. Conscientiousness in the workplace: Applying mixture IRT to investigate scalability and predictive validity

    NARCIS (Netherlands)

    Egberink, Iris J.L.; Meijer, Rob R.; Veldkamp, Bernard P.

    2010-01-01

    Mixture item response theory (IRT) models have been used to assess multidimensionality of the construct being measured and to detect different response styles for different groups. In this study a mixture version of the graded response model was applied to investigate scalability and predictive vali

  9. Editorial Changes and Item Performance: Implications for Calibration and Pretesting

    Directory of Open Access Journals (Sweden)

    Heather Stoffel

    2014-11-01

    Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.

  10. Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests

    Science.gov (United States)

    Veldkamp, Bernard P.

    2010-01-01

    Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…

  11. 17 CFR 229.406 - (Item 406) Code of ethics.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 2 2010-04-01 2010-04-01 false (Item 406) Code of ethics. 229... 406) Code of ethics. (a) Disclose whether the registrant has adopted a code of ethics that applies to... code of ethics, explain why it has not done so. (b) For purposes of this Item 406, the term code...

  12. Safety Evaluation for Packaging (onsite) T Plant Canyon Items

    Energy Technology Data Exchange (ETDEWEB)

    OBRIEN, J.H.

    2000-07-14

    This safety evaluation for packaging (SEP) evaluates and documents the ability to safely ship mostly unique inventories of miscellaneous T Plant canyon waste items (T-P Items) encountered during the canyon deck clean off campaign. In addition, this SEP addresses contaminated items and material that may be shipped in a strong tight package (STP). The shipments meet the criteria for onsite shipments as specified by Fluor Hanford in HNF-PRO-154, Responsibilities and Procedures for all Hazardous Material Shipments.

  13. Response of Two Tomato Cultivars to Field-applied Proline and Salt Stress

    Directory of Open Access Journals (Sweden)

    Kahlaoui B.

    2013-08-01

    An experiment was carried out using saline water (6.57 dS.m-1) and subsurface drip irrigation (SDI) on two tomato cultivars (Solanum lycopersicum, cv. Rio Grande and Heinz-2274) in a silty clay soil. The former is a salinity-tolerant and the latter a salinity-sensitive cultivar. Exogenous proline was applied by foliar spray at two concentrations, 10 and 20 mg.L-1, with a control (saline water without proline), during the flowering stage. The applied proline had significant effects on both tomato cultivars, particularly at the low concentration (10 mg.L-1), which increased leaf area, growth length and fruit yield. Regarding mineral nutrition, Ca2+ was higher in different organs while Na+ accumulation was low. However, Cl- was significantly very low in all tissues of Rio Grande plants at the higher concentration of proline applied.

  14. Donor impurity states and related optical response in a lateral coupled dot-ring system under applied electric field

    Energy Technology Data Exchange (ETDEWEB)

    Correa, J.D. [Departamento de Ciencias Básicas, Universidad de Medellín, Medellín (Colombia); Mora-Ramos, M.E. [Centro de Investigación en Ciencias, Instituto de Ciencias Básicas y Aplicadas, Universidad Autonoma del Estado de Morelos, Av. Universidad 1001, CP 62209 Cuernavaca, Morelos (Mexico); Duque, C.A., E-mail: cduque@fisica.udea.edu.co [Grupo de Materia Condensada-UdeA, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín (Colombia)

    2015-09-01

    A study on the effects of an externally applied electric field on the linear optical absorption and relative refractive index change associated with transitions between off-center donor impurity states in laterally coupled quantum dot-ring system is reported. Electron states are calculated within the effective mass and parabolic band approximations by means of an exact diagonalization procedure. The states and the optical response in each case show significant sensitivity to the geometrical distribution of confining energies as well as to the strength of the applied field.

  15. Detecting Differential Item Functioning and Differential Test Functioning on Math School Final-exam

    Directory of Open Access Journals (Sweden)

    - Mansyur

    2016-08-01

    This study aims to determine the characteristics of Differential Item Functioning (DIF) and Differential Test Functioning (DTF) on a school final exam in Math, based on Item Response Theory (IRT). The subjects of this study were the questions and all of the students’ answer sheets, chosen by convenience sampling, which yielded 286 responses consisting of 147 male and 149 female students’ responses. The data were collected using a documentation technique by recording the responses of the Math school final-exam participants. The data were analysed with an Item Response Theory approach, using the 2P model and Lord’s chi-square DIF method. Of the 40 items analysed using Item Response Theory (IRT), ten items showed gender DIF and 13 items showed location (area) DIF. Meanwhile, Differential Test Functioning (DTF) benefitted female students and was least favourable to citizens.

  16. Differential immunomodulatory responses to nine polycyclic aromatic hydrocarbons applied by passive dosing

    DEFF Research Database (Denmark)

    Oostingh, Gertie J.; Smith, Kilian E. C.; Tischler, Ulrike

    2015-01-01

    (a)anthracene and benzo(a)pyrene. Cytokine promoter expression was then studied in dose response experiments with acenaphthene, phenanthrene and benzo(a)anthracene. The strongest induction was observed for benzo(a)anthracene. Cell viability analysis was performed and showed that none of the PAHs induced cytotoxicity...

  17. GENDER DIFFERENCES IN APPLYING COMPLIMENTS AND COMPLIMENT RESPONSES IN CHINESE CONTEXT

    Institute of Scientific and Technical Information of China (English)

    OuanLihong

    2004-01-01

    The previous research done by the author shows that there exist significant differences between men and women in their realization patterns of compliments and compliment responses. These differences are reflected in the strategies used in complimenting and responding to compliments. Generally, women tend to use more polite strategies than men do. This article will explore these differences from both social and cultural perspectives.

  18. AQUATIC ANIMAL RESPIRATION AND COUGH RESPONSE APPLIED TO INNOVATIVE ENVIRONMENTAL BIOMONITORING: A BIBLIOGRAPHY

    Science.gov (United States)

    This bibliography encompasses a body of in-depth technical information on the mechanics and physiology of respiration in aquatic animals (vertebrate and invertebrate). In compiling the bibliography, special emphasis was given to identifying studies that deal with responses of thi...

  19. The household responsibility system and social change in rural Guizhou, China: applying a cohort approach

    NARCIS (Netherlands)

    Yuan, J.

    2010-01-01

    Since the introduction of the Household Responsibility System (HRS) in 1978, Chinese rural households have experienced many changes. The HRS allows farming households to organize their own agricultural production on contracted lands, enabling them to work more efficiently and get more benefits compa

  20. Counteracting Educational Injustice with Applied Critical Leadership: Culturally Responsive Practices Promoting Sustainable Change

    Science.gov (United States)

    Santamaría, Lorri J.; Santamaría, Andrés P.

    2015-01-01

    This contribution considers educational leadership practice to promote and sustain diversity. Comparative case studies are presented featuring educational leaders in the United States and New Zealand who counter injustice in their practice. The leaders' leadership practices responsive to the diversity presented in their schools offer…

  1. Item banking to improve, shorten and computerize self-reported fatigue: an illustration of steps to create a core item bank from the FACIT-Fatigue Scale.

    Science.gov (United States)

    Lai, Jin-shei; Cella, David; Chang, Chih-Hung; Bode, Rita K; Heinemann, Allen W

    2003-08-01

    Fatigue is a common symptom among cancer patients and the general population. Due to its subjective nature, fatigue has been difficult to effectively and efficiently assess. Modern computerized adaptive testing (CAT) can enable precise assessment of fatigue using a small number of items from a fatigue item bank. CAT enables brief assessment by selecting questions from an item bank that provide the maximum amount of information given a person's previous responses. This article illustrates steps to prepare such an item bank, using 13 items from the Functional Assessment of Chronic Illness Therapy Fatigue Subscale (FACIT-F) as the basis. Samples included 1022 cancer patients and 1010 people from the general population. An Item Response Theory (IRT)-based rating scale model, a polytomous extension of the Rasch dichotomous model, was utilized. Nine items demonstrating acceptable psychometric properties were selected and positioned on the fatigue continuum. The fatigue levels measured by these nine items along with their response categories covered 66.8% of the general population and 82.6% of the cancer patients. Although the operational CAT algorithms to handle polytomously scored items are still in progress, we illustrated how CAT may work by using nine core items to measure level of fatigue. Using this illustration, a fatigue measure comparable to its full-length 13-item scale administration was obtained using four items. The resulting item bank can serve as a core to which will be added a psychometrically sound and operational item bank covering the entire fatigue continuum.
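
    The selection rule mentioned in this abstract (pick the item that provides the maximum information given the current estimate) can be illustrated with a minimal sketch. It deliberately assumes the simpler dichotomous Rasch model, where item information is p(1 - p), rather than the polytomous rating scale model used in the study; the item names and difficulty values are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at trait level theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the unadministered item with maximum information at theta.

    Raises ValueError if every item in the bank was already administered.
    """
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, bank[name]))


# Hypothetical item bank: item name -> difficulty on the latent scale.
bank = {"F1": -1.5, "F2": -0.5, "F3": 0.4, "F4": 1.2}
next_item = select_next_item(theta=0.5, bank=bank, administered={"F2"})
```

    Under the Rasch model the most informative item is the one whose difficulty lies closest to the current trait estimate, so in practice the estimate would be updated after each response before the next call.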

  2. Item analysis of in use multiple choice questions in pharmacology

    Science.gov (United States)

    Kaur, Mandeep; Singla, Shweta; Mahajan, Rajiv

    2016-01-01

    Background: Multiple choice questions (MCQs) are a common method of assessment of medical students. The quality of MCQs is determined by three parameters: difficulty index (DIF I), discrimination index (DI), and distracter efficiency (DE). Objectives: The objective of this study is to assess the quality of MCQs currently in use in pharmacology and discard the MCQs which are not found useful. Materials and Methods: A class test of central nervous system unit was conducted in the Department of Pharmacology. This test comprised 50 MCQs/items and 150 distracters. A correct response to an item was awarded one mark with no negative marking for incorrect response. Each item was analyzed for three parameters: DIF I, DI, and DE. Results: DIF I of 38 (76%) items was in the acceptable range (P = 30–70%), 11 (22%) items were too easy (P > 70%), and 1 (2%) item was too difficult (P < 30%). DI … (d > 0.35), of 12 (24%) items was good (d = 0.20–0.34), and of 7 (14%) items was poor (d < 0.20). A total of 50 items had 150 distracters. Among these, 27 (18%) were nonfunctional distracters (NFDs) and 123 (82%) were functional distracters. Items with one NFD were 11 and with two NFDs were 8. Based on these parameters, 6 items were discarded, 17 were revised, and 27 were kept for subsequent use. Conclusion: Item analysis is a valuable tool as it helps us to retain the valuable MCQs and discard the items which are not useful. It also helps in increasing our skills in test construction and identifies the specific areas of course content which need greater emphasis or clarity. PMID:27563581
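
    The three parameters named in this abstract can be computed along the following lines. The abstract does not give its formulas, so this sketch assumes commonly used conventions: equal-sized upper and lower scoring groups, difficulty as the percentage correct across both groups, discrimination as the difference in correct counts divided by the group size, and a distracter counted as nonfunctional when fewer than 5% of examinees choose it. All function names and numbers are hypothetical.

```python
def difficulty_index(upper_correct, lower_correct, group_size):
    """Percentage of examinees in the upper and lower groups answering correctly."""
    return 100.0 * (upper_correct + lower_correct) / (2 * group_size)


def discrimination_index(upper_correct, lower_correct, group_size):
    """Difference in correct answers between groups, scaled to the range -1..1."""
    return (upper_correct - lower_correct) / group_size


def nonfunctional_distracters(choice_counts, key, threshold=0.05):
    """Return the distracters chosen by less than `threshold` of examinees.

    choice_counts maps each option label to the number of examinees who
    picked it; `key` is the label of the correct option.
    """
    total = sum(choice_counts.values())
    return [
        option
        for option, count in choice_counts.items()
        if option != key and count / total < threshold
    ]


# Hypothetical item: 25 examinees per group, 20 upper and 10 lower correct.
p = difficulty_index(upper_correct=20, lower_correct=10, group_size=25)
d = discrimination_index(upper_correct=20, lower_correct=10, group_size=25)
nfd = nonfunctional_distracters({"A": 60, "B": 25, "C": 12, "D": 3}, key="A")
```

    With the ranges quoted in the abstract, this hypothetical item would fall in the acceptable difficulty range (P = 30–70%) and above the d > 0.35 cut-off, with one nonfunctional distracter.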

  3. The range fraction: an applied method to characterize regional groundwater responses to climate inputs.

    Science.gov (United States)

    Leising, Joseph F; Lutz, Alexandra

    2014-12-01

    An important component of ongoing water-resource investigations in the eastern Great Basin, USA, has been to ascertain the impact of future predicted climate change on groundwater availability. As a first step in that analysis, it was hypothesized that potentiometric fluctuations at certain wells would reflect annual-scale precipitation variation. Potentiometric behavior at a well depends on local hydrologic conditions, well construction, and human activities, in addition to natural recharge and regional water levels. Moreover, measurement data are limited for many wells. After preliminary screening of a large body of well and climate station data, short-term potentiometric responses to annual-scale climate inputs were identified at 18 wells using a simple visualization methodology developed during the study. For water levels displaying multi-annual trends, the signals were measured as deviations from a linear trendline. Groundwater responses lagged precipitation signals by less than 1 year to as much as 3 years, with most wells showing at most a 1- to 2-year delay. Response amplitude was variable and strongly depended on the hydrologic setting of each well.
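
    Measuring a signal as the deviation from a linear trendline, as described in this abstract, can be sketched with an ordinary least-squares fit. This is an assumption about the general technique only; the abstract does not specify how the trendline was fitted, and the function name and water-level values below are hypothetical.

```python
def linear_trend_residuals(times, levels):
    """Fit a least-squares line to (times, levels) and return the residuals.

    Each residual is the observed level minus the level predicted by the
    fitted trendline at the same time.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(levels) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, levels))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return [y - (intercept + slope * t) for t, y in zip(times, levels)]


# Hypothetical annual water levels with a declining multi-annual trend.
years = [2000, 2001, 2002, 2003, 2004]
water_levels = [100.0, 99.0, 98.5, 97.0, 96.0]
residuals = linear_trend_residuals(years, water_levels)
```

    The residual series is what would then be compared against precipitation records to look for lagged responses.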

  4. Individuality of Item Interpretation in Interchangeable ACL Scales

    Science.gov (United States)

    Fiske, Donald W.; Barack, Leonard I.

    1976-01-01

    The diversity among interpretations of single items in personality questionnaires has been noted previously. Using adjectives from the Adjective Check List (ACL), the study sought evidence bearing on these questions: Does such diversity make the responses to an item not comparable across subjects? If so, what are the implications for scores based…

  5. Detecting Local Item Dependence in Polytomous Adaptive Data

    Science.gov (United States)

    Mislevy, Jessica L.; Rupp, Andre A.; Harring, Jeffrey R.

    2012-01-01

    A rapidly expanding arena for item response theory (IRT) is in attitudinal and health-outcomes survey applications, often with polytomous items. In particular, there is interest in computer adaptive testing (CAT). Meeting model assumptions is necessary to realize the benefits of IRT in this setting, however. Although initial investigations of…

  6. Group differences in the heritability of items and test scores

    NARCIS (Netherlands)

    Wicherts, J.M.; Johnson, W.

    2009-01-01

    It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups

  7. Reorientation response of magnetic microspheres attached to gold electrodes under an applied magnetic field

    Energy Technology Data Exchange (ETDEWEB)

    De Los Santos Valladares, L.; Reeve, R.M.; Mitrelias, T.; Langford, R.M.; Barnes, C.H.W., E-mail: luis_d_v@hotmail.com [Cavendish Laboratory, Department of Physics, University of Cambridge Materials and Structures Laboratory (United Kingdom); Bustamante Dominguez, A. [Laboratorio de Ceramicos y Nanomateriales, Facultad de Ciencias Fisicas, Universidad Nacional Mayor de San Marcos, Lima (Peru); Aguiar, J. Albino [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil). Departamento de Fisica; Azuma, Y. [Materials and Structures Laboratory, Tokyo Institute of Technology, Midori-ku, Yokohama (Japan); Majima, Y. [CREST, Japan Science and Technology Agency (JST), Midori-ku, Yokohama (Japan)

    2013-08-15

    In this work, we report the mechanical reorientation of thiolated ferromagnetic microspheres bridging a pair of gold electrodes under an external magnetic field. When an external magnetic field (7 kG) is applied during the measurement of the current-voltage characteristics of a carboxyl ferromagnetic microsphere (4 μm diameter) attached to two gold electrodes by self-assembled monolayers (SAMs) of octane dithiol (C8H18S2), the current signal is distorted. Rather than due to magnetoresistance, this effect is caused by a mechanical reorientation of the ferromagnetic sphere, which alters the number of SAMs between the sphere and the electrodes and therefore affects conduction. To study the physical reorientation of the ferromagnetic particles, we measure their hysteresis loops while suspended in a liquid solution.

  8. Response of Triatoma infestans to pour-on cypermethrin applied to chickens under laboratory conditions

    Directory of Open Access Journals (Sweden)

    Ivana Amelotti

    2009-05-01

    This article reports the effects of a pour-on formulation of cypermethrin (6% active ingredient) applied to chickens exposed to Triatoma infestans, the main vector of Chagas disease in rural houses of the Gran Chaco Region of South America. This study was designed as a completely random experiment with three experimental groups and five replicates. Third-instar nymphs were fed on chickens treated with 0, 1 and 2 cc of the formulation. Nymphs were allowed to feed on the chickens at different time intervals after the insecticide application. Third-instar nymphs fed on treated chickens showed a higher mortality, took less blood during feeding and had a lower moulting rate. The mortality rate was highest seven days after the insecticide solution application and blood intake was affected until 30 days after the application of the solution.

  9. Chaos theory applied to the caloric response of the vestibular system.

    Science.gov (United States)

    Aasen, T

    1993-12-01

    Developments in the field of nonlinear dynamics have given us a new conceptual framework for understanding the mechanisms involved in the regulation of complex nonlinear systems. This concept, called "chaos" or "deterministic chaos," has been applied to EKG, EEG, and other physiological signals, but not yet to the ENG signal. The underlying geometrical structure in chaotic dynamics is fractal (noninteger dimension), and calculating the fractal dimension of the electronystagmographic recording from caloric testing gave a dimension ranging from 3.3 to 7.7. This result demonstrates that the multidimensional vestibular system, with its numerous neurological pathways, can somehow reduce the degrees of freedom and give rise to an irregular dynamic low-dimensional behavior, which is associated with deterministic chaos.

  10. A Formulation of the Mantel-Haenszel Differential Item Functioning Parameter with Practical Implications. Statistical Report. LSAC Research Report Series.

    Science.gov (United States)

    Roussos, Louis A.; Schnipke, Deborah L.; Pashley, Peter J.

    The Mantel-Haenszel (MH) differential item functioning (DIF) parameter for uniform DIF is well defined when item responses follow the two-parameter-logistic (2PL) item response function (IRF), but not when they follow the three-parameter-logistic (3PL) IRF, the model typically used with multiple choice items. This research report presents a…
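
    The contrast drawn in this abstract can be made concrete with a minimal sketch of the two IRFs. It conditions on the latent ability theta rather than on the observed score that the MH procedure actually uses, which is a simplifying assumption, and the parameter values in the usage note are invented for illustration.

```python
import math

def irf_2pl(theta, a, b):
    """2PL item response function: discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def irf_3pl(theta, a, b, c):
    """3PL item response function: 2PL plus a lower asymptote c (guessing)."""
    return c + (1.0 - c) * irf_2pl(theta, a, b)

def log_odds_ratio(p_ref, p_focal):
    """Log odds ratio of a correct response, reference vs. focal group."""
    return math.log(p_ref / (1.0 - p_ref)) - math.log(p_focal / (1.0 - p_focal))

def dif_profile(irf, thetas, ref_params, focal_params):
    """Log odds ratio at each ability level for a given IRF."""
    return [
        log_odds_ratio(irf(t, *ref_params), irf(t, *focal_params))
        for t in thetas
    ]
```

    Under the 2PL with equal discrimination in both groups, the logit is a(theta - b), so the log odds ratio equals a(b_focal - b_ref) at every ability level; for example, `dif_profile(irf_2pl, [-2, 0, 2], (1.2, 0.0), (1.2, 0.5))` is constant. Repeating the call with `irf_3pl` and a nonzero c gives values that change with theta, which is why a single uniform-DIF parameter is not well defined in that case.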

  11. Model Penjadwalan Batch Multi Item dengan Dependent Processing Time

    Directory of Open Access Journals (Sweden)

    Sukoyo Sukoyo

    2010-01-01

    This paper investigates the development of single-machine batch scheduling for multiple items with dependent processing time. The batch scheduling problem is to determine simultaneously the number of batches (N), which items and what sizes are allocated to each batch, and the processing sequence of the resulting batches. We use total actual flow time as the measure of schedule performance. The multi-item batch scheduling problem can be formulated as a binary-integer nonlinear programming model because the number of batches must be an integer, the allocation of items to the resulting batches requires binary values, and the objective function and constraints contain nonlinearities due to the dependent processing time. By relaxing the decision variable for the number of batches (N) to a parameter, a heuristic procedure can be applied to find a solution to the single-machine batch scheduling problem for multiple items.

  12. Differential Weighting of Items to Improve University Admission Test Validity

    OpenAIRE

    Eduardo Backhoff Escudero; Felipe Tirado Segura; Norma Larrazolo Reyna

    2001-01-01

    This paper gives an evaluation of different ways to increase university admission test criterion-related validity, by differentially weighting test items. We compared four methods of weighting multiple-choice items of the Basic Skills and Knowledge Examination (EXHCOBA): (1) punishing incorrect responses by a constant factor, (2) weighting incorrect responses, considering the levels of error, (3) weighting correct responses, considering the item’s difficulty, based on the Classic Measur...

  13. Action of jasmonates in plant stress responses and development--applied aspects.

    Science.gov (United States)

    Wasternack, Claus

    2014-01-01

    Jasmonates (JAs) are lipid-derived compounds acting as key signaling compounds in plant stress responses and development. The JA co-receptor complex and several enzymes of JA biosynthesis have been crystallized, and various JA signal transduction pathways, including cross-talk with most of the plant hormones, have been intensively studied. Defense against herbivores and necrotrophic pathogens is mediated by JA. Other environmental cues mediated by JA are light, seasonal and circadian rhythms, cold stress, desiccation stress, salt stress and UV stress. During development, JA inhibits the growth of roots, shoots and leaves, whereas seed germination and flower development are partially affected by its precursor 12-oxo-phytodienoic acid (OPDA). Based on these numerous JA-mediated signal transduction pathways active in plant stress responses and development, there is an increasing interest in horticultural and biotechnological applications. Intercropping (the mixed growth of two or more crops), mycorrhization of plants, establishment of induced resistance, priming of plants for enhanced insect resistance, and pre- and post-harvest application of JA are a few examples. Additional sources for horticultural improvement, where JAs might be involved, are defense against nematodes, biocontrol by plant growth promoting rhizobacteria, altered composition of the rhizosphere bacterial community, sustained balance between growth and defense, and improved plant immunity in intercropping systems. Finally, biotechnological application for JA-induced production of pharmaceuticals and application of JAs as anti-cancer agents have been intensively studied.

  14. Differential functioning of Bender Visual-Motor Gestalt Test items.

    Science.gov (United States)

    Sisto, Fermino Fernandes; Dos Santos, Acácia Aparecida Angeli; Noronha, Ana Paula Porto

    2010-02-01

    Differential Item Functioning (DIF) refers to items that do not function the same way for comparable members of different groups. The present study focuses on analyzing and classifying sex-related differential item functioning in the Bender Visual-Motor Gestalt Test. Subjects were 1,052 children attending public schools (513 boys, 539 girls, ages 6-10 years). The protocols were scored using the Bender Graduated Scoring System, which evaluates only the distortion criterion using the Rasch logistic response model. The scoring system fit the Rasch model, although two items were found to be biased by sex. When analyzing differential functioning of items for boys and girls separately, the number of differentially functioning items was equal.

  15. Dynamic response of a tunable phononic crystal under applied mechanical and magnetic loadings

    Science.gov (United States)

    Bayat, Alireza; Gordaninejad, Faramarz

    2015-06-01

    The dynamic response of a tunable phononic crystal consisting of a porous hyperelastic magnetoelastic elastomer subjected to a macroscopic deformation and an external magnetic field is theoretically investigated. Finite deformations and magnetic induction influence phononic characteristics of the periodic structure through geometrical pattern transformation and material properties. A magnetoelastic energy function is proposed to develop constitutive laws considering large deformations and magnetic induction in the periodic structure. Analytical and finite element methods are utilized to compute the dispersion relation and band structure of the phononic crystal for different cases of deformation and magnetic loadings. It is demonstrated that magnetic induction not only controls the band diagram of the structure but also has a strong effect on preferential directions of wave propagation.

  16. Auditory Musical Reasoning Test (RAu): an initial study with Item Response Theory [Teste de Raciocínio Auditivo Musical (RAu): estudo inicial por meio da Teoria de Resposta ao Item]

    Directory of Open Access Journals (Sweden)

    Fernando Pessotto

    2012-12-01

    This study investigated the internal structure and criterion validity of a test that aims at assessing auditory processing of musical ability (Auditory Musical Reasoning Test, RAu). A total of 162 people of both sexes were evaluated, 56.8% men, aged between 15 and 59 years (M=27.5; SD=9.01). Participants were divided among musicians (N=24), amateurs (N=62) and lay people (N=76) according to the extent of their knowledge of music. Full Information Item Factor Analysis verified the dimensionality of the instrument and also the properties of the items via Item Response Theory (IRT). Furthermore, we sought to identify the ability to discriminate between professional musicians, amateurs and lay people. Data showed evidence that the items measure a major dimension (alpha=.92) with high ability to differentiate groups of musicians, amateurs and lay people, giving a criterion validity coefficient of r=.68. The results indicate positive evidence of reliability and validity for the RAu test.

  17. Effects of reusing baseline volumes of interest by applying (non-)rigid image registration on positron emission tomography response assessments.

    Directory of Open Access Journals (Sweden)

    Floris H P van Velden

    OBJECTIVES: Reusing baseline volumes of interest (VOI) by applying non-rigid and to some extent (local) rigid image registration showed good test-retest variability similar to delineating VOI on both scans individually. The aim of the present study was to compare response assessments and classifications based on various types of image registration with those based on (semi-)automatic tumour delineation. METHODS: Baseline (n = 13), early (n = 12) and late (n = 9) response (after one and three cycles of treatment, respectively) whole body [18F]fluoro-2-deoxy-D-glucose positron emission tomography/computed tomography (PET/CT) scans were acquired in subjects with advanced gastrointestinal malignancies. Lesions were identified for early and late response scans. VOI were drawn independently on all scans using an adaptive 50% threshold method (A50). In addition, various types of (non-)rigid image registration were applied to PET and/or CT images, after which baseline VOI were projected onto response scans. Response was classified using PET Response Criteria in Solid Tumors for maximum standardized uptake value (SUVmax), average SUV (SUVmean), peak SUV (SUVpeak), metabolically active tumour volume (MATV), total lesion glycolysis (TLG) and the area under a cumulative SUV-volume histogram curve (AUC). RESULTS: Non-rigid PET-based registration and non-rigid CT-based registration followed by non-rigid PET-based registration (CTPET) did not show differences in response classifications compared to A50 for SUVmax and SUVpeak; however, differences were observed for MATV, SUVmean, TLG and AUC. For the latter, these registrations demonstrated a poorer performance for small lung lesions (<2.8 ml), whereas A50 showed a poorer performance when another area with high uptake was close to the target lesion. All methods were affected by lesions with very heterogeneous tracer uptake. CONCLUSIONS: Non-rigid PET- and CTPET-based image registrations may be used to classify response

  18. Comparative Study of Various E. coli Strains for Biohydrogen Production Applying Response Surface Methodology

    Directory of Open Access Journals (Sweden)

    Péter Bakonyi

    2012-01-01

    The proper strategy to establish efficient hydrogen-producing biosystems is the biochemical and physiological characterization of hydrogen-producing microbes, followed by metabolic engineering to give extraordinary properties to the strains and, finally, bioprocess optimization to realize enhanced hydrogen fermentation capability. The present paper aims to show the utility of both strain engineering and process optimization through a comparative study of wild-type and genetically modified E. coli strains, in which the effect of two major operational factors (substrate concentration and pH) on bioH2 production was investigated by experimental design, and response surface methodology (RSM) was used to determine the conditions that give maximum yields. The results revealed that by employing the genetically engineered E. coli (DJT 135) strain under optimized conditions (pH: 6.5; formate conc.: 1.25 g/L), 0.63 mol H2/mol formate could be attained, which was 1.5 times the yield of the wild-type E. coli (XL1-BLUE), which produced 0.42 mol H2/mol formate (pH: 6.4; formate conc.: 1.3 g/L).

  19. Applying Analytic Reasoning to Clarify Intention and Responsibility in Joint Criminal Enterprise Cases

    Directory of Open Access Journals (Sweden)

    Anthony Amatrudo

    2016-12-01

    This paper argues that both criminologists and lawyers need a far more philosophically robust account of joint action, notably as it relates to technical matters of intentionality and responsibility when dealing with joint criminal enterprise cases. Criminology seems unable to see beyond the superficiality of cultural explanations ill-suited to understanding matters of action. Law seems wedded to mystical notions of foresight. As regards the law, there seems to be common agreement that joint enterprise prosecutions tend to over-criminalise secondary parties. This paper suggests that the current discussions around joint criminal enterprise will benefit from a critical engagement with analytical philosophy. The paper will examine a series of technical accounts of shared commitment and intention in order to explain the problems of joint criminal enterprise (multi-agent criminal activity). Available from SSRN: http://ssrn.com/abstract=2847796

  20. Dynamic response of a thin sessile drop of conductive liquid to an abruptly applied or removed electric field

    Science.gov (United States)

    Corson, L. T.; Mottram, N. J.; Duffy, B. R.; Wilson, S. K.; Tsakonas, C.; Brown, C. V.

    2016-10-01

    We consider, both theoretically and experimentally, a thin sessile drop of conductive liquid that rests on the lower plate of a parallel-plate capacitor. We derive analytical expressions for both the initial deformation and the relaxation dynamics of the drop as the electric field is either abruptly applied or abruptly removed, as functions of the geometrical, electrical, and material parameters, and investigate the ranges of validity of these expressions by comparison with full numerical simulations. These expressions provide a reasonable description of the experimentally measured dynamic response of a drop of conductive ionic liquid 1-butyl-3-methyl imidazolium tetrafluoroborate.

  1. An Item Factor Analysis of the Mooney Problem Check List

    Science.gov (United States)

    Stewart, David W.; Deiker, Thomas

    1976-01-01

    Explores the factor structure of the Mooney Problem Check List (MPCL) at the junior and senior high school level by undertaking a large obverse factor analysis of item responses in three adolescent criterion groups. (Author/DEP)

  2. Three controversies over item disclosure in medical licensure examinations

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2015-09-01

    In response to views on the public's right to know, there is growing attention to item disclosure (release of items, answer keys, and performance data to the public) in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations, namely (1) fairness and validity, (2) impact on passing levels, and (3) utility of item disclosure, by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration.

  3. Item Analysis and Differential Item Functioning of a Brief Conduct Problem Screen

    Science.gov (United States)

    Wu, Johnny; King, Kevin M.; Witkiewitz, Katie; Racz, Sarah Jensen; McMahon, Robert J.

    2012-01-01

    Research has shown that boys display higher levels of childhood conduct problems than girls, and Black children display higher levels than White children, but few studies have tested for scalar equivalence of conduct problems across gender and race. The authors conducted a 2-parameter item response theory (IRT) model to examine item…

  4. The Influence of Item Formats when Locating a Student on a Learning Progression in Science

    Directory of Open Access Journals (Sweden)

    Jing Chen

    2016-07-01

    Learning progressions are used to describe how students’ understanding of a topic progresses over time. This study evaluates the effectiveness of different item formats for placing students into levels along a learning progression for carbon cycling. The item formats investigated were Constructed Response (CR) items and two types of two-tier items: (1) Ordered Multiple-Choice (OMC) followed by CR items and (2) Multiple True or False (MTF) followed by CR items. Our results suggest that estimates of students’ learning progression level based on OMC and MTF responses are moderately predictive of their level based on CR responses. With few exceptions, CR items were effective for differentiating students among learning progression levels. Based on the results, we discuss how to design and best use items in each format to more accurately measure students’ level along learning progressions in science.

  5. Use of Item Parceling in Structural Equation Modeling with Missing Data

    Science.gov (United States)

    Orcan, Fatih

    2013-01-01

    Parceling is referred to as a procedure for computing sums or average scores across multiple items. Parcels instead of individual items are then used as indicators of latent factors in the structural equation modeling analysis (Bandalos 2002, 2008; Little et al., 2002; Yang, Nay, & Hoyle, 2010). Item parceling may be applied to alleviate some…
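
    The parceling procedure defined in this abstract (sums or averages across groups of items) can be sketched briefly. The abstract is truncated before it says how missing responses were treated, so averaging over the available items, as done below, is only one possible assumption; `make_parcels` and `parcel_map` are hypothetical names.

```python
def make_parcels(scores, parcel_map, how="mean"):
    """Combine item scores into parcels.

    scores: one list of item scores per respondent (None marks a missing response).
    parcel_map: list of item-index lists, one per parcel.
    how: "mean" averages the available items, "sum" adds them.
    """
    parcels_per_person = []
    for row in scores:
        parcels = []
        for items in parcel_map:
            available = [row[i] for i in items if row[i] is not None]
            if not available:
                # Every item in the parcel is missing, so the parcel is missing too.
                parcels.append(None)
            elif how == "sum":
                parcels.append(sum(available))
            else:
                parcels.append(sum(available) / len(available))
        parcels_per_person.append(parcels)
    return parcels_per_person
```

    The resulting parcel scores, rather than the individual items, would then serve as indicators of the latent factors in the structural equation model.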

  6. An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

    Science.gov (United States)

    Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.

    2013-01-01

    Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…

  7. Growth responses of Kentucky bluegrass (Poa pratensis L.) to trinexapac-ethyl applied in spring and autumn

    Institute of Scientific and Technical Information of China (English)

    Guangyu FAN; Xiuju BIAN; Huibin LI; Zhao MENG; Shengyao LIU

    2009-01-01

    Practices that reduce clipping production, and so save time, money, or landfill space, are favored by turf managers. Understanding the responses of Kentucky bluegrass (Poa pratensis L.) to trinexapac-ethyl (TE) would facilitate recommendations regarding its safe and effective use in Northern China. The objectives of this study were (1) to investigate the effects of TE on vertical growth, clipping yield, leaf width, and chlorophyll content of Kentucky bluegrass, and (2) to compare the impacts of application season. Both the spring and autumn experiments demonstrated that TE applied to Kentucky bluegrass suppressed vertical growth and significantly reduced clipping production within a few weeks after initial treatment. TE increased Kentucky bluegrass leaf width in both the spring and autumn experimental periods. Discoloration of leaf tips was observed and lasted for four weeks when the same TE rate of 0.191 mL·m⁻² was applied in early autumn. Darker leaves with higher chlorophyll content than in untreated plots appeared after the initial four weeks of treatment in autumn and throughout the treatment period in spring.

  8. Conditional recall and the frequency effect in the serial recall task: an examination of item-to-item associativity.

    Science.gov (United States)

    Miller, Leonie M; Roodenrys, Steven

    2012-11-01

    The frequency effect in short-term serial recall is influenced by the composition of lists. In pure lists, a robust advantage in the recall of high-frequency (HF) words is observed, yet in alternating mixed lists, HF and low-frequency (LF) words are recalled equally well. It has been argued that the preexisting associations between all list items determine a single, global level of supportive activation that assists item recall. Preexisting associations between items are assumed to be a function of language co-occurrence; HF-HF associations are high, LF-LF associations are low, and mixed associations are intermediate in activation strength. This account, however, is based on results when alternating lists with equal numbers of HF and LF words were used. It is possible that directional association between adjacent list items is responsible for the recall patterns reported. In the present experiment, the recall of three forms of mixed lists (those with equal numbers of HF and LF items) and pure lists was examined to test the extent to which item-to-item associations are present in serial recall. Furthermore, conditional probabilities were used to examine more closely the evidence for a contribution, since correct-in-position scoring may mask recall that is dependent on the recall of prior items. The results suggest that an item-to-item effect is clearly present for early but not late list items, and they implicate an additional factor, perhaps the availability of resources at output, in the recall of late list items.

  9. Hierarchical Item Response Models for Cognitive Diagnosis

    Science.gov (United States)

    Hansen, Mark Patrick

    2013-01-01

    Cognitive diagnosis models (see, e.g., Rupp, Templin, & Henson, 2010) have received increasing attention within educational and psychological measurement. The popularity of these models may be largely due to their perceived ability to provide useful information concerning both examinees (classifying them according to their attribute profiles)…

  10. Thirty years of nonparametric item response theory

    NARCIS (Netherlands)

    Molenaar, W.

    2001-01-01

    Relationships between a mathematical measurement model and its real-world applications are discussed. A distinction is made between large data matrices commonly found in educational measurement and smaller matrices found in attitude and personality measurement. Nonparametric methods are evaluated fo

  11. Quantitative penetration testing with item response theory

    NARCIS (Netherlands)

    Arnold, Florian; Pieters, Wolter; Stoelinga, Mariëlle

    2014-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Thus, penetration testing has so far been used as a qualitative research method. To enable quantitative approaches to security risk management, including

  12. Quantitative penetration testing with item response theory

    NARCIS (Netherlands)

    Pieters, W.; Arnold, F.; Stoelinga, M.I.A.

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Therefore, penetration testing has thus far been used as a qualitative research method. To enable quantitative approaches to security risk management, in

  13. Quantitative penetration testing with item response theory

    NARCIS (Netherlands)

    Arnold, Florian; Pieters, Wolter; Stoelinga, Mariëlle

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Thus, penetration testing has so far been used as a qualitative research method. To enable quantitative approaches to security risk management, including

  14. Nonparametric Cognitive Diagnosis: A Cluster Diagnostic Method Based on Grade Response Items

    Institute of Scientific and Technical Information of China (English)

    康春花; 任平; 曾平飞

    2015-01-01

    Examinations help students learn more efficiently by filling their learning gaps. To achieve this goal, we have to differentiate students who have from those who have not mastered a set of attributes as measured by the test through cognitive diagnostic assessment. K-means cluster analysis, being a nonparametric cognitive diagnosis method requires the Q-matrix only, which reflects the relationship between attributes and items. This does not require the estimation of the parameters, so is independent of sample size, simple to operate, and easy to understand. Previous research use the sum score vectors or capability scores vector as the clustering objects. These methods are only adaptive for dichotomous data. Structural response items are, however, the main type used in examinations, particularly as required in recent reforms. On the basis of previous research, this paper puts forward a method to calculate a capability matrix reflecting the mastery level on skills and is applicable to grade response items. Our study included four parts. First, we introduced the K-means cluster diagnosis method which has been adapted for dichotomous data. Second, we expanded the K-means cluster diagnosis method for grade response data (GRCDM). Third, in Part Two, we investigated the performance of the method introduced using a simulation study. Fourth, we investigated the performance of the method in an empirical study. The simulation study focused on three factors. First, the sample size was set to be 100, 500, and 1000. Second, the percentage of random errors was manipulated to be 5%, 10%, and 20%. Third, it had four hierarchies, as proposed by Leighton. All experimental conditions composed of seven attributes, different items according to hierarchies. Simulation results showed that: (1) GRCDM had a high pattern match ratio (PMR) and high marginal match ratio (MMR). This method was shown to be feasible in cognitive diagnostic assessment. 
(2) The classification accuracy (MMR and PMR
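
    The two accuracy measures named in this abstract can be illustrated with a short sketch. It assumes the usual definitions: PMR is the share of examinees whose whole attribute pattern is recovered, and MMR is the share of individual attribute classifications that are correct. The function name and the toy patterns are hypothetical and are not taken from the study.

```python
def match_ratios(true_patterns, estimated_patterns):
    """Return (PMR, MMR) for two equally sized lists of 0/1 attribute patterns."""
    if not true_patterns or len(true_patterns) != len(estimated_patterns):
        raise ValueError("need two non-empty lists of equal length")
    n_examinees = len(true_patterns)
    n_attributes = len(true_patterns[0])
    pairs = list(zip(true_patterns, estimated_patterns))
    # PMR: an examinee counts only if every attribute is classified correctly.
    whole_matches = sum(1 for t, e in pairs if tuple(t) == tuple(e))
    # MMR: every single attribute classification counts on its own.
    single_matches = sum(ti == ei for t, e in pairs for ti, ei in zip(t, e))
    return whole_matches / n_examinees, single_matches / (n_examinees * n_attributes)


# Hypothetical data: 4 examinees, 3 attributes (1 = mastered, 0 = not mastered).
true_patterns = [(1, 0, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
estimated_patterns = [(1, 0, 1), (1, 0, 0), (0, 0, 0), (0, 1, 1)]
pmr, mmr = match_ratios(true_patterns, estimated_patterns)
print(f"PMR = {pmr:.3f}, MMR = {mmr:.3f}")
```

    Because a single misclassified attribute spoils the whole pattern, PMR can never exceed MMR.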

  15. The ITER 3D Magnetic Diagnostic Response to Applied n=3 and n=4 RMP's

    Energy Technology Data Exchange (ETDEWEB)

    Lazerson, S A [PPPL

    2014-09-01

    The ITER magnetic diagnostic response to applied n=3 and n=4 RMPs has been calculated for the 15MA scenario. The VMEC code was utilized to calculate free boundary 3D ideal MHD equilibria, where the non-stellarator symmetric terms were included in the calculation. This allows an assessment to be made of the possible boundary displacements due to RMP application in ITER. As the VMEC code assumes a continuous set of nested flux surfaces, the possibility of island and stochastic region formation is ignored. At the start of the current flat-top (L-mode), application of n = 4 RMPs indicates approximately 1 cm peak-to-peak displacements on the low field side of the plasma, while later in the shot (H-mode) perturbations as large as 3 cm are present. Forward modeling of the ITER magnetic diagnostics indicates significant non-axisymmetric plasma response, exceeding 10% of the axisymmetric signal in many of the flux loops. Magnetic field probes seem to indicate a greater robustness to 3D effects but still indicate large sensitivities to 3D effects in a number of sensors. Forward modeling of the diagnostics response to 3D equilibria allows assessment of diagnostics design and control scenarios.

  16. Faculty development on item writing substantially improves item quality.

    Science.gov (United States)

    Naeem, Naghma; van der Vleuten, Cees; Alfaris, Eiad Abdelmohsen

    2012-08-01

    The quality of items written for in-house examinations in medical schools remains a cause of concern. Several faculty development programs are aimed at improving faculty's item writing skills. The purpose of this study was to evaluate the effectiveness of a faculty development program in item development. An objective method was developed and used to assess improvement in faculty's competence to develop high quality test items. This was a quasi experimental study with a pretest-midtest-posttest design. A convenience sample of 51 faculty members participated. Structured checklists were used to assess the quality of test items at each phase of the study. Group scores were analyzed using repeated measures analysis of variance. The results showed a significant increase in participants' mean scores on Multiple Choice Questions, Short Answer Questions and Objective Structured Clinical Examination checklists from pretest to posttest (p development are generally lacking in quality. It also provides evidence of the value of faculty development in improving the quality of items generated by faculty.

  17. Differential Weighting of Items to Improve University Admission Test Validity

    Directory of Open Access Journals (Sweden)

    Eduardo Backhoff Escudero

    2001-05-01

    This paper evaluates different ways to increase the criterion-related validity of a university admission test by differentially weighting test items. We compared four methods of weighting multiple-choice items of the Basic Skills and Knowledge Examination (EXHCOBA): (1) penalizing incorrect responses by a constant factor, (2) weighting incorrect responses according to the level of error, (3) weighting correct responses according to the item's difficulty, based on Classical Test Theory, and (4) weighting correct responses according to the item's difficulty, based on Item Response Theory. Results show that none of these methods increased the instrument's predictive validity, although they did improve its concurrent validity. It was concluded that it is appropriate to score the test by simply adding up correct responses.
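
    The scoring rules compared in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the penalty of 0.25 and the item weights are hypothetical values, not the ones used for the EXHCOBA, and omitted answers (None) are assumed to score zero.

```python
def number_right(responses, key):
    """Simple sum of correct responses (the rule the study ends up recommending)."""
    return sum(r == k for r, k in zip(responses, key))


def penalized_score(responses, key, penalty=0.25):
    """Method (1): each incorrect response costs a constant penalty.

    Omitted responses (None) neither gain nor lose points (an assumption).
    """
    answered = [(r, k) for r, k in zip(responses, key) if r is not None]
    right = sum(r == k for r, k in answered)
    wrong = len(answered) - right
    return right - penalty * wrong


def weighted_score(responses, key, weights):
    """Methods (3)/(4): each correct response earns an item-specific weight."""
    return sum(w for r, k, w in zip(responses, key, weights) if r == k)


# Hypothetical five-item test; the fourth item was omitted.
key = ["A", "B", "C", "D", "A"]
responses = ["A", "B", "D", None, "A"]
weights = [1.0, 1.5, 2.0, 2.5, 1.0]  # e.g. larger weights for harder items

print(number_right(responses, key))             # 3
print(penalized_score(responses, key))          # 2.75
print(weighted_score(responses, key, weights))  # 3.5
```

    With k answer options, a penalty of 1/(k - 1) is the classical correction-for-guessing choice; any other constant fits the same function.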

  18. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  19. Selection of unidimensional scales from a multidimensional item bank in the polytomous Mokken IRT model

    NARCIS (Netherlands)

    Hemker, BT; Sijtsma, Klaas; Molenaar, Ivo W

    1995-01-01

    An automated item selection procedure for selecting unidimensional scales of polytomous items from multidimensional datasets is developed for use in the context of the Mokken item response theory model of monotone homogeneity (Mokken & Lewis, 1982). The selection procedure is directly based on the s

  20. The Asymptotic Distribution of Ability Estimates: Beyond Dichotomous Items and Unidimensional IRT Models

    Science.gov (United States)

    Sinharay, Sandip

    2015-01-01

    The maximum likelihood estimate (MLE) of the ability parameter of an item response theory model with known item parameters was proved to be asymptotically normally distributed under a set of regularity conditions for tests involving dichotomous items and a unidimensional ability parameter (Klauer, 1990; Lord, 1983). This article first considers…

  1. Curriculum and Translation Differential Item Functioning: A Comparison of Two DIF Detection Techniques.

    Science.gov (United States)

    Emenogu, Barnabas; Childs, Ruth A.

    This study investigated the possible impacts of language and curriculum differences on the performance of test items by subpopulations of students. Focusing on Measurement and Geometry items completed by students in French- and English-language schools in Ontario made it possible to explore the differences and to compare the item response theory…

  2. A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

    Science.gov (United States)

    Edwards, Michael C.

    2010-01-01

    Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show…

  3. Item Vector Plots for the Multidimensional Three-Parameter Logistic Model

    Science.gov (United States)

    Bryant, Damon; Davis, Larry

    2011-01-01

    This brief technical note describes how to construct item vector plots for dichotomously scored items fitting the multidimensional three-parameter logistic model (M3PLM). As multidimensional item response theory (MIRT) shows promise of being a very useful framework in the test development life cycle, graphical tools that facilitate understanding…

  4. Calibration of an Item Bank for the Assessment of Basque Language Knowledge

    Science.gov (United States)

    Lopez-Cuadrado, Javier; Perez, Tomas A.; Vadillo, Jose A.; Gutierrez, Julian

    2010-01-01

    The main requisite for a functional computerized adaptive testing system is the need of a calibrated item bank. This text presents the tasks carried out during the calibration of an item bank for assessing knowledge of Basque language. It has been done in terms of the 3-parameter logistic model provided by the item response theory. Besides, this…

  5. Reliability estimation for single dichotomous items based on Mokken's IRT model

    NARCIS (Netherlands)

    Meijer, R R; Sijtsma, K; Molenaar, Ivo W

    1995-01-01

    Item reliability is of special interest for Mokken's nonparametric item response theory, and is useful for the evaluation of item quality in nonparametric test construction research. It is also of interest for nonparametric person-fit analysis. Three methods for the estimation of the reliability of

  6. Reliability estimation for single dichotomous items based on Mokken's IRT model

    NARCIS (Netherlands)

    Meijer, Rob R.; Sijtsma, Klaas; Molenaar, Ivo W.

    1995-01-01

    Item reliability is of special interest for Mokken’s nonparametric item response theory, and is useful for the evaluation of item quality in nonparametric test construction research. It is also of interest for nonparametric person-fit analysis. Three methods for the estimation of the reliability of

  7. OSL and Tl response characterization of micro LiF:Mg, Ti dosimeters to be applied to VMAT quality assurance

    Energy Technology Data Exchange (ETDEWEB)

    Bravim, A.; Campos, L. L. [Instituto de Pesquisas Energeticas e Nucleares / CNEN, Av. Lineu Prestes 2242, Cidade Universitaria, 05508-000 Sao Paulo (Brazil); Sakuraba, R. K.; Da Cruz, J. C., E-mail: ambravim@hotmail.com [Sociedade Beneficente Israelita Brasileira - Hospital Albert Einstein, Av. Albert Einstein 627/701, Jardim Leonor, 05652-900 Sao Paulo (Brazil)

    2014-08-15

    VMAT Rapid Arc is a new treatment method that has changed the practice of radiotherapy, bringing benefits and allowing lower toxicity in the treatment of patients. With this treatment it is possible to minimize the radiation dose to healthy tissues and escalate the dose to the target volume (tumor) (Hall, 1998; Mundt, 2005; Bortfeld, 2006). Quality assurance is essential to verify the operation of all components involved in the process of treatment planning and dose delivery. Several organizations recommend verifying the patient dose for quality improvement in radiotherapy, with a recommended maximum dose uncertainty of ±5% (ICRU, 1976; AAPM, 1983). This paper aims to evaluate the feasibility of applying LiF:Mg,Ti micro dosimeters as a new method of dosimetry for VMAT Rapid Arc. (Author)

  8. Het nut van item respons theorie bij de constructie en evaluatie van niet-cognitieve instrumenten voor selectie en assessment binnen organisaties. : (The usefulness of item response theory for the construction and evaluation of noncognitive tests in personnel selection and assessment.)

    NARCIS (Netherlands)

    Egberink, Iris J. L.; Meijer, Rob R.

    2012-01-01

    In this article we discuss the use of IRT for the development and application of noncognitive measures in personnel selection and career development. We introduce the basic principles of IRT and we discuss the usefulness of IRT to evaluate the quality of items and tests to assess the measurement pre

  9. Responses of soil microbial communities in the rhizosphere of cucumber (Cucumis sativus L.) to exogenously applied p-hydroxybenzoic acid.

    Science.gov (United States)

    Zhou, Xingang; Yu, Gaobo; Wu, Fengzhi

    2012-08-01

    Changes in soil biological properties have been implicated as one of the causes of soil sickness, a phenomenon that occurs in continuous monocropping systems. However, the causes for these changes are not yet clear. The aim of this work was to elucidate the role of p-hydroxybenzoic acid (PHBA), an autotoxin of cucumber (Cucumis sativus L.), in changing soil microbial communities. p-Hydroxybenzoic acid was applied to soil every other day for 10 days in cucumber pot assays. Then, the structures and sizes of bacterial and fungal communities, dehydrogenase activity, and microbial biomass carbon (MBC) were assessed in the rhizosphere soil. Structures and sizes of rhizosphere bacterial and fungal communities were analyzed by polymerase chain reaction (PCR)-denaturing gradient gel electrophoresis (DGGE) and real-time PCR, respectively. p-Hydroxybenzoic acid inhibited cucumber seedling growth and stimulated rhizosphere dehydrogenase activity, MBC content, and bacterial and fungal community sizes. Rhizosphere bacterial and fungal communities responded differently to exogenously applied PHBA. PHBA decreased the Shannon-Wiener index for the rhizosphere bacterial community but increased that for the rhizosphere fungal community. In addition, the response of the rhizosphere fungal community structure to PHBA was concentration dependent, whereas that of the rhizosphere bacterial community structure was not. Our results indicate that PHBA plays a significant role in the chemical interactions between cucumber and soil microorganisms and could account for the changes in soil microbial communities in the continuously monocropped cucumber system.

  10. Relation between relative growth rate, endogenous gibberellins, and the response to applied gibberellic acid for Plantago major.

    Science.gov (United States)

    Dijkstra, P; Reegen, H; Kuiper, P J

    1990-08-01

    Relationships between relative growth rate (RGR), endogenous gibberellin (GA) concentration and the response to application of gibberellic acid (GA(3)) were studied for two inbred lines of Plantago major L., which differed in RGR. A4, the fast-growing inbred line, had a higher free GA concentration than the slow-growing W9, as analyzed by enzyme immunoassay. GA(3) application increased total plant weight and RGR, particularly for the slow-growing line. Chlorophyll a content and photosynthetic activity per unit leaf area were decreased, while transpiration rate was unaffected by GA(3) application. The increase in RGR by GA(3) application was associated with an increased leaf weight ratio; specific leaf area and percentage of dry matter in the leaves were only temporarily affected. Root respiration rate per unit dry weight was unaffected. The correlation between low RGR, low GA concentration and high responsiveness to applied GA(3) supports the contention that gibberellins are involved in the regulation of RGR. However, the transient influence of GA(3) application on some growth components suggests the involvement of other regulatory factors in addition to GA.

  11. A primary study on applying Delphi method to screen items of empowerment psychological care scale for critically ill patients

    Institute of Scientific and Technical Information of China (English)

    刘亚楠; 李红

    2012-01-01

    Objective: To screen empowerment psychological care scale items for critically ill patients using the Delphi method. Methods: A total of 27 experts were invited to take part in two rounds of consultation using the Delphi method, and the results were summarized. Results: The effective response rates of the experts' questionnaire for the first and second rounds were 90% and 100%. The specialist authority coefficient was 0.895. The coordination coefficients for the 2 rounds were 0.432 and 0.448, respectively. After the two rounds of consultation, the empowerment psychological care scale for critically ill patients was finalized, containing 4 dimensions and 24 evaluation items. Conclusions: The selected experts were highly representative and were enthusiastically involved in the consultation. The scale shows good authority and coordination; it reflects the theme of the study and can guide clinical medical personnel in implementing and evaluating empowerment psychological care.

  12. Orienting response elicitation by personally significant information under subliminal stimulus presentation: demonstration using the concealed information test.

    Science.gov (United States)

    Maoz, Keren; Breska, Assaf; Ben-Shakhar, Gershon

    2012-12-01

    Considerable evidence suggests that subliminal information can trigger cognitive and neural processes. Here, we examined whether elicitation of orienting response by personally significant (PS) verbal information requires conscious awareness of the input. Subjects were exposed to the Concealed Information Test (CIT), in which autonomic responses for autobiographical items are typically larger than for control items. These items were presented subliminally using two different masking protocols: single or multiple presentation of the masked item. An objective test was used to verify unawareness to the stimuli. As predicted, PS items elicited significantly stronger skin conductance responses than the control items in both exposure conditions. The results extend previous findings showing that autonomic responses can be elicited following subliminal exposure to aversive information, and also may have implications on the applied usage of the CIT.

  13. Ineffectiveness of reverse wording of questionnaire items: let's learn from cows in the rain.

    Directory of Open Access Journals (Sweden)

    Eric van Sonderen

    OBJECTIVE: We examined the effectiveness of reverse worded items as a means of reducing or preventing response bias. We first distinguished between several types of response bias that are often confused in literature. We next developed arguments why reversing items is probably never a good way to address response bias. We proposed testing whether reverse wording affects response bias with item-level data from the Multidimensional Fatigue Inventory (MFI-20), an instrument that contains reverse worded items. METHODS: With data from 700 respondents, we compared scores on items that were similar with respect either to content or to direction of wording. Psychometric properties of sets of these items worded in the same direction were compared with sets consisting of both straightforward and reverse worded items. RESULTS: We did not find evidence that ten reverse-worded items prevented response bias. Instead, the data suggest scores were contaminated by respondent inattention and confusion. CONCLUSIONS: Using twenty items, balanced for scoring direction, to assess fatigue did not prevent respondents from inattentive or acquiescent answering. Rather, fewer mistakes are made with a 10-item instrument with items posed in the same direction. Such a format is preferable for both epidemiological and clinical studies.

  14. Increasing Active Student Responding in a University Applied Behavior Analysis Course: The Effect of Daily Assessment and Response Cards on End of Week Quiz Scores

    Science.gov (United States)

    Malanga, Paul R.; Sweeney, William J.

    2008-01-01

    The study compared the effects of daily assessment and response cards on average weekly quiz scores in an introduction to applied behavior analysis course. An alternating treatments design (Kazdin 1982, "Single-case research designs." New York: Oxford University Press; Cooper et al. 2007, "Applied behavior analysis." Upper Saddle River:…

  15. Evaluation of the PROMIS physical function item bank in orthopaedic patients.

    Science.gov (United States)

    Hung, Man; Clegg, Daniel O; Greene, Tom; Saltzman, Charles L

    2011-06-01

    The patient-reported outcomes measurement information system (PROMIS) physical function item bank v1 (PPFIB) contains 124 item response theory (IRT) calibrated items (Rose et al. 2008. J Clin Epidemiol 61:17–33). We report the psychometric properties of these items within an outpatient, orthopaedic patient population. In particular, we investigated whether a single unidimensional IRT scale can adequately define physical function of patients presenting with primarily upper or lower extremity orthopaedic complaints. We conducted a prospective study at an orthopaedic outpatient clinic to collect data from 865 adult patients with all 124 PROMIS physical function items and seven demographic items. Items were evaluated by a Rasch model. Total variance (60.6%) across the 124 items was explained by a single Rasch dimension. The variance explained by the second dimension was 7.7%, reflecting differential item functioning in the upper and lower extremity patients. The upper extremity physical function items had a pronounced ceiling effect. A single physical function dimension accounts for most of the item variance in the PPFIB, suggesting that the items are measuring predominantly one single construct. Separate subscales for lower versus upper extremities, especially with additional items at the upper trait level of the upper extremity subscale, may further enhance evaluation of physical function in orthopaedic patients.
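
    As a rough illustration of the model named in this abstract, the sketch below uses the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between person ability (theta) and item difficulty (b). The PROMIS items are polytomous, so the study presumably used a polytomous Rasch variant; the item difficulties here are hypothetical and serve only to show what a ceiling effect looks like.

```python
import math


def rasch_probability(theta, b):
    """P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected number of endorsed items for a person with ability theta."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item set that contains only easy items (low difficulties).
easy_items = [-2.0, -1.0, 0.0]

for theta in (0.0, 3.0):
    print(theta, round(expected_score(theta, easy_items), 3))
```

    A person well above every item difficulty is expected to endorse almost all items, so such a scale cannot separate high-functioning respondents. That is why the abstract suggests adding items at the upper trait level.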

  16. Psychometric Consequences of Subpopulation Item Parameter Drift

    Science.gov (United States)

    Huggins-Manley, Anne Corinne

    2017-01-01

    This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

  17. Constructed-Response DIF Evaluations for Mixed-Format Tests. Research Report. ETS RR-13-33

    Science.gov (United States)

    Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J.

    2013-01-01

    In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…

  18. Using module analysis for multiple choice responses: A new method applied to Force Concept Inventory data

    OpenAIRE

    Brewe, Eric; Bruun, Jesper; Bearden, Ian

    2016-01-01

    We describe a methodology for carrying out a network analysis of Force Concept Inventory (FCI) responses that aims to identify communities of incorrect responses. This method first treats FCI responses as a bipartite, student × response, network. We then use Locally Adaptive Network Sparsification (Foti et al., 2011) and InfoMap (Rosvall et al., 2009) community detection algorithms to find modules of incorrect responses. This method is then used to analyze post-FCI data from one cohort of Danish u...

  19. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency

    DEFF Research Database (Denmark)

    Rose, Matthias; Bjørner, Jakob; Gandek, Barbara;

    2014-01-01

    OBJECTIVE: To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. STUDY DESIGN AND SETTING: The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. RESULTS: The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living...
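
    The norming step mentioned in this abstract (mean 50, SD 10 in a reference population) corresponds to a T-score metric. A minimal sketch, assuming the latent score theta is rescaled linearly against the mean and SD of the reference sample; the function name and example values are hypothetical.

```python
def to_t_score(theta, ref_mean=0.0, ref_sd=1.0):
    """Rescale a latent score so the reference population has mean 50 and SD 10."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    return 50.0 + 10.0 * (theta - ref_mean) / ref_sd


# A respondent 1.5 reference SDs below the reference mean scores 35.
print(to_t_score(-1.5))  # 35.0
```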

  20. Why item parcels are (almost) never appropriate: two wrongs do not make a right--camouflaging misspecification with item parcels in CFA models.

    Science.gov (United States)

    Marsh, Herbert W; Lüdtke, Oliver; Nagengast, Benjamin; Morin, Alexandre J S; Von Davier, Matthias

    2013-09-01

    The present investigation has a dual focus: to evaluate problematic practice in the use of item parcels and to suggest exploratory structural equation models (ESEMs) as a viable alternative to the traditional independent clusters confirmatory factor analysis (ICM-CFA) model (with no cross-loadings, subsidiary factors, or correlated uniquenesses). Typically, it is ill-advised to (a) use item parcels when ICM-CFA models do not fit the data, and (b) retain ICM-CFA models when items cross-load on multiple factors. However, the combined use of (a) and (b) is widespread and often provides such misleadingly good fit indexes that applied researchers might believe that misspecification problems are resolved--that 2 wrongs really do make a right. Taking a pragmatist perspective, in 4 studies we demonstrate with responses to the Rosenberg Self-Esteem Inventory (Rosenberg, 1965), Big Five personality factors, and simulated data that even small cross-loadings seriously distort relations among ICM-CFA constructs or even decisions on the number of factors; although obvious in item-level analyses, this is camouflaged by the use of parcels. ESEMs provide a viable alternative to ICM-CFAs and a test for the appropriateness of parcels. The use of parcels with an ICM-CFA model is most justifiable when the fit of both ICM-CFA and ESEM models is acceptable and equally good, and when substantively important interpretations are similar. However, if the ESEM model fits the data better than the ICM-CFA model, then the use of parcels with an ICM-CFA model typically is ill-advised--particularly in studies that are also interested in scale development, latent means, and measurement invariance.

  1. Identifying predictors of physics item difficulty: A linear regression approach

    Science.gov (United States)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate among students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge.

  2. Gender differential item functioning on a national field-specific test: The case of PhD entrance exam of TEFL in Iran

    Directory of Open Access Journals (Sweden)

    Alireza Ahmadi

    2016-01-01

    Differential Item Functioning (DIF) exists when examinees of equal ability from different groups have different probabilities of successful performance on a certain item. This study examined gender differential item functioning across the PhD Entrance Exam of TEFL (PEET) in Iran, using both logistic regression (LR) and one-parameter item response theory (1-p IRT) models. The PEET is a national test consisting of a centralized written examination designed to provide information on the eligibility of PhD applicants of TEFL to enter PhD programs. The 2013 administration of this test provided score data for a sample of 999 Iranian PhD applicants consisting of 397 males and 602 females. First, the data were subjected to DIF analysis through the logistic regression (LR) model. Then, to triangulate the findings, a 1-p IRT procedure was applied. The results indicated (1) more items flagged for DIF by LR than by 1-p IRT, (2) DIF cancellation (the number of DIF items was equal for males and females), as revealed through LR, (3) an equal number of uniform and non-uniform DIF items, as tracked via LR, and (4) female superiority in the test performance, as revealed via IRT analysis. Overall, the findings of the study indicated that PEET suffers from DIF. As such, test developers and policymakers (like NOET & MSRT) are recommended to take these findings into serious consideration and exercise care in fair test practice by dedicating effort to more unbiased test development and decision making.

  3. Development and community-based validation of eight item banks to assess mental health.

    Science.gov (United States)

    Batterham, Philip J; Sunderland, Matthew; Carragher, Natacha; Calear, Alison L

    2016-09-30

    There is a need for precise but brief screening of mental health problems in a range of settings. The development of item banks to assess depression and anxiety has resulted in new adaptive and static screeners that accurately assess severity of symptoms. However, expansion to a wider array of mental health problems is required. The current study developed item banks for eight mental health problems: social anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder, adult attention-deficit hyperactivity disorder, drug use, psychosis and suicidality. The item banks were calibrated in a population-based Australian adult sample (N=3175) by administering large item pools (45-75 items) and excluding items on the basis of local dependence or measurement non-invariance. Item Response Theory parameters were estimated for each item bank using a two-parameter graded response model. Each bank consisted of 19-47 items, demonstrating excellent fit and precision across a range of -1 to 3 standard deviations from the mean. No previous study has developed such a broad range of mental health item banks. The calibrated item banks will form the basis of a new system of static and adaptive measures to screen for a broad array of mental health problems in the community.

  4. Ineffectiveness of Reverse Wording of Questionnaire Items : Let's Learn from Cows in the Rain

    NARCIS (Netherlands)

    van Sonderen, Eric; Sanderman, Robbert; Coyne, James C.

    2013-01-01

    Objective: We examined the effectiveness of reverse worded items as a means of reducing or preventing response bias. We first distinguished between several types of response bias that are often confused in literature. We next developed arguments why reversing items is probably never a good way to ad

  5. Prediction of true test scores from observed item scores and ancillary data.

    Science.gov (United States)

    Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

    2015-05-01

    In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability.

  6. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.

  7. Early applied electric field stimulation attenuates secondary apoptotic responses and exerts neuroprotective effects in acute spinal cord injury of rats.

    Science.gov (United States)

    Zhang, C; Zhang, G; Rong, W; Wang, A; Wu, C; Huo, X

    2015-04-16

    Injury potential, which refers to a direct current voltage between intact and injured nerve ends, is mainly caused by injury-induced Ca2+ influx. Our previous studies revealed that injury potential increased with the onset and severity of spinal cord injury (SCI), and that applied electric field stimulation (EFS) with the cathode distal to the lesion could delay and attenuate injury potential formation. As Ca2+ influx is also considered a major trigger for secondary injury after SCI, we hypothesize that EFS would protect an injured spinal cord from secondary injury and consequently improve functional and pathological outcomes. In this study, rats were divided into three groups: (1) sham group, laminectomy only; (2) control group, subjected to SCI only; and (3) EFS group, received EFS immediately post-injury with the injury potential modulated to 0±0.5 mV by EFS. Functional recovery of the hind limbs was assessed using the Basso, Beattie, and Bresnahan (BBB) locomotor scale. Results revealed that EFS-treated rats exhibited significantly better locomotor function recovery. Luxol fast blue staining was performed to assess the spared myelin area. Immunofluorescence was used to observe the number of myelinated nerve fibers. Ultrastructural analysis was performed to evaluate the size of myelinated nerve fibers. Findings showed that the EFS group rats exhibited significantly less myelin loss and had larger and more myelinated nerve fibers than the control group rats in dorsal corticospinal tract (dCST) 8 weeks after SCI. Furthermore, we found that EFS inhibited the activation of calpain and caspase-3, as well as the expression of Bax, as detected by Western blot analysis. Moreover, EFS decreased cellular apoptosis, as measured by TUNEL, within 4 weeks post-injury. Results suggest that early EFS could significantly reduce spinal cord degeneration and improve functional and histological recovery. Furthermore, these neuroprotective effects may be related to

  8. Development of the Assessment Items of Debris Flow Using the Delphi Method

    Science.gov (United States)

    Byun, Yosep; Seong, Joohyun; Kim, Mingi; Park, Kyunghan; Yoon, Hyungkoo

    2016-04-01

    In recent years in Korea, typhoons and localized extreme rainfall caused by abnormal climate have increased. Accordingly, debris flow is becoming one of the most dangerous natural disasters. This study aimed to develop assessment items that can be used in damage investigations of debris flow. The Delphi method was applied to classify the realms of assessment items. As a result, 29 assessment items, classified into 6 groups, were determined.

  9. Difficulty and Discrimination Parameters of Boston Naming Test Items in a Consecutive Clinical Series

    Science.gov (United States)

    Pedraza, Otto; Sachs, Bonnie C.; Ferman, Tanis J.; Rush, Beth K.; Lucas, John A.

    2011-01-01

    The Boston Naming Test is one of the most widely used neuropsychological instruments; yet, there has been limited use of modern psychometric methods to investigate its properties at the item level. The current study used Item response theory to examine each item's difficulty and discrimination properties, as well as the test's measurement precision across the range of naming ability. Participants included 300 consecutive referrals to the outpatient neuropsychology service at Mayo Clinic in Florida. Results showed that successive items do not necessarily reflect a monotonic increase in psychometric difficulty, some items are inadequate to distinguish individuals at various levels of naming ability, multiple items provide redundant psychometric information, and measurement precision is greatest for persons within a low-average range of ability. These findings may be used to develop short forms, improve reliability in future test versions by replacing psychometrically poor items, and analyze profiles of intra-individual variability. PMID:21593059

  10. Exploring Item Order in Anxiety-Related Constructs: Practical Impacts of Serial Position

    Directory of Open Access Journals (Sweden)

    R. Nicholas Carleton

    2012-04-01

    The present study was designed to test for item order effects by measuring four distinct constructs that contribute substantively to anxiety-related psychopathology (i.e., anxiety sensitivity, fear of negative evaluation, injury/illness sensitivity, and intolerance of uncertainty). Participants (n = 999; 71% women) were randomly assigned to complete measures for each construct presented in one of two modalities: (a) items presented cohesively as measures or (b) items presented randomly interspersed with one another. The results suggested that item order had a relatively small impact on item endorsement, response patterns, and reliabilities. The small impact was such that item order appears unlikely to influence clinical decisions related to these measures. These findings not only have implications for these and other similar measures, but further inform a long-standing debate about whether item grouping is a substantial concern in measurement.

  11. Net and Global Differential Item Functioning in PISA Polytomously Scored Science Items: Application of the Differential Step Functioning Framework

    Science.gov (United States)

    Akour, Mutasem; Sabah, Saed; Hammouri, Hind

    2015-01-01

    The purpose of this study was to apply two types of Differential Item Functioning (DIF), net and global DIF, as well as the framework of Differential Step Functioning (DSF) to real testing data to investigate measurement invariance related to test language. Data from the Program for International Student Assessment (PISA)-2006 polytomously scored…

  12. Effects of statistical models and items difficulties on making trait-level inferences: A simulation study

    Directory of Open Access Journals (Sweden)

    Nelson Hauck Filho

    2014-12-01

    Researchers dealing with the task of estimating locations of individuals on continuous latent variables may rely on several statistical models described in the literature. However, weighing the costs and benefits of using one specific model over alternative models depends on empirical information that is not always clearly available. Therefore, the aim of this simulation study was to compare the performance of seven popular statistical models in providing adequate latent trait estimates in conditions of item difficulties targeted at the sample mean or at the tails of the latent trait distribution. Results suggested an overall tendency of models to provide more accurate estimates of true latent scores when using items targeted at the sample mean of the latent trait distribution. Rating Scale Model, Graded Response Model, and Weighted Least Squares Mean- and Variance-adjusted Confirmatory Factor Analysis yielded the most reliable latent trait estimates, even when applied to inadequate items for the sample distribution of the latent variable. These findings have important implications concerning some popular methodological practices in Psychology and related areas.

  13. Applying the Reader-Response Theory to Literary Texts in EFL-Pre-Service Teachers' Initial Education

    Science.gov (United States)

    Garzón, Eliana; Castañeda-Peña, Harold

    2015-01-01

    This article presents the pedagogical implementation of the reader-response theory in a class of English as a foreign language with language pre-service teachers as they experience the reading of two short stories. The research took place over a 16 week period in which students kept a portfolio of their written responses to the stories.…

  14. Monte Carlo simulation of the response functions of CdTe detectors to be applied in X-rays spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Tomal, A. [Universidade Federale de Goias, Instituto de Fisica, Campus Samambaia, 74001-970, Goiania, (Brazil); Lopez G, A. H.; Santos, J. C.; Costa, P. R., E-mail: alessandra_tomal@yahoo.com.br [Universidade de Sao Paulo, Instituto de Fisica, Rua du Matao Travessa R. 187, Cidade Universitaria, 05508-090 Sao Paulo (Brazil)

    2014-08-15

    In this work, the energy response functions of a CdTe detector were obtained by Monte Carlo simulation in the energy range from 5 to 150 keV, using the Penelope code. The simulated response functions included the finite detector resolution and the carrier transport. The simulated energy response matrix was validated through comparison with experimental results obtained for radioactive sources. In order to investigate the influence of the correction by the detector response in the diagnostic energy range, x-ray spectra were measured using a CdTe detector (model XR-100T, Amptek), and then corrected by the energy response of the detector using the stripping procedure. Results showed that the CdTe detector exhibits good energy response at low energies (below 40 keV), showing only small distortions on the measured spectra. For energies below about 70 keV, the escape of Cd- and Te-K x-rays produces significant distortions on the measured x-ray spectra. For higher energies, the most important corrections are the detector efficiency and the carrier trapping effects. The results showed that, after correction by the energy response, the measured spectra are in good agreement with those provided by different models from the literature. Finally, our results showed that detailed knowledge of the response function and a proper correction procedure are fundamental for achieving more accurate spectra from which several quality parameters (i.e. half-value layer, effective energy and mean energy) can be determined. (Author)

  15. Optimal item pool design for computerized adaptive tests with polytomous items using GPCM

    Directory of Open Access Journals (Sweden)

    Xuechun Zhou

    2014-09-01

    Computerized adaptive testing (CAT) is a testing procedure with advantages in improving measurement precision and increasing test efficiency. An item pool with optimal characteristics is the foundation for a CAT program to achieve those desirable psychometric features. This study proposed a method to design an optimal item pool for tests with polytomous items using the generalized partial credit model (GPCM). It extended a method for approximating optimality with polytomous items being described succinctly for the purpose of pool design. Optimal item pools were generated using CAT simulations with and without practical constraints of content balancing and item exposure control. The performances of the item pools were evaluated against an operational item pool. The results indicated that the item pools designed with stratification based on discrimination parameters performed well with an efficient use of the less discriminative items within the target accuracy levels. The implications for developing item pools are also discussed.

  16. Real and Artificial Differential Item Functioning

    Science.gov (United States)

    Andrich, David; Hagquist, Curt

    2012-01-01

    The literature in modern test theory on procedures for identifying items with differential item functioning (DIF) among two groups of persons includes the Mantel-Haenszel (MH) procedure. Generally, it is not recognized explicitly that if there is real DIF in some items which favor one group, then as an artifact of this procedure, artificial DIF…

  17. Matrix Sampling of Test Items. ERIC Digest.

    Science.gov (United States)

    Childs, Ruth A.; Jaciw, Andrew P.

    This Digest describes matrix sampling of test items as an approach to achieving broad coverage while minimizing testing time per student. Matrix sampling involves developing a complete set of items judged to cover the curriculum, then dividing the items into subsets and administering one subset to each student. Matrix sampling, by limiting the…
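
    The procedure described here (build the full item set, divide it into subsets, give each student one subset) can be sketched as follows. The round-robin split and rotation are assumptions made for illustration; the Digest text shown does not specify how items or students are allocated, and the function names are hypothetical.

```python
def make_forms(items, n_forms):
    """Split the full item list into n_forms non-overlapping subsets.

    Items are dealt out round-robin, so subset sizes differ by at most one.
    """
    if n_forms < 1:
        raise ValueError("n_forms must be at least 1")
    return [items[i::n_forms] for i in range(n_forms)]


def assign_forms(students, forms):
    """Give each student exactly one subset, rotating through the forms."""
    return {student: forms[i % len(forms)] for i, student in enumerate(students)}


# Hypothetical pool of 10 items split into 3 forms for 4 students.
forms = make_forms(list(range(1, 11)), 3)
assignment = assign_forms(["s1", "s2", "s3", "s4"], forms)
for student, form in assignment.items():
    print(student, form)
```

    Each student answers only about a third of the pool, while the forms together still cover every item.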

  18. Processing Polarity Items: Contrastive Licensing Costs

    Science.gov (United States)

    Saddy, Douglas; Drenhaus, Heiner; Frisch, Stefan

    2004-01-01

    We describe an experiment that investigated the failure to license polarity items in German using event-related brain potentials (ERPs). The results reveal distinct processing reflexes associated with failure to license positive polarity items in comparison to failure to license negative polarity items. Failure to license both negative and…

  19. Expected linking error resulting from item parameter drift among the common Items on Rasch calibrated tests.

    Science.gov (United States)

    Miller, G Edward; Gesn, Paul Randall; Rotou, Jourania

    2005-01-01

    In state assessment programs that employ Rasch-based common item linking procedures, the linking constant is usually estimated with only those common items not identified as exhibiting item difficulty parameter drift. Since state assessments typically contain a fixed number of items, an item classified as exhibiting parameter drift during the linking process remains on the exam as a scorable item even if it is removed from the common item set. Under the assumption that item parameter drift has occurred for one or more of the common items, the expected effect of including or excluding the "affected" item(s) in the estimation of the linking constant is derived in this article. If the item parameter drift is due solely to factors not associated with a change in examinee achievement, no linking error will (be expected to) occur given that the linking constant is estimated only with the items not identified as "affected"; linking error will (be expected to) occur if the linking constant is estimated with all common items. However, if the item parameter drift is due solely to change in examinee achievement, the opposite is true: no linking error will (be expected to) occur if the linking constant is estimated with all common items; linking error will (be expected to) occur if the linking constant is estimated only with the items not identified as "affected".
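
    The effect of keeping or dropping a drifting item can be seen in a small sketch. It assumes the linking constant is the mean difference between old and new Rasch difficulty estimates of the common items, which is a common choice but is not stated in the abstract; the function name and the difficulty values are hypothetical.

```python
def linking_constant(old_b, new_b, exclude=()):
    """Mean shift between two Rasch calibrations over the common items.

    old_b, new_b: dicts mapping item id to difficulty estimate.
    exclude: item ids flagged as drifting and left out of the estimate.
    """
    keep = [i for i in old_b if i in new_b and i not in exclude]
    if not keep:
        raise ValueError("no common items left to estimate the constant")
    return sum(old_b[i] - new_b[i] for i in keep) / len(keep)


# Hypothetical difficulties: item D has drifted, the others shift by 0.2.
old_b = {"A": -1.0, "B": 0.0, "C": 1.0, "D": 0.5}
new_b = {"A": -1.2, "B": -0.2, "C": 0.8, "D": 0.9}

print(linking_constant(old_b, new_b))                 # all common items
print(linking_constant(old_b, new_b, exclude={"D"}))  # drifting item removed
```

    The two calls give different constants, and which one is free of linking error depends, as the abstract explains, on whether the drift reflects a change in examinee achievement.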

  20. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

    Science.gov (United States)

    Öztürk-Gübes, Nese; Kelecioglu, Hülya

    2016-01-01

    The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

  1. 48 CFR 52.223-9 - Estimate of Percentage of Recovered Material Content for EPA-Designated Items.

    Science.gov (United States)

    2010-10-01

    ... recovered material content for EPA-designated item(s) delivered and/or used in contract performance... responsible for the performance of this contract and hereby certify that the percentage of recovered material content for EPA-designated items met the applicable contract specifications or other...

  2. The Application of Strength of Association Statistics to the Item Analysis of an In-Training Examination in Diagnostic Radiology.

    Science.gov (United States)

    Diamond, James J.; McCormick, Janet

    1986-01-01

    Using item responses from an in-training examination in diagnostic radiology, the application of a strength of association statistic to the general problem of item analysis is illustrated. Criteria for item selection, general issues of reliability, and error of measurement are discussed. (Author/LMO)

  3. Calibration and validation of the Dutch-Flemish PROMIS pain interference item bank in patients with chronic pain

    NARCIS (Netherlands)

    Crins, M.H.P.; Roorda, L.D.; Smits, N.; de Vet, H.C.W.; Westenhovens, R.; Cella, D.; Cook, K.F.; Revicki, D.; van Leeuwen, J.; Boers, M.; Dekker, J.; Terwee, C.B.

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translation

  4. Design Patterns for Digital Item Types in Higher Education

    Science.gov (United States)

    Draaijer, S.; Hartog, R. J. M.

    2007-01-01

    A set of design patterns for digital item types has been developed in response to challenges identified in various projects by teachers in higher education. The goal of the projects in question was to design and develop formative and summative tests, and to develop interactive learning material in the form of quizzes. The subject domains involved…

  5. An Investigation of Item Fit Statistics for Mixed IRT Models

    Science.gov (United States)

    Chon, Kyong Hee

    2009-01-01

    The purpose of this study was to investigate procedures for assessing model fit of IRT models for mixed format data. In this study, various IRT model combinations were fitted to data containing both dichotomous and polytomous item responses, and the suitability of the chosen model mixtures was evaluated based on a number of model fit procedures.…

  6. Modeling nonignorable missing data processes in item calibration

    NARCIS (Netherlands)

    Glas, Cees A.W.; Pimentel, Jonald L.

    2006-01-01

    In this report, it is shown that the problem of nonignorable missing data in the calibration phase for computerized adaptive testing can be handled by introducing an item response theory (IRT) model for the missing data indicator. In the first simulation study, it is shown that treating data with no

  7. IRT-Estimated Reliability for Tests Containing Mixed Item Formats

    Science.gov (United States)

    Shu, Lianghua; Schwarz, Richard D.

    2014-01-01

    As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
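
    For orientation, the classical sample version of the first coefficient can be computed directly from an item score matrix. This sketch shows only the conventional Cronbach's alpha, not the IRT-estimated version derived in the article; the function name and the score matrix are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Classical Cronbach's alpha.

    scores: list of rows, one row of item scores per examinee.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of four examinees to three dichotomous items.
scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
]
print(cronbach_alpha(scores))
```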

  8. A Comparison of Item Fit Statistics for Mixed IRT Models

    Science.gov (United States)

    Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B.

    2010-01-01

    In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…

  9. A new method for synthesizing radiation dose-response data from multiple trials applied to prostate cancer

    DEFF Research Database (Denmark)

    Diez, Patricia; Vogelius, Ivan S; Bentzen, Søren M

    2010-01-01

    A new method is presented for synthesizing dose-response data for biochemical control of prostate cancer according to study design (randomized vs. nonrandomized) and risk group (low vs. intermediate-high)....

  10. Monte Carlo simulation of the response functions of CdTe detectors to be applied in x-ray spectroscopy.

    Science.gov (United States)

    Tomal, A; Santos, J C; Costa, P R; Lopez Gonzales, A H; Poletti, M E

    2015-06-01

    In this work, the energy response functions of a CdTe detector were obtained by Monte Carlo (MC) simulation in the energy range from 5 to 160 keV, using the PENELOPE code. In the response calculations the carrier transport features and the detector resolution were included. The computed energy response function was validated through comparison with experimental results obtained with ²⁴¹Am and ¹⁵²Eu sources. In order to investigate the influence of the correction by the detector response in the diagnostic energy range, x-ray spectra were measured using a CdTe detector (model XR-100T, Amptek), and then corrected by the energy response of the detector using the stripping procedure. Results showed that the CdTe detector exhibits good energy response at low energies (below 40 keV), showing only small distortions on the measured spectra. For energies below about 80 keV, the escape of Cd- and Te-K x-rays produces significant distortions on the measured x-ray spectra. For higher energies, the most important corrections are the detector efficiency and the carrier trapping effects. The results showed that, after correction by the energy response, the measured spectra are in good agreement with those provided by a theoretical model from the literature. Finally, our results showed that detailed knowledge of the response function and a proper correction procedure are fundamental for achieving more accurate spectra from which quality parameters (i.e., half-value layer and homogeneity coefficient) can be determined.

  11. The Prediction of Item Parameters Based on Classical Test Theory and Latent Trait Theory

    Science.gov (United States)

    Anil, Duygu

    2008-01-01

    In this study, the predictive power of experts' predictions of item characteristics, for conditions in which try-out practices cannot be applied, was examined for item characteristics computed according to classical test theory and the two-parameter logistic model of latent trait theory. The study was carried out on 9914 randomly selected students…

  12. Optimal stratification of item pools in α-stratified computerized adaptive testing

    NARCIS (Netherlands)

    Chang, Hua-Hua; Linden, van der Wim J.

    2003-01-01

    A method based on 0-1 linear programming (LP) is presented to stratify an item pool optimally for use in α-stratified adaptive testing. Because the 0-1 LP model belongs to the subclass of models with a network flow structure, efficient solutions are possible. The method is applied to a previous item

  13. A Method on the Item Investment Risk Interval Decision-making of Processing Ranking Style

    Institute of Scientific and Technical Information of China (English)

    CHEN Li-wen

    2002-01-01

    In this paper, on the basis of the defects of risk-type and uncertainty-type decisions, the concept of the probability scheduling type of item investment decision is given, and a linear programming model and its solution are worked out. The feasibility of a probability scheduling type item investment plan is studied by applying the properties of interval arithmetic.

  14. Applications of NLP Techniques to Computer-Assisted Authoring of Test Items for Elementary Chinese

    Science.gov (United States)

    Liu, Chao-Lin; Lin, Jen-Hsiang; Wang, Yu-Chun

    2010-01-01

    The authors report an implemented environment for computer-assisted authoring of test items and provide a brief discussion about the applications of NLP techniques for computer assisted language learning. Test items can serve as a tool for language learners to examine their competence in the target language. The authors apply techniques for…

  15. Approximation Preserving Reductions among Item Pricing Problems

    Science.gov (United States)

    Hamane, Ryoso; Itoh, Toshiya; Tomita, Kouhei

    When a store sells items to customers, the store wishes to determine the prices of the items to maximize its profit. Intuitively, if the store sells the items with low (resp. high) prices, the customers buy more (resp. less) items, which provides less profit to the store. So it would be hard for the store to decide the prices of items. Assume that the store has a set V of n items and there is a set E of m customers who wish to buy those items, and also assume that each item i ∈ V has the production cost d_i and each customer e_j ∈ E has the valuation v_j on the bundle e_j ⊆ V of items. When the store sells an item i ∈ V at the price r_i, the profit for the item i is p_i = r_i - d_i. The goal of the store is to decide the price of each item to maximize its total profit. We refer to this maximization problem as the item pricing problem. In most of the previous works, the item pricing problem was considered under the assumption that p_i ≥ 0 for each i ∈ V; however, Balcan, et al. [In Proc. of WINE, LNCS 4858, 2007] introduced the notion of “loss-leader,” and showed that the seller can get more total profit in the case that p_i < 0 is allowed than in the case that p_i < 0 is not allowed. In this paper, we derive approximation preserving reductions among several item pricing problems and show that all of them have algorithms with good approximation ratio.
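
    The objective and the loss-leader effect can be illustrated with a small sketch. It assumes single-minded customers who buy their whole bundle exactly when its total price does not exceed their valuation; the abstract does not spell out this purchase rule, so treat it as an assumption. The function name, items, costs and valuations are all hypothetical.

```python
def total_profit(prices, costs, customers):
    """Total profit for given item prices.

    prices, costs: dicts mapping item id to price r_i and production cost d_i.
    customers: list of (bundle, valuation) pairs; a customer buys the whole
    bundle only if the sum of its prices is at most the valuation (assumed).
    """
    profit = 0.0
    for bundle, valuation in customers:
        if sum(prices[i] for i in bundle) <= valuation:
            profit += sum(prices[i] - costs[i] for i in bundle)
    return profit


costs = {"A": 0.0, "B": 2.0}
customers = [(("A",), 10.0), (("A", "B"), 10.0)]

# Item B sold below cost (a loss leader, margin -2).
print(total_profit({"A": 10.0, "B": 0.0}, costs, customers))
# A pricing in which every item has a non-negative margin.
print(total_profit({"A": 8.0, "B": 2.0}, costs, customers))
```

    In this toy instance the loss-leader pricing earns more than the pricing with non-negative margins, because giving item B away lets the store charge the full valuation for item A to both customers.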

  16. Emergency Power For Critical Items

    Science.gov (United States)

    Young, William R.

    2009-07-01

    Natural disasters, such as hurricanes, floods, tornados, and tsunami, are becoming a greater problem as climate change impacts our environment. Disasters, whether natural or man-made, destroy lives, homes, businesses and the natural environment. Such disasters can happen with little or no warning, leaving hundreds or even thousands of people without medical services, potable water, sanitation, communications and electrical services for up to several weeks. In our modern world, electricity has become a necessity. Modern building codes and new disaster-resistant building practices are reducing the damage to homes and businesses. Emergency gasoline and diesel generators are becoming commonplace for power outages. Generators need fuel, which may not be available after a disaster, but photovoltaic (solar-electric) systems supply electricity without petroleum fuel as they are powered by the sun. Photovoltaic (PV) systems can provide electrical power for a home or business. PV systems can operate as utility interactive or stand-alone with battery backup. Determining your critical load items and sizing the photovoltaic system for those critical items guarantees their operation in a disaster.

  17. Response model parameter linking

    NARCIS (Netherlands)

    Barrett, Michelle Derbenwick

    2015-01-01

    With a few exceptions, the problem of linking item response model parameters from different item calibrations has been conceptualized as an instance of the problem of equating observed scores on different test forms. This thesis argues, however, that the use of item response models does not require

  18. An iterative method applied to optimize the design of PIN photodiodes for enhanced radiation tolerance and maximum light response

    Energy Technology Data Exchange (ETDEWEB)

    Cedola, A.P., E-mail: ariel.cedola@ing.unlp.edu.a [Grupo de Estudio de Materiales y Dispositivos Electronicos (GEMyDE), Dpto. Electrotecnia, Facultad de Ingenieria, Universidad Nacional de La Plata, 48 y 116, C.C. 91, La Plata 1900, Buenos Aires (Argentina); Cappelletti, M.A. [Grupo de Estudio de Materiales y Dispositivos Electronicos (GEMyDE), Dpto. Electrotecnia, Facultad de Ingenieria, Universidad Nacional de La Plata, 48 y 116, C.C. 91, La Plata 1900, Buenos Aires (Argentina); Casas, G. [Grupo de Estudio de Materiales y Dispositivos Electronicos (GEMyDE), Dpto. Electrotecnia, Facultad de Ingenieria, Universidad Nacional de La Plata, 48 y 116, C.C. 91, La Plata 1900, Buenos Aires (Argentina); Universidad Nacional de Quilmes, Roque Saenz Pena 352, Bernal 1876, Buenos Aires (Argentina); Peltzer y Blanca, E.L. [Grupo de Estudio de Materiales y Dispositivos Electronicos (GEMyDE), Dpto. Electrotecnia, Facultad de Ingenieria, Universidad Nacional de La Plata, 48 y 116, C.C. 91, La Plata 1900, Buenos Aires (Argentina); Instituto de Fisica de Liquidos y Sistemas Biologicos (IFLYSIB), CONICET - UNLP - CIC, La Plata 1900, Buenos Aires (Argentina)

    2011-02-11

    An iterative method based on numerical simulations was developed to enhance the proton radiation tolerance and the responsivity of Si PIN photodiodes. The method allows calculation of the optimal values of the intrinsic layer thickness and the incident light wavelength as a function of the light intensity and the maximum proton fluence to be supported by the device. These results minimize the effects of radiation on the total reverse current of the photodiode and maximize its response to light. The implementation of the method is useful in the design of devices whose operation point should not suffer variations due to radiation.

  19. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey

    Directory of Open Access Journals (Sweden)

    Tsair-Wei Chien

    2017-01-01

    Background: Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses item response theory (IRT) probabilistic modeling to produce graphical presentations for interpreting CIR results. A computer module that is programmed to deal with CIRs is required. The aims were to present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data to show how CIR can be applied in healthcare settings, using a safety attitude survey as an example. Methods: Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model’s expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. Results: The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by the professional Rasch software Winsteps. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and shows the CIR advantage over classical test theory. Conclusions: Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study CIR module to deal with continuous variables to benefit comparisons of data with a logistic distribution and model fit statistics.

  20. A Comparison of Sales Response Predictions From Demand Models Applied to Store-Level versus Panel Data

    NARCIS (Netherlands)

    Andrews, Rick L.; Currim, Imran S.; Leeflang, Peter S. H.

    2011-01-01

    In order to generate sales promotion response predictions, marketing analysts estimate demand models using either disaggregated (consumer-level) or aggregated (store-level) scanner data. Comparison of predictions from these demand models is complicated by the fact that models may accommodate differe

  1. Social Responsibility in Research Practice: Engaging applied scientists with the socio-ethical context of their work

    NARCIS (Netherlands)

    Schuurbiers, D.

    2010-01-01

    How to encourage researchers to critically reflect on the ethical and social dimensions of their work? That is the central research question of this thesis. It starts from the assumption that the neutrality view of the social responsibility of the researcher – the view that researchers have no busin

  2. Applying dynamic parameters to predict hemodynamic response to volume expansion in spontaneously breathing patients with septic shock.

    Science.gov (United States)

    Lanspa, Michael J; Grissom, Colin K; Hirshberg, Eliotte L; Jones, Jason P; Brown, Samuel M

    2013-02-01

    Volume expansion is a mainstay of therapy in septic shock, although its effect is difficult to predict using conventional measurements. Dynamic parameters, which vary with respiratory changes, appear to predict hemodynamic response to fluid challenge in mechanically ventilated, paralyzed patients. Whether they predict response in patients who are free from mechanical ventilation is unknown. We hypothesized that dynamic parameters would be predictive in patients not receiving mechanical ventilation. This is a prospective, observational, pilot study. Patients with early septic shock who were not receiving mechanical ventilation received 10-mL/kg volume expansion (VE) at their treating physician's discretion after initial resuscitation in the emergency department. We used transthoracic echocardiography to measure vena cava collapsibility index and aortic velocity variation before VE. We used a pulse contour analysis device to measure stroke volume variation (SVV). Cardiac index was measured immediately before and after VE using transthoracic echocardiography. Hemodynamic response was defined as an increase in cardiac index of 15% or greater. Fourteen patients received VE, five of whom demonstrated a hemodynamic response. Vena cava collapsibility index and SVV were predictive (area under the curve = 0.83 and 0.92, respectively). Optimal thresholds were calculated: vena cava collapsibility index, 15% or greater (positive predictive value, 62%; negative predictive value, 100%; P = 0.03); SVV, 17% or greater (positive predictive value, 100%; negative predictive value, 82%; P = 0.03). Aortic velocity variation was not predictive. Vena cava collapsibility index and SVV predict hemodynamic response to fluid challenge in patients with septic shock who are not mechanically ventilated. Optimal thresholds differ from those described in mechanically ventilated patients.
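    The predictive values quoted in this abstract follow from simple ratios of a 2x2 table. The sketch below shows the arithmetic only. The cell counts are not reported in the abstract; they are hypothetical values back-calculated on the assumption that all 14 patients (5 responders, 9 non-responders) had both measurements, and the function name is illustrative.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from the four cells of a 2x2 diagnostic table."""
    ppv = tp / (tp + fp)  # share of positive tests that are true responders
    npv = tn / (tn + fn)  # share of negative tests that are true non-responders
    return ppv, npv


# Hypothetical counts (assumption, not reported in the abstract), chosen to be
# consistent with 14 patients, 5 responders, and the quoted percentages.
vcci_ppv, vcci_npv = predictive_values(tp=5, fp=3, tn=6, fn=0)  # collapsibility index >= 15%
svv_ppv, svv_npv = predictive_values(tp=3, fp=0, tn=9, fn=2)    # SVV >= 17%

print(f"VCCI: PPV={vcci_ppv:.3f}, NPV={vcci_npv:.3f}")
print(f"SVV:  PPV={svv_ppv:.3f}, NPV={svv_npv:.3f}")
```

    With so few patients, a single reclassified case shifts these ratios by several percentage points, which fits the authors' description of the work as a pilot study.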

  3. ‘Forget me (not)?’ – Remembering forget-items versus un-cued items in directed forgetting

    Directory of Open Access Journals (Sweden)

    Bastian Zwissler

    2015-11-01

    Humans need to be able to selectively control their memories. Here, we investigate the underlying processes in item-method directed forgetting and compare the classic active memory cues in this paradigm with a passive instruction. Typically, individual items are presented and each is followed by either a forget- or remember-instruction. On a surprise test of all items, memory is then worse for to-be-forgotten items (TBF) compared to to-be-remembered items (TBR). This is thought to result from selective rehearsal of TBR, or from active inhibition of TBF, or from both. However, evidence suggests that if a forget instruction initiates active processing, paradoxical effects may also arise. To investigate the underlying mechanisms, four experiments were conducted where un-cued items (UI) were introduced and recognition performance was compared between TBR, TBF and UI stimuli. Accuracy was encouraged via a performance-dependent monetary bonus. Across all experiments, including perceptually fully matched variants, memory accuracy for TBF was reduced compared to TBR, but better than for UI. Moreover, participants used a more conservative response criterion when responding to TBF stimuli. Thus, ironically, the F cue results in active processing, but this does not have inhibitory effects that would impair recognition memory beyond an un-cued baseline condition. This casts doubt on inhibitory accounts of item-method directed forgetting and is also difficult to reconcile with pure selective rehearsal of TBR. While the F cue does induce active processing, this does not result in particularly successful forgetting. The pattern seems most consistent with the notion of ironic processing.

  4. Parallel Matrix Factorization for Binary Response

    CERN Document Server

    Khanna, Rajiv; Agarwal, Deepak; Chen, Beechung

    2012-01-01

    Predicting user affinity to items is an important problem in applications like content optimization, computational advertising, and many more. While bilinear random effect models (matrix factorization) provide state-of-the-art performance when minimizing RMSE through a Gaussian response model on explicit ratings data, applying them to imbalanced binary response data presents additional challenges that we carefully study in this paper. Data in many applications usually consist of users' implicit responses that are often binary -- clicking an item or not; the goal is to predict click rates, which are often combined with other measures to calculate utilities to rank items at runtime in recommender systems. Because of their implicit nature, such data are usually much larger than explicit rating data and often have an imbalanced distribution with a small fraction of click events, making accurate click rate prediction difficult. In this paper, we address two problems. First, we show previous techniques to estimate bi...

  5. Understanding emotional responses to breast/ovarian cancer genetic risk assessment: an applied test of a cognitive theory of emotion.

    Science.gov (United States)

    Phelps, Ceri; Bennett, Paul; Brain, Kate

    2008-10-01

    This study explored whether Smith and Lazarus' (1990, 1993) cognitive theory of emotion could predict emotional responses to an emotionally ambiguous real-life situation. Questionnaire data were collected from 145 women upon referral for cancer genetic risk assessment. These indicated a mixed emotional reaction of both positive and negative emotions to the assessment. Hierarchical regression analyses revealed that the hypothesised models explained between 20% and 33% of the variance of anxiety, hope and gratitude scores, but only 10% of the variance for challenge scores. For the previously unmodelled emotion of relief, 31% of the variance was explained by appraisals and core relational themes. The findings help explain why emotional responses to cancer genetic risk assessment vary and suggest that improving the accuracy of individuals' beliefs and expectations about the assessment process may help subsequent adaptation to risk information.

  6. Bounds on Quantiles in the Presence of Full and Partial Item Nonresponse

    NARCIS (Netherlands)

    Vazquez-Alvarez, R.; Melenberg, B.; van Soest, A.H.O.

    1999-01-01

    Microeconomic surveys are usually subject to the problem of item nonresponse, typically associated with variables like income and wealth, where confidentiality and/or lack of accurate information can affect the response behavior of the individual. Follow up categorical questions can reduce item nonr

  7. Diagnosing item score patterns using IRT based person-fit statistics

    NARCIS (Netherlands)

    Meijer, Rob R.

    2001-01-01

    Person-fit statistics have been proposed to investigate the fit of an item score pattern to an item response theory (IRT) model. This study investigated how these statistics can be used to detect different types of misfit. Intelligence test data for 992 people at or beyond college level were analyze

  8. A review of methods for evaluating the fit of item score patterns on a test

    NARCIS (Netherlands)

    Meijer, Rob R.; Sijtsma, Klaas

    1999-01-01

    Methods are discussed that can be used to investigate the fit of an item score pattern to a test model. Model-based tests and personality inventories are administered to more than 100 million people a year and, as a result, individual fit is of great concern. Item Response Theory (IRT) modeling and

  9. IRT Item Parameter Recovery with Marginal Maximum Likelihood Estimation Using Loglinear Smoothing Models

    Science.gov (United States)

    Casabianca, Jodi M.; Lewis, Charles

    2015-01-01

    Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…

  10. The Impact of Fallible Item Parameter Estimates on Latent Trait Recovery

    Science.gov (United States)

    Cheng, Ying; Yuan, Ke-Hai

    2010-01-01

    In this paper we propose an upward correction to the standard error (SE) estimation of θ_ML, the maximum likelihood (ML) estimate of the latent trait in item response theory (IRT). More specifically, the upward correction is provided for the SE of θ_ML when item parameter estimates obtained from an independent pretest…

  11. The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning

    Science.gov (United States)

    Finch, W. Holmes

    2011-01-01

    Missing information is a ubiquitous aspect of data analysis, including responses to items on cognitive and affective instruments. Although the broader statistical literature describes missing data methods, relatively little work has focused on this issue in the context of differential item functioning (DIF) detection. Such prior research has…

  12. Using Necessary Information to Identify Item Dependence in Passage-Based Reading Comprehension Tests

    Science.gov (United States)

    Baldonado, Angela Argo; Svetina, Dubravka; Gorin, Joanna

    2015-01-01

    Applications of traditional unidimensional item response theory models to passage-based reading comprehension assessment data have been criticized based on potential violations of local independence. However, simple rules for determining dependency, such as including all items associated with a particular passage, may overestimate the dependency…

  13. Impact of Local Item Dependence on True-Score Equating. LSAC Research Report Series.

    Science.gov (United States)

    Reese, Lynda M.; Pashley, Peter J.

    This study investigated the practical effects of local item dependence (LID) on item response theory (IRT) true-score equating. A scenario was defined that emulated the Law School Admission Test (LSAT) preequating model, and data were generated to assess the impact of different degrees of LID on final equating outcomes. An extreme amount of LID…

  14. 49 CFR 375.207 - What items must be in my advertisements?

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 5 2010-10-01 2010-10-01 false What items must be in my advertisements? 375.207... Services to My Customers General Responsibilities § 375.207 What items must be in my advertisements? (a) You and your agents must publish and use only truthful, straightforward, and honest advertisements....

  15. Studying Differential Item Functioning via Latent Variable Modeling: A Note on a Multiple-Testing Procedure

    Science.gov (United States)

    Raykov, Tenko; Marcoulides, George A.; Lee, Chun-Lung; Chang, Chi

    2013-01-01

    This note is concerned with a latent variable modeling approach for the study of differential item functioning in a multigroup setting. A multiple-testing procedure that can be used to evaluate group differences in response probabilities on individual items is discussed. The method is readily employed when the aim is also to locate possible…

  16. Parent Ratings of ADHD Symptoms: Generalized Partial Credit Model Analysis of Differential Item Functioning across Gender

    Science.gov (United States)

    Gomez, Rapson

    2012-01-01

    Objective: Generalized partial credit model, which is based on item response theory (IRT), was used to test differential item functioning (DIF) for the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.), inattention (IA), and hyperactivity/impulsivity (HI) symptoms across boys and girls. Method: To accomplish this, parents completed…

  17. Applying Disruptive Preference Test Protocols to Increase the Number of "No Preference" Responses in the Placebo Pair, Using Chinese Consumers.

    Science.gov (United States)

    Xia, Yixun; Zhong, Fang; O'Mahony, Michael

    2016-09-01

    One form of paired preference test protocol requires consumers to assess 2 pairs of products. One is the target pair under consideration, while the other is a putatively identical pair named the "placebo pair" which is also presented as a control. Counterintuitively, the majority of consumers report preferences when presented with the placebo pair. Their response frequencies are hypothesized to be those of consumers having "no preference" and are compared with the response frequencies elicited by a target pair, to determine whether the target pair elicits significant preferences. The primary goal of this paper was to study the robustness of 2 new so-called disruptive protocols that reduced the proportion of consumers who reported preferences when assessing a putatively identical pair of products. For this task, the tests were performed in a different language, in a different country, using different products from before. The results showed that the proportion of consumers reporting preferences for the placebo pair was reduced, confirming earlier work. Also, comparison of d' values showed a lack of significant overall differences between the placebo and target pairs, while chi-squared analyses indicated significant differences in the response frequencies. This indicated that the sample was segmented into 2 balanced groups with opposing preferences.

  18. 49 CFR 175.8 - Exceptions for operator equipment and items of replacement.

    Science.gov (United States)

    2010-10-01

    ... HAZARDOUS MATERIALS SAFETY ADMINISTRATION, DEPARTMENT OF TRANSPORTATION HAZARDOUS MATERIALS REGULATIONS... items of replacement. (a) Operator equipment. This subchapter does not apply to— (1) Aviation fuel and...) Hazardous materials required aboard an aircraft in accordance with the applicable airworthiness...

  19. Detecting Differential Item Functioning and Differential Test Functioning on Math School Final-exam

    OpenAIRE

    - Mansyur; - Muliana

    2016-01-01

    This study aims to identify the characteristics of Differential Item Functioning (DIF) and Differential Test Functioning (DTF) on a school final exam in mathematics, based on Item Response Theory (IRT). The subjects of this study were the exam questions and the students’ answer sheets, chosen by convenience sampling, which yielded 286 responses consisting of 147 male and 149 female students’ responses. The data were collected using a documentation technique by quoting the resp...

  20. Application of Rough Set Theory in Item Cognitive Attribute Identification

    Institute of Scientific and Technical Information of China (English)

    唐小娟; 丁树良; 俞宗火

    2015-01-01

    parameters are unknown, examinees are few, and feedback is timely. In the current studies, we apply a new method, Rough Set Theory (RST), to ICAI. RST can resolve the uncertainty in CD caused by the size of knowledge granularity. It does not require any prior knowledge. Through knowledge reduction, RST induces decision or classification rules and then classifies the object. First, we verify the application of RST in ICAI. Then, in Study One, we explore how the match ratio of subjects' knowledge states and the slippage in subjects' responses to items affect the match ratio of item attributes. The number of item attributes is a variable that affects the accuracy of CD, so we also examine how the number of cognitive attributes affects the match ratio of item attributes. The results show that: (1) In the absence of item parameters, the rough set approach to ICAI has fast diagnostic speed and good results even when the sample size is small, so RST can be applied to classroom assessment. (2) The lower the examinees’ PMR, the lower the PMR of item attribute identification; and the higher the slippage in examinees’ responses, the lower the PMR of item attribute identification. (3) The greater the number of item attributes, the lower the PMR of item attribute identification. (4) Both results are estimated by rough set software, and regardless of sample size and item number, estimation is very fast (about 10 seconds). This shows the advantage of RST in ICAI. Cognitive diagnosis is the core of the new generation of measurement theory and is of great significance for formative instructional assessment. Item cognitive attribute identification is a basic and important task in cognitive diagnosis, yet little research exists on methods that assist this identification, and the existing methods have many limitations in application. Classroom assessment is an ideal setting for applying cognitive diagnosis, but the selection of items in classroom assessment is arbitrary, and teachers find it difficult to identify item cognitive attributes accurately in a short time. This study is the first to propose using the rough set method to identify item cognitive attributes...

  1. Segmenting and targeting American university students to promote responsible alcohol use: a case for applying social marketing principles.

    Science.gov (United States)

    Deshpande, Sameer; Rundle-Thiele, Sharyn

    2011-10-01

    The current study contributes to the social marketing literature in the American university binge-drinking context in three innovative ways. First, it profiles drinking segments by "values" and "expectancies" sought from behaviors. Second, the study compares segment values and expectancies of two competing behaviors, that is, binge drinking and participation in alternative activities. Third, the study compares the influence of a variety of factors on both behaviors in each segment. Finally, based on these findings and feedback from eight university alcohol prevention experts, appropriate strategies to promote responsible alcohol use for each segment are proposed.

  2. Response of a particle in a one-dimensional lattice to an applied force: Dynamics of the effective mass

    CERN Document Server

    Duque-Gomez, Federico

    2012-01-01

    We study the behaviour of the expectation value of the acceleration of a particle in a one-dimensional periodic potential when an external homogeneous force is suddenly applied. The theory is formulated in terms of modified Bloch states that include the interband mixing induced by the force. This approach allows us to understand the behaviour of the wavepacket, which responds with a mass that is initially the bare mass, and subsequently oscillates around the value predicted by the effective mass. If Zener tunneling can be neglected, the expression obtained for the acceleration of the particle is valid over timescales of the order of a Bloch oscillation, which are of interest for experiments with cold atoms in optical lattices. We discuss how these oscillations can be tuned in an optical lattice for experimental detection.

  3. Small-Item Vapor Test Method, FY11 Release

    Science.gov (United States)

    2012-07-01

    exposed. sessile drop: A liquid droplet that is firmly attached to a surface. If the droplet significantly spreads across the surface, it is better...following information regarding the observations. o Written description of applied drops as they appeared for each sample (e.g., sessile, spread). o... techniques for this method. The airflow and air volume are key variables required to assess risk. Residual agent: Because full-item extraction

  4. 41 CFR 101-30.701-2 - Item standardization code.

    Science.gov (United States)

    2010-07-01

    ... 41 Public Contracts and Property Management 2 2010-07-01 2010-07-01 true Item standardization code....7-Item Reduction Program § 101-30.701-2 Item standardization code. Item standardization code (ISC) means a code assigned an item in the supply system which identifies the item as authorized...

  5. Flower stalk segments of Arabidopsis thaliana ecotype Columbia lack the capacity to grow in response to exogenously applied auxin.

    Science.gov (United States)

    Soga, K; Wakabayashi, K; Hoson, T; Kamisaka, S

    2000-12-01

    Exogenously applied IAA stimulated cell elongation of segments excised from flower stalks of Arabidopsis thaliana ecotype Landsberg erecta (Ler) by increasing the cell wall extensibility, but it did not affect that of ecotype Columbia (Col). Treatment with a low pH buffer solution (pH 4.0) or fusicoccin (FC), a reagent activating H(+)-ATPases, significantly increased the cell wall extensibility and promoted elongation growth of flower stalk segments of both ecotypes, indicating that the flower stalk segments of Col possess the capacity to grow under acidic pH conditions. IAA promoted the proton excretion in segments of Ler but not of Col. On the other hand, FC increased the proton excretion in segments of Col as much as that of Ler. These results suggest that IAA activates the plasma membrane H(+)-ATPases in the segments of Ler but not those of Col, while FC activates them in both ecotypes. Flower stalks of Col may lack the mechanisms of activation by IAA of the plasma membrane H(+)-ATPases.

  6. From concepts to lexical items.

    Science.gov (United States)

    Bierwisch, M; Schreuder, R

    1992-03-01

    In this paper we address the question how in language production conceptual structures are mapped onto lexical items. First we describe the lexical system in a fairly abstract way. Such a system consists of, among other things, a fixed set of basic lexical entries characterized by four groups of information: phonetic form, grammatical features, argument structure, and semantic form. A crucial assumption of the paper is that the meaning in a lexical entry has a complex internal structure composed of more primitive elements (decomposition). Some aspects of argument structure and semantic form and their interaction are discussed with respect to the issue of synonymy. We propose two different mappings involved in lexical access. One maps conceptual structures to semantic forms, and the other maps semantic forms to conceptual structures. Both mappings are context dependent and are many-to-many mappings. We present an elaboration of Levelt's (1989) model in which these processes interact with the grammatical encoder and the mental lexicon. Then we address the consequences of decomposition for processing models, especially the nature of the input of lexical access and the time course. Processing models that use the activation metaphor may have difficulties accounting for certain phenomena where a certain lemma triggers not one, but two or more word forms that have to be produced with other word forms in between.

  7. Postural responses applied in a control model in cochlear implant users with pre-lingual hearing loss.

    Science.gov (United States)

    Suarez, Hamlet; Ferreira, Enrique; Alonso, Rafael; Arocena, Sofia; San Roman, Cecilia; Herrera, Tamara; Lapilover, Valeria

    2016-01-01

    Conclusions: The assessment of postural responses (PR) based on a feedback control system model shows selective gains in different frequency bands that are adaptable with child development. Objective: To characterize PR of pre-lingual cochlear implant users (CIU) in different sensory conditions. Methods: Total energy consumption of the body's center of pressure signal (ECCOP) and its distribution in three frequency bands, band 1 (0-0.1 Hz), band 2 (0.1-0.7 Hz), and band 3 (0.7-20 Hz), were measured in a sample of 18 CIU (8-16 years old) and in a control group (CG) (8-15 years old). They were assessed in a standing position on a force platform in two sensory conditions: (1) eyes open; (2) eyes closed and standing on foam. Results: In condition 1, total ECCOP of PR and its proportion of energy consumption in the three frequency bands were similar between CIU and CG (p > 0.05). In condition 2, CIU had significantly higher ECCOP, mainly in high frequencies (bands 2 and 3) (p < 0.05). ECCOP values also decreased with age, mainly in bands 2 and 3. This behavior is interpreted, in the proposed control system model, as an adaptation process related to child development.

  8. An Item Analysis and Validity Investigation of Bender Visual Motor Gestalt Test Score Items

    Science.gov (United States)

    Lambert, Nadine M.

    1971-01-01

    This investigation attempted to demonstrate the utility of standard item analysis procedures for selecting the most reliable and valid items for scoring Bender Visual Motor Gestalt Test test records. (Author)

  9. Item difficulty scaling for WAIS-III picture arrangement.

    Science.gov (United States)

    Costello, Raymond M; Connolly, Sean G

    2005-06-01

    Only one study regarding the sequencing of items of the WAIS-III Picture Arrangement subtest was located in a search of published literature. That study of 50 alcohol abusers failed to demonstrate that the items are sequenced in the perfect order of difficulty as suggested by the test publisher. The current study was accomplished to replicate or refute the prior study and to extend findings into related matters. Two laboratories provided four archival samples of 100 cases. Only five items appear properly placed, with one (OPENS) especially misplaced. A new sequence is recommended so that clinicians can administer the test more efficiently and examine errors from a process approach to evaluation. Difficult items were not passed as often as expected by Hispanic respondents. This finding was considered an artifact related to archival convenience sampling and may not be representative as a general finding regarding Hispanic performance until experimental sampling techniques or proper statistical controls can be applied. Statistically controlling for IQ, through analysis of covariance, eliminated ethnicity effects on total score for the PA subtest.

  10. Applying the LLTM for the determination of children’s cognitive age-acceleration function

    Directory of Open Access Journals (Sweden)

    Klaus D. Kubinger

    2011-06-01

    The paper uses Item Response Theory (IRT) for modeling and hypothesis testing of children’s cognitive age-acceleration function, within the calibration and standardization of an intelligence test. For this, Fischer’s linear logistic test model (LLTM; Fischer, 1973, 2005) is applied. However, instead of decomposing the item difficulty parameters of the Rasch model into certain hypothesized elementary parameters, as originally done, we suggest decomposing the person parameter in the same way. That is, the person parameter is decomposed into a testee’s basic ability parameter and an age-leveled effect due to the developmental stage of the age group in question. For convenience, we simply interchange testees and items in order to facilitate parameter estimation and model testing; the Rasch model is, of course, completely symmetric with respect to testees and items. By doing so, all findings in the context of the LLTM apply; in particular, pertinent program packages are at our disposal. To examine the feasibility of the suggested approach, an empirical example is given: an analogy test with eight items, administered to more than 300 testees aged between 6 and 16, was analyzed. The logistic acceleration function proved to fit the data well and best.
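    The decomposition described in this abstract can be illustrated with a minimal sketch: the Rasch success probability depends on the difference between a person parameter and an item difficulty, and the person parameter is written as a basic ability plus an age-group effect. The function names and all numeric values below are hypothetical and chosen only for illustration; the paper's estimated parameters and its exact acceleration function are not given in the abstract.

```python
import math


def rasch_probability(theta, beta):
    """Rasch model: probability of a correct response for ability theta, difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def person_parameter(basic_ability, age_effect):
    """LLTM-style linear decomposition of the person parameter (illustrative)."""
    return basic_ability + age_effect


# Hypothetical age-group effects on the logit scale (assumption, not from the paper).
age_effects = {6: -1.0, 11: 0.0, 16: 1.0}
basic_ability = 0.0
item_difficulty = 0.0

probabilities = {
    age: rasch_probability(person_parameter(basic_ability, effect), item_difficulty)
    for age, effect in age_effects.items()
}

for age, p in probabilities.items():
    print(f"age {age}: P(correct) = {p:.3f}")
```

    Because only the difference between person parameter and item difficulty enters the model, swapping the roles of testees and items leaves its form unchanged, which is the symmetry the abstract relies on.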

  11. Quantitative Analysis of Complex Multiple-Choice Items in Science Technology and Society: Item Scaling

    Directory of Open Access Journals (Sweden)

    Ángel Vázquez Alonso

    2005-05-01

    The scarce attention to assessment and evaluation in science education research has been especially harmful for Science-Technology-Society (STS) education, due to the dialectic, tentative, value-laden, and controversial nature of most STS topics. To overcome the methodological pitfalls of the STS assessment instruments used in the past, an empirically developed instrument (VOSTS, Views on Science-Technology-Society) has been suggested. Some methodological proposals, namely the multiple response models and the computing of a global attitudinal index, were suggested to improve the item implementation. The final step of these methodological proposals requires the categorization of STS statements. This paper describes the process of categorization through a scaling procedure ruled by a panel of experts, acting as judges, according to the body of knowledge from history, epistemology, and sociology of science. The statement categorization allows for the sound foundation of STS items, which is useful in educational assessment and science education research, and may also increase teachers’ self-confidence in the development of the STS curriculum for science classrooms.

  12. 38 CFR 3.1606 - Transportation items.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Transportation items. 3... Burial Benefits § 3.1606 Transportation items. The transportation costs of those persons who come within... shipment. (6) Cost of transportation by common carrier including amounts paid as Federal taxes. (7) Cost...

  13. Interservice Availability of Multiservice Used Items.

    Science.gov (United States)

    1999-05-14

    DISTRIBUTION STATEMENT A: Approved for Public Release; Distribution Unlimited. INTERSERVICE AVAILABILITY OF MULTISERVICE USED...was initially cataloged or placed in the DoD supply system. As items matured, that is, as recurring item costs, usage rates, and technological

  14. Restricted Interests and Teacher Presentation of Items

    Science.gov (United States)

    Stocco, Corey S.; Thompson, Rachel H.; Rodriguez, Nicole M.

    2011-01-01

    Restricted and repetitive behavior (RRB) is more pervasive, prevalent, frequent, and severe in individuals with autism spectrum disorders (ASDs) than in their typical peers. One subtype of RRB is restricted interests in items or activities, which is evident in the manner in which individuals engage with items (e.g., repetitious wheel spinning),…

  15. Item nonresponse in questionnaire research with children

    NARCIS (Netherlands)

    Hox, J.J.; Borgers, N.

    2001-01-01

    This study investigates the effect of item and person characteristics on item nonresponse, for written questionnaires used with school children. Secondary analyses were done on questionnaire data collected in five distinct studies. To analyze the data, logistic multilevel analysis was used with the

  16. Developing and evaluating innovative items for the NCLEX: Part 2, item characteristics and cognitive processing.

    Science.gov (United States)

    Wendt, Anne; Harmes, J Christine

    2009-01-01

    This article is a continuation of the research on the development and evaluation of innovative item formats for the NCLEX examinations that was published in the March/April 2009 edition of Nurse Educator. The authors discuss the innovative item templates and evaluate the statistical characteristics and level of cognitive processing required to answer the examination items.

  17. Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

    Science.gov (United States)

    Jones, Andrew T.

    2011-01-01

    Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…

  18. Studies on the response of resistive-wall modes to applied magnetic perturbations in the EXTRAP T2R reversed field pinch

    Science.gov (United States)

    Gregoratto, D.; Drake, J. R.; Yadikin, D.; Liu, Y. Q.; Paccagnella, R.; Brunsell, P. R.; Bolzonella, T.; Marchiori, G.; Cecconello, M.

    2005-09-01

    Arrays of magnetic coils and sensors in the EXTRAP T2R [P. R. Brunsell et al., Plasma Phys. Controlled Fusion 43 1457 (2001)] reversed-field pinch have been used to investigate the plasma response to an applied resonant magnetic perturbation in the range of the resistive-wall modes (RWMs). Measured RWM growth rates agree with predictions of a cylindrical ideal-plasma model. The linear growth of low-n marginally stable RWMs is related to the so-called resonant-field amplification due to a dominant ∣n∣=2 machine error field of about 2 G. The dynamics of the m =1 RWMs interacting with the applied field produced by the coils can be accurately described by a two-pole system. Estimated poles and residues are given with sufficient accuracy by the cylindrical model with a thin continuous wall.

  19. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Martijn A H Oude Voshaar

    OBJECTIVE: To calibrate the Dutch-Flemish version of the PROMIS physical function (PF) item bank in patients with rheumatoid arthritis (RA) and to evaluate cross-cultural measurement equivalence with US general population and RA data. METHODS: Data were collected from RA patients enrolled in the Dutch DREAM registry. An incomplete longitudinal anchored design was used where patients completed all 121 items of the item bank over the course of three waves of data collection. Item responses were fit to a generalized partial credit model adapted for longitudinal data and the item parameters were examined for differential item functioning (DIF) across country, age, and sex. RESULTS: In total, 690 patients participated in the study at time point 1 (T2, N = 489; T3, N = 311). The item bank could be successfully fitted to a generalized partial credit model, with the number of misfitting items falling within acceptable limits. Seven items demonstrated DIF for sex, while 5 items showed DIF for age in the Dutch RA sample. Twenty-five (20%) items were flagged for cross-cultural DIF compared to the US general population. However, the impact of observed DIF on total physical function estimates was negligible. DISCUSSION: The results of this study showed that the PROMIS PF item bank adequately fit a unidimensional IRT model, which provides support for applications that require invariant estimates of physical function, such as computer adaptive testing and targeted short forms. More studies are needed to further investigate the cross-cultural applicability of the US-based PROMIS calibration and standardized metric.
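
For readers unfamiliar with the generalized partial credit model mentioned in this abstract, the following is a minimal sketch of the standard (cross-sectional) category-probability formula. It is not the longitudinal adaptation used by the authors; the function name `gpcm_probs` and the parameter values are hypothetical.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities under the standard generalized partial credit model.

    theta      -- latent trait value of the respondent
    a          -- item discrimination (slope)
    thresholds -- step parameters b_1..b_m for an item with m + 1 categories

    The numerator for category k is exp(sum_{v=1..k} a * (theta - b_v)),
    with the empty sum for k = 0 defined as 0.
    """
    cumulative = [0.0]
    for b in thresholds:
        cumulative.append(cumulative[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(z - peak) for z in cumulative]
    norm = sum(weights)
    return [w / norm for w in weights]

# Hypothetical item with four response categories (three step parameters).
probs = gpcm_probs(theta=0.5, a=1.2, thresholds=[-1.0, 0.0, 1.5])
```

With a single step parameter the model reduces to the two-parameter logistic model, which gives a quick sanity check on the implementation.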

  20. The Structure of the Narcissistic Personality Inventory With Binary and Rating Scale Items.

    Science.gov (United States)

    Boldero, Jennifer M; Bell, Richard C; Davies, Richard C

    2015-01-01

    Narcissistic Personality Inventory (NPI) items typically have a forced-choice format, comprising a narcissistic and a nonnarcissistic statement. Recently, some have presented the narcissistic statements and asked individuals to either indicate whether they agree or disagree that the statements are self-descriptive (i.e., a binary response format) or to rate the extent to which they agree or disagree that these statements are self-descriptive on a Likert scale (i.e., a rating response format). The current research demonstrates that when NPI items have a binary or a rating response format, the scale has a bifactor structure (i.e., the items load on a general factor and on 6 specific group factors). Indexes of factor strength suggest that the data are unidimensional enough for the NPI's general factor to be considered a measure of a narcissism latent trait. However, the rating item general factor assessed more narcissism components than the binary item one. The positive correlations of the NPI's general factor, assessed when items have a rating response format, were moderate with self-esteem, strong with a measure of narcissistic grandiosity, and weak with 2 measures of narcissistic vulnerability. Together, the results suggest that using a rating format for items enhances the information provided by the NPI.