WorldWideScience

Sample records for single test item

  1. Single Item Inventory Models

    NARCIS (Netherlands)

    E.M. Bazsa-Oldenkamp; P. den Iseger

    2001-01-01

    This paper extends a fundamental result about single-item inventory systems. This approach allows more general performance measures, demand processes and order policies, and leads to easier analysis and implementation, than prior research. We obtain closed form expressions for the

  2. Computerized adaptive testing with item cloning

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; van der Linden, Willem J.

    2003-01-01

    To increase the number of items available for adaptive testing and reduce the cost of item writing, the use of techniques of item cloning has been proposed. An important consequence of item cloning is possible variability between the item parameters. To deal with this variability, a multilevel item

  3. Single-item memory, associative memory, and the human hippocampus

    OpenAIRE

    Gold, Jeffrey J.; Hopkins, Ramona O.; Squire, Larry R.

    2006-01-01

    We tested recognition memory for items and associations in memory-impaired patients with bilateral lesions thought to be limited to the hippocampal region. In Experiment 1 (Combined memory test), participants studied words and then took a memory test in which studied words, new words, studied word pairs, and recombined word pairs were presented in a mixed order. In Experiment 2 (Separated memory test), participants studied single words and then took a memory test involving studied word and ne...

  4. The Feasibility of Single-Item Measures for Organizational Justice

    Science.gov (United States)

    Jordan, Jeremy S.; Turner, Brian A.

    2008-01-01

    Researchers in a number of disciplines have examined the utility of single-item measures for both affective and cognitive constructs. While these authors have indicated that, under certain circumstances, the use of single-item measures is appropriate, there remains concern regarding the reliability and validity of single-item measures. This study…

  5. Binomial test models and item difficulty

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1979-01-01

    In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally

  6. Algorithmic test design using classical item parameters

    NARCIS (Netherlands)

    van der Linden, Willem J.; Adema, Jos J.

    Two optimalization models for the construction of tests with a maximal value of coefficient alpha are given. Both models have a linear form and can be solved by using a branch-and-bound algorithm. The first model assumes an item bank calibrated under the Rasch model and can be used, for instance,

  7. Bayesian item selection criteria for adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1996-01-01

    R.J. Owen (1975) proposed an approximate empirical Bayes procedure for item selection in adaptive testing. The procedure replaces the true posterior by a normal approximation with closed-form expressions for its first two moments. This approximation was necessary to minimize the computational

  8. Item calibration in incomplete testing designs

    Directory of Open Access Journals (Sweden)

    Norman D. Verhelst

    2011-01-01

    This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs. Mislevy and Sheehan (1989) have shown that in incomplete designs the justifiability of MML can be deduced from Rubin's (1976) general theory on inference in the presence of missing data. Their results are recapitulated and extended to more situations. In this study it is shown that for CML estimation the justification must be established in an alternative way, by considering the neglected part of the complete likelihood. The problems with incomplete designs are not generally recognized in practical situations. This is due to the stochastic nature of the incomplete designs, which is not taken into account in standard computer algorithms. For that reason, incorrect uses of standard MML and CML algorithms are discussed.

  9. Computerized adaptive testing with item clones

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; van der Linden, Willem J.

    2001-01-01

    To reduce the cost of item writing and to enhance the flexibility of item presentation, items can be generated by item-cloning techniques. An important consequence of cloning is that it may cause variability on the item parameters. Therefore, a multilevel item response model is presented in which it

  10. An Item Analysis and Validity Investigation of Bender Visual Motor Gestalt Test Score Items

    Science.gov (United States)

    Lambert, Nadine M.

    1971-01-01

    This investigation attempted to demonstrate the utility of standard item analysis procedures for selecting the most reliable and valid items for scoring Bender Visual Motor Gestalt Test records. (Author)

  11. Development and validation of the Single Item Narcissism Scale (SINS).

    Science.gov (United States)

    Konrath, Sara; Meier, Brian P; Bushman, Brad J

    2014-01-01

    The narcissistic personality is characterized by grandiosity, entitlement, and low empathy. This paper describes the development and validation of the Single Item Narcissism Scale (SINS). Although the use of longer instruments is superior in most circumstances, we recommend the SINS in some circumstances (e.g. under serious time constraints, online studies). In 11 independent studies (total N = 2,250), we demonstrate the SINS' psychometric properties. The SINS is significantly correlated with longer narcissism scales, but uncorrelated with self-esteem. It also has high test-retest reliability. We validate the SINS in a variety of samples (e.g., undergraduates, nationally representative adults), intrapersonal correlates (e.g., positive affect, depression), and interpersonal correlates (e.g., aggression, relationship quality, prosocial behavior). The SINS taps into the more fragile and less desirable components of narcissism. The SINS can be a useful tool for researchers, especially when it is important to measure narcissism with constraints preventing the use of longer measures.

  12. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    Science.gov (United States)

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  13. Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification.

    Directory of Open Access Journals (Sweden)

    Alexander J Millner

    Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single items. We found that 11.3% of participants who endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide.

  14. Correlates of a Single-Item Indicator Versus a Multi-Item Scale of Outness About Same-Sex Attraction.

    Science.gov (United States)

    Wilkerson, J Michael; Noor, Syed W; Galos, Dylan L; Rosser, B R Simon

    2016-07-01

    In this study, we investigated whether a single-item indicator measured the degree to which people were open about their same-sex attraction ("out") as accurately as a multi-item scale. For the multi-item scale, we used the Outness Inventory, which includes three subscales: family, world, and religion. We examined correlations between the single- and multi-item measures; between the single-item indicator and the subscales of the multi-item scale; and between the measures and internalized homonegativity, social attitudes towards homosexuality, and depressive symptoms. In addition, we calculated Tjur's R² as a measure of predictive power of the single-item indicator, multi-item scale, and subscales of the multi-item scale in predicting two health-related outcomes: depressive symptoms and condomless anal sex with multiple partners. There was a strong correlation between the single- and multi-item measures (r = 0.73). Furthermore, there were strong correlations between the single-item indicator and each subscale of the multi-item scale: family (r = 0.70), world (r = 0.77), and religion (r = 0.50). In addition, the correlations between the single-item indicator and internalized homonegativity (r = -0.63), social attitudes towards homosexuality (r = -0.38), and depression (r = -0.14) were higher than those between the multi-item scale and internalized homonegativity (r = -0.55), social attitudes towards homosexuality (r = -0.21), and depression (r = -0.13). Contrary to the premise that multi-item measures are superior to single-item measures, our collective findings indicate that the single-item indicator of outness performs better than the multi-item scale of outness.
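
The abstract above reports Tjur's R² as a measure of predictive power for binary outcomes. As a minimal sketch, assuming the usual definition (the coefficient of discrimination: mean predicted probability among observed positives minus mean predicted probability among observed negatives), it can be computed as follows. The function name `tjur_r2` and the toy data are hypothetical and are not taken from the study.

```python
def tjur_r2(outcomes, predicted_probs):
    """Tjur's R^2: mean fitted probability for cases with outcome 1
    minus mean fitted probability for cases with outcome 0."""
    pos = [p for y, p in zip(outcomes, predicted_probs) if y == 1]
    neg = [p for y, p in zip(outcomes, predicted_probs) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical fitted probabilities from some binary-outcome model.
y = [1, 1, 0, 0]
p_hat = [0.8, 0.6, 0.3, 0.1]
r2 = tjur_r2(y, p_hat)
```

A value near 1 means the model separates the two outcome groups almost perfectly; a value near 0 means the fitted probabilities barely differ between them.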

  15. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  16. Item Response Theory Models for Performance Decline during Testing

    Science.gov (United States)

    Jin, Kuan-Yu; Wang, Wen-Chung

    2014-01-01

    Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

  17. Single item inventory models : A time- and event- averages approach

    NARCIS (Netherlands)

    E.M. Bazsa-Oldenkamp; P. den Iseger

    2003-01-01

    This paper extends a fundamental result about single-item inventory systems. This approach allows more general performance measures, demand processes and order policies, and leads to easier analysis and implementation, than prior research. We obtain closed form expressions for the

  18. Face validity of the single work ability item

    DEFF Research Database (Denmark)

    Gupta, Nidhi; Jensen, Bjørn Søvsø; Søgaard, Karen

    2014-01-01

    PURPOSE: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. METHODS: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18-65 years fro...

  19. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  20. Test Item Development: Validity Evidence from Quality Assurance Procedures.

    Science.gov (United States)

    Downing, Steven M.; Haladyna, Thomas M.

    1997-01-01

    An ideal process is outlined for test item development and the study of item responses to ensure that tests are sound. Qualitative and quantitative methods are used to assess the item-level validity evidence for high-stakes examinations. A checklist for assessment is provided. (SLD)

  1. Development and validation of the Single Item Narcissism Scale (SINS).

    Directory of Open Access Journals (Sweden)

    Sara Konrath

    MAIN OBJECTIVES: The narcissistic personality is characterized by grandiosity, entitlement, and low empathy. This paper describes the development and validation of the Single Item Narcissism Scale (SINS). Although the use of longer instruments is superior in most circumstances, we recommend the SINS in some circumstances (e.g. under serious time constraints, online studies). METHODS: In 11 independent studies (total N = 2,250), we demonstrate the SINS' psychometric properties. RESULTS: The SINS is significantly correlated with longer narcissism scales, but uncorrelated with self-esteem. It also has high test-retest reliability. We validate the SINS in a variety of samples (e.g., undergraduates, nationally representative adults), intrapersonal correlates (e.g., positive affect, depression), and interpersonal correlates (e.g., aggression, relationship quality, prosocial behavior). The SINS taps into the more fragile and less desirable components of narcissism. SIGNIFICANCE: The SINS can be a useful tool for researchers, especially when it is important to measure narcissism with constraints preventing the use of longer measures.

  2. Computerized adaptive testing item selection in computerized adaptive learning systems

    NARCIS (Netherlands)

    Eggen, Theodorus Johannes Hendrikus Maria; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item selection methods traditionally developed for computerized adaptive testing (CAT) are explored for their usefulness in item-based computerized adaptive learning (CAL) systems. While in CAT Fisher information-based selection is optimal, for recovering learning populations in CAL systems item

  3. Effect of Differential Item Functioning on Test Equating

    Science.gov (United States)

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  4. A Review of Test Item Types

    Science.gov (United States)

    2008-03-06

    model (3PL; Lord & Novick, 1968). IRT models appropriate for polytomously scored items (e.g., Muraki, 1997) are available, and mixing of models is not...problematic within the IRT framework per se. Nevertheless, the current CAT-ASVAB infrastructure is configured to work with the 3PL model only, and

  5. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1996-01-01

    In this paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or C. R. Rao's efficient score test. The test is presented in the framework of a number of item response theory (IRT) models such as the Rasch model, the one-parameter logistic model, the

  6. Algorithms for computerized test construction using classical item parameters

    NARCIS (Netherlands)

    Adema, Jos J.; van der Linden, Willem J.

    1989-01-01

    Recently, linear programming models for test construction were developed. These models were based on the information function from item response theory. In this paper another approach is followed. Two 0-1 linear programming models for the construction of tests using classical item and test

  7. A Statistical Test for Differential Item Pair Functioning

    NARCIS (Netherlands)

    Bechger, T.M.; Maris, G.

    This paper presents an IRT-based statistical test for differential item functioning (DIF). The test is developed for items conforming to the Rasch (Probabilistic models for some intelligence and attainment tests, The Danish Institute of Educational Research, Copenhagen, 1960) model but we will

  8. A person fit test for IRT models for polytomous items

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Dagohoy, A.V.

    2007-01-01

    A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability

  9. Effects of Test Item Disclosure on Medical Licensing Examination

    Science.gov (United States)

    Yang, Eunbae B.; Lee, Myung Ae; Park, Yoon Soo

    2018-01-01

    In 2012, the National Health Personnel Licensing Examination Board of Korea decided to publicly disclose all test items and answers to satisfy the test takers' right to know and enhance the transparency of tests administered by the government. This study investigated the effects of item disclosure on the medical licensing examination (MLE),…

  10. Optimal Bayesian Adaptive Design for Test-Item Calibration

    NARCIS (Netherlands)

    van der Linden, Willem J.; Ren, Hao

    2015-01-01

    An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the

  11. Procedures for Selecting Items for Computerized Adaptive Tests.

    Science.gov (United States)

    Kingsbury, G. Gage; Zara, Anthony R.

    1989-01-01

    Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)

  12. Item selection and ability estimation in adaptive testing

    NARCIS (Netherlands)

    Pashley, Peter J.; van der Linden, Wim J.; van der Linden, Willem J.; Glas, Cornelis A.W.; Glas, Cees A.W.

    2010-01-01

    The last century saw a tremendous progression in the refinement and use of standardized linear tests. The first administered College Board exam occurred in 1901 and the first Scholastic Assessment Test (SAT) was given in 1926. Since then, progressively more sophisticated standardized linear tests

  13. Components of item selection algorithm in computerized adaptive testing.

    Science.gov (United States)

    Han, Kyung T

    2018-03-24

    Computerized adaptive testing (CAT) greatly improves measurement efficiency in high-stakes testing operations through the selection and administration of test items whose difficulty level is most relevant to each individual test taker. This paper explains the three components of a conventional CAT item selection algorithm: test content balancing, item selection criterion, and item exposure control. There were several noteworthy methodologies underlying each component. The test script method and the constrained CAT method were for test content balancing. As for item selection criteria, there were the maximized Fisher information criterion, the b-matching method, the a-stratification method, the weighted likelihood information criterion, the efficiency balanced information criterion, and the Kullback-Leibler information criterion. The randomesque method, the Sympson-Hetter method, the unconditional and conditional multinomial methods, and the fade-away method were for item exposure control. There were several holistic approaches to CAT using automated test assembly methods, such as the shadow test approach and the weighted deviation model. Item usage and exposure count varied according to the item selection criteria and exposure control methods. Finally, other important factors to consider when determining an appropriate CAT design are computer resource requirements, size of items, and test length. The logic of CAT is now being adopted in "adaptive learning," which integrates the learning aspect and the (formative) assessment aspect of education into a continuous, individualized learning experience. Therefore, the algorithms and technologies in this review may be able to help medical health educators and high-stakes test takers to adopt CAT more actively and efficiently.
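
To make the "maximized Fisher information criterion" named in the abstract above concrete, here is a minimal sketch. It assumes a three-parameter logistic (3PL) item response model without the D = 1.7 scaling constant, and it covers only the selection criterion, not content balancing or exposure control. The function names, the `(a, b, c)` tuple layout, and the toy item pool are illustrative assumptions, not taken from the reviewed paper.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model
    (a: discrimination, b: difficulty, c: pseudo-guessing)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def fisher_info_3pl(theta, a, b, c):
    """Item Fisher information at ability theta under the 3PL model."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta, pool, administered):
    """Return the index of the not-yet-administered item whose
    information is largest at the current ability estimate."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    if not candidates:
        raise ValueError("item pool exhausted")
    return max(candidates, key=lambda i: fisher_info_3pl(theta, *pool[i]))


# Hypothetical pool of (a, b, c) item parameters.
pool = [(1.0, -2.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
first = select_item(0.0, pool, administered=set())
second = select_item(0.0, pool, administered={first})
```

Pure maximum-information selection tends to over-use the most discriminating items, which is why the abstract pairs it with exposure control methods such as the randomesque or Sympson-Hetter procedures.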

  14. Quantitative Penetration Testing with Item Response Theory

    NARCIS (Netherlands)

    Arnold, Florian; Pieters, Wolter; Stoelinga, Mariëlle Ida Antoinette

    2014-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Thus, penetration testing has so far been used as a qualitative research method. To enable quantitative approaches to security risk management, including

  15. Quantitative penetration testing with item response theory

    NARCIS (Netherlands)

    Arnold, Florian; Pieters, Wolter; Stoelinga, Mariëlle Ida Antoinette

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Thus, penetration testing has so far been used as a qualitative research method. To enable quantitative approaches to security risk management, including

  16. Quantitative penetration testing with item response theory

    NARCIS (Netherlands)

    Pieters, W.; Arnold, F.; Stoelinga, M.I.A.

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Therefore, penetration testing has thus far been used as a qualitative research method. To enable quantitative approaches to security risk management,

  17. Test equating in the presence of DIF items.

    Science.gov (United States)

    Chu, Kwang-lee; Kamata, Akihito

    2005-01-01

    This paper proposes a multilevel measurement model that controls for DIF effects in test equating. The accuracy and stability of item and ability parameter estimates under the proposed multilevel measurement model were examined using randomly simulated data. Estimates from the proposed model were compared with those resulting from two multiple-group concurrent equating designs, including 1) a design that replaced DIF-items with items with no DIF; and 2) a design that retained DIF items with no attempt to control for DIF. In most of the investigated conditions, the results indicated that the proposed multilevel measurement model performed better than the two comparison models.

  18. Mathematical-programming approaches to test item pool design

    NARCIS (Netherlands)

    Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

    2002-01-01

    This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming

  19. Differential functioning of Bender Visual-Motor Gestalt Test items.

    Science.gov (United States)

    Sisto, Fermino Fernandes; Dos Santos, Acácia Aparecida Angeli; Noronha, Ana Paula Porto

    2010-02-01

    Differential Item Functioning (DIF) refers to items that do not function the same way for comparable members of different groups. The present study focuses on analyzing and classifying sex-related differential item functioning in the Bender Visual-Motor Gestalt Test. Subjects were 1,052 children attending public schools (513 boys, 539 girls, ages 6-10 years). The protocols were scored using the Bender Graduated Scoring System, which evaluates only the distortion criterion using the Rasch logistic response model. The scoring system fit the Rasch model, although two items were found to be biased by sex. When analyzing differential functioning of items for boys and girls separately, the number of differentially functioning items was equal.

  20. Item response times in computerized adaptive testing

    Directory of Open Access Journals (Sweden)

    Lutz F. Hornke

    2000-01-01

    Item response times in computerized adaptive tests. Computerized adaptive tests (CAT) provide both scores and item response times. Research on the additional meaning that can be drawn from the information contained in response times is of special interest. Data from 5,912 young people who took a computerized adaptive test were available. Earlier studies indicate longer response times for incorrect responses. This result was replicated in this larger study. However, the mean item response times for wrong and correct responses do not show an interpretation different from that obtained with trait levels, nor do they correlate differently with a number of ability tests. Whether response times should be interpreted on the same dimension that the CAT measures or on other dimensions is discussed. Since the early 1930s, response times have been regarded as indicators of personality traits that should be distinguished from the traits measured by test scores. This idea is discussed, and arguments for and against it are offered. More recent model-based approaches are also presented. The question remains open whether additional diagnostic information is obtained from a CAT with detailed, programmed data collection.

  1. Optimal Bayesian Adaptive Design for Test-Item Calibration.

    Science.gov (United States)

    van der Linden, Wim J; Ren, Hao

    2015-06-01

    An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.

  2. Single Event Effect (SEE) Test Planning 101

    Science.gov (United States)

    LaBel, Kenneth A.; Pellish, Jonathan; Berg, Melanie D.

    2011-01-01

    This is a course on SEE Test Plan development. It is an introductory discussion of the items that go into planning an SEE test that should complement the SEE test methodology used. Material will only cover heavy ion SEE testing and not proton, laser, or other testing, though many of the discussed items may be applicable. While standards and guidelines for how to perform single event effects (SEE) testing have existed almost since the first cyclotron testing, guidance on the development of SEE test plans has not been as easy to find. In this section of the short course, we attempt to rectify this lack. We consider the approach outlined here as a "living" document: mission-specific constraints and new technology-related issues always need to be taken into account. We note that we will use the term "test planning" in the context of those items being included in a test plan.

  3. A Model-Free Diagnostic for Single-Peakedness of Item Responses Using Ordered Conditional Means

    Science.gov (United States)

    Polak, Marike; De Rooij, Mark; Heiser, Willem J.

    2012-01-01

    In this article we propose a model-free diagnostic for single-peakedness (unimodality) of item responses. Presuming a unidimensional unfolding scale and a given item ordering, we approximate item response functions of all items based on ordered conditional means (OCM). The proposed OCM methodology is based on Thurstone & Chave's (1929) "criterion…

  4. Detection of differential item functioning using Lagrange multiplier tests

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    1998-01-01

    Abstract: In the present paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or Rao’s efficient score test. The test is presented in the framework of a number of IRT models such as the Rasch model, the OPLM, the 2-parameter logistic model, the

  5. Alternative approaches to updating item parameter estimates in tests with item cloning

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    2006-01-01

    Item cloning techniques can greatly reduce the cost of item writing and enhance the flexibility of item presentation. To deal with the possible variability of the item parameters caused by item cloning, Glas and van der Linden (in press, 2006) proposed a multilevel item response model where it is

  6. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    Science.gov (United States)

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.

  7. Assessing Differential Item Functioning on the Test of Relational Reasoning

    Directory of Open Access Journals (Sweden)

    Denis Dumas

    2018-03-01

    The test of relational reasoning (TORR) is designed to assess the ability to identify complex patterns within visuospatial stimuli. The TORR is designed for use in school and university settings, and therefore, its measurement invariance across diverse groups is critical. In this investigation, a large sample, representative of a major university on key demographic variables, was collected, and the resulting data were analyzed using a multi-group, multidimensional item-response theory model-comparison procedure. No significant differential item functioning was found on any of the TORR items across any of the demographic groups of interest. This finding is interpreted as evidence of the cultural fairness of the TORR, and potential test-development choices that may have contributed to that cultural fairness are discussed.

  8. Some new item selection criteria for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, Wim J.J.; Berger, Martijn P.F.

    1994-01-01

    In this study some alternative item selection criteria for adaptive testing are proposed. These criteria take into account the uncertainty of the ability estimates. A general weighted information criterion is suggested of which the usual maximum information criterion and the suggested alternative

  9. Item Selection Criteria with Practical Constraints for Computerized Classification Testing

    Science.gov (United States)

    Lin, Chuan-Ju

    2011-01-01

    This study compares four item selection criteria for a two-category computerized classification testing: (1) Fisher information (FI), (2) Kullback-Leibler information (KLI), (3) weighted log-odds ratio (WLOR), and (4) mutual information (MI), with respect to the efficiency and accuracy of classification decision using the sequential probability…

  10. Analysis of Individual "Test Of Astronomy STandards" (TOAST) Item Responses

    Science.gov (United States)

    Slater, Stephanie J.; Schleigh, Sharon Price; Stork, Debra J.

    2015-01-01

    The development of valid and reliable strategies to efficiently determine the knowledge landscape of introductory astronomy college students is an effort of great interest to the astronomy education community. This study examines individual item response rates from a widely used conceptual understanding survey, the Test Of Astronomy Standards…

  11. Small-Item Contact Test Method, FY11 Release

    Science.gov (United States)

    2012-07-01

    developmental and operational testing (DT/OT) activities, technology readiness assessments (TRA) to determine technology readiness level (TRL...or gas phase (fumigants, including aerosols). decontamination process: The process of making any person, object, or area safe by absorbing...item. • For vaporous decontaminants: injection rate, flow rate, fumigant concentration, temperature, and relative humidity. • For liquid

  12. Examination of Polytomous Items' Psychometric Properties According to Nonparametric Item Response Theory Models in Different Test Conditions

    Science.gov (United States)

    Sengul Avsar, Asiye; Tavsancil, Ezel

    2017-01-01

    This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…

  13. An item analysis of the Conditional Reasoning Test of Aggression.

    Science.gov (United States)

    DeSimone, Justin A; James, Lawrence R

    2015-11-01

    This manuscript uses item response theory (IRT) to estimate item characteristics of the Conditional Reasoning Test of Aggression (CRT-A). Using a sample size of 5,511 respondents, the present analysis provides an accurate assessment of the capability of the CRT-A to measure latent aggression. The one-parameter logistic (1PL) model, two-parameter logistic (2PL) model, and three-parameter logistic (3PL) model are compared before the item analysis. Results suggest that the 2PL model is the most appropriate dichotomous IRT model for describing the item characteristics of the CRT-A. Potential multidimensionality in the CRT-A is also examined. Results suggest that CRT-A items work as theoretically intended, with the probability of selecting an aggressive response increasing with latent trait levels. Information curves indicate that the CRT-A is best suited for use with individuals who are high on latent aggression. Exploratory analyses include an examination of polytomous IRT models and DIF comparing student and employee respondents. The results have implications for future research using the CRT-A as well as the identification of populations appropriate for measurement using this assessment tool. (c) 2015 APA, all rights reserved.
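
    For readers unfamiliar with the 1PL, 2PL, and 3PL models compared in the abstract above, the following is a minimal sketch of the standard logistic item response function, not the authors' code. The function name `irf_3pl` and the parameter values in the usage note are hypothetical, and the sketch assumes no scaling constant (some texts multiply the exponent by D = 1.7).

```python
import math


def irf_3pl(theta, a, b, c=0.0):
    """Probability of a keyed response under the three-parameter logistic model.

    theta: latent trait level of the respondent
    a:     item discrimination
    b:     item difficulty (location)
    c:     lower asymptote ("guessing"); c = 0 reduces this to the 2PL model,
           and additionally holding a equal across items gives the 1PL model
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

    With hypothetical values, `irf_3pl(1.0, a=1.2, b=0.5)` evaluates a 2PL item and `irf_3pl(1.0, a=1.2, b=0.5, c=0.2)` the corresponding 3PL item; in both cases the probability rises with `theta`, which is the pattern the abstract reports for CRT-A items.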

  14. Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

    Science.gov (United States)

    Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

    2017-06-15

    Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.

  15. Which Single-Item Measures of Overactive Bladder Symptom Treatment Correlate Best With Patient Satisfaction?

    NARCIS (Netherlands)

    Michel, Martin C.; Oelke, Matthias; Vogel, Monika; de La Rosette, Jean J. M. C. H.

    2011-01-01

    Aims: While complex symptom scales are important research tools, simpler, preferably single item scales may be more useful for routine clinical practise in the evaluation of patients with overactive bladder syndrome (OAB). This study aimed to compare multiple single-item scales at baseline and after

  16. Work ability as prognostic risk marker of disability pension : Single-item work ability score versus multi-item work ability index

    NARCIS (Netherlands)

    Roelen, C.A.M.; Rhenen, van W.; Groothoff, J.W.; Klink, van der J.J.L.; Twisk, W.R.; Heymans, M.W.

    2014-01-01

    Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared single-item WAS with the multi-item work ability index (WAI) in its ability to identify workers at risk of DP.

  17. Demonstrating the Difference between Classical Test Theory and Item Response Theory Using Derived Test Data

    Science.gov (United States)

    Magno, Carlo

    2009-01-01

    The present report demonstrates the difference between the classical test theory (CTT) and item response theory (IRT) approaches using actual test data for junior high school chemistry students. The CTT and IRT were compared across two samples and two forms of test on their item difficulty, internal consistency, and measurement errors. The specific…

  18. Small-Item Vapor Test Method, FY11 Release

    Science.gov (United States)

    2012-07-01

    technology (S&T), T&E, and developmental and operational testing (DT/OT) activities, technology readiness assessments (TRA) to determine technology...powders, wipes), or gas-phase (fumigants, including aerosols). decontamination process: The process of making any person, object, or area safe by...delivered to item. ■ Vaporous decontaminants: injection rate, flow rate, fumigant concentration, temperature, and relative humidity. ■ Liquid

  19. A Regional and Local Item Response Theory Based Test Item Bank System.

    Science.gov (United States)

    Hathaway, Walter; And Others

    This report describes the development, operation, maintenance, and future prospects of the item banks pioneered by the Portland (Oregon) School District. At the time of this report, there were 3,500 mathematics, 2,200 reading, and 2,300 language usage items calibrated under the fixed parameter model of item response theory (IRT) for Grades 3-8.…

  20. Assessing the Validity of Single-item Life Satisfaction Measures: Results from Three Large Samples

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E.

    2014-01-01

    Purpose: The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Methods: Two large samples from Washington (N=13,064) and Oregon (N=2,277) recruited by the Behavioral Risk Factor Surveillance System (BRFSS) and a representative German sample (N=1,312) recruited by the Germany Socio-Economic Panel (GSOEP) were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Results: Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62 – 0.64; disattenuated r = 0.78 – 0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001 – 0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015 – 0.042). Conclusions: Single-item life satisfaction measures performed very similarly compared to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use. PMID:24890827
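
    The abstract above reports both zero-order and disattenuated correlations. As a hedged illustration of what disattenuation means, the sketch below applies the classical correction for attenuation (observed correlation divided by the square root of the product of the two reliabilities). The function name `disattenuate` is hypothetical, and the abstract does not state which reliability estimates the authors used, so any reliabilities passed in are assumptions supplied by the reader.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation.

    r_xy:  observed (zero-order) correlation between measures x and y
    rel_x: reliability of measure x, in (0, 1]
    rel_y: reliability of measure y, in (0, 1]

    Returns the estimated correlation between the underlying true scores.
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)
```

    Because reliabilities are at most 1, the corrected value is never smaller in magnitude than the observed one, which is consistent with the disattenuated range in the abstract lying above the zero-order range.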

  1. Assessing the validity of single-item life satisfaction measures: results from three large samples.

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E

    2014-12-01

    The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System and a representative German sample (N = 1,312) recruited by the Germany Socio-Economic Panel were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042). Single-item life satisfaction measures performed very similarly compared to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.

  2. Meta-analytic guidelines for evaluating single-item reliabilities of personality instruments.

    Science.gov (United States)

    Spörrle, Matthias; Bekk, Magdalena

    2014-06-01

    Personality is an important predictor of various outcomes in many social science disciplines. However, when personality traits are not the principal focus of research, for example, in global comparative surveys, it is often not possible to assess them extensively. In this article, we first provide an overview of the advantages and challenges of single-item measures of personality, a rationale for their construction, and a summary of alternative ways of assessing their reliability. Second, using seven diverse samples (Ntotal = 4,263) we develop the SIMP-G, the German adaptation of the Single-Item Measures of Personality, an instrument assessing the Big Five with one item per trait, and evaluate its validity and reliability. Third, we integrate previous research and our data into a first meta-analysis of single-item reliabilities of personality measures, and provide researchers with guidelines and recommendations for the evaluation of single-item reliabilities. © The Author(s) 2013.

  3. Item Pool Design for an Operational Variable-Length Computerized Adaptive Test

    Science.gov (United States)

    He, Wei; Reckase, Mark D.

    2014-01-01

    For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…

  4. The use of predicted values for item parameters in item response theory models: An application in intelligence tests

    NARCIS (Netherlands)

    Matteucci, M.; S. Mignani, Prof.; Veldkamp, Bernard P.

    2012-01-01

    In testing, item response theory models are widely used in order to estimate item parameters and individual abilities. However, even unidimensional models require a considerable sample size so that all parameters can be estimated precisely. The introduction of empirical prior information about

  5. Stochastic order in dichotomous item response models for fixed tests, research adaptive tests, or multiple abilities

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1995-01-01

    Dichotomous item response theory (IRT) models can be viewed as families of stochastically ordered distributions of responses to test items. This paper explores several properties of such distributions. The focus is on the conditions under which stochastic order in families of conditional

  6. Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André

    2016-01-01

    Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

  7. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  8. A simple and fast item selection procedure for adaptive testing

    NARCIS (Netherlands)

    Veerkamp, Wim J.J.; Berger, Martijn P.F.

    1994-01-01

    Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item
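
    The claim in the abstract above, that the most discriminating item is not always the most informative, can be checked numerically with the standard Fisher information function for a 3PL item. The sketch below is an illustration under stated assumptions, not the procedure proposed in the paper: the function names and the item parameters in the usage note are hypothetical, and no scaling constant D is applied.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (a: discrimination,
    b: difficulty, c: guessing parameter)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c=0.0):
    """Fisher information of a 3PL item at ability theta:
    I(theta) = a^2 * (Q / P) * ((P - c) / (1 - c))^2, with Q = 1 - P.
    For c = 0 this reduces to the 2PL form a^2 * P * Q."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2
```

    For a hypothetical examinee at `theta = 0`, comparing `info_3pl(0.0, a=2.0, b=2.0)` with `info_3pl(0.0, a=1.0, b=0.0)` shows the less discriminating but well-targeted item carrying more information, because information also depends on the distance between ability and item difficulty and on the guessing parameter.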

  9. Creating a Database for Test Items in National Examinations ...

    African Journals Online (AJOL)

    on how to use Database Management System (DBMS) to store questions produced during 'Items Generation' for easy selection of questions, good discrimination index, high security provision, good item-difficulty stratification, easy item analysis, a good retrieval system, specification for hardware requirement and software ...

  10. Cross-National Prevalence of Traditional Bullying, Traditional Victimization, Cyberbullying and Cyber-Victimization: Comparing Single-Item and Multiple-Item Approaches of Measurement

    Science.gov (United States)

    Yanagida, Takuya; Gradinger, Petra; Strohmeier, Dagmar; Solomontos-Kountouri, Olga; Trip, Simona; Bora, Carmen

    2016-01-01

    Many large-scale cross-national studies rely on a single-item measurement when comparing prevalence rates of traditional bullying, traditional victimization, cyberbullying, and cyber-victimization between countries. However, the reliability and validity of single-item measurement approaches are highly problematic and might be biased. Data from…

  11. Impact of Test Design, Item Quality, and Item Bank Size on the Psychometric Properties of Computer-Based Credentialing Examinations

    Science.gov (United States)

    Xing, Dehui; Hambleton, Ronald K.

    2004-01-01

    Computer-based testing by credentialing agencies has become common; however, selecting a test design is difficult because several good ones are available - parallel forms, computer adaptive (CAT), and multistage (MST). In this study, three computer-based test designs under some common examination conditions were investigated. Item bank size and…

  12. The Psychological Effect of Errors in Standardized Language Test Items on EFL Students' Responses to the Following Item

    Science.gov (United States)

    Khaksefidi, Saman

    2017-01-01

    This study investigates the psychological effect of a wrong question with wrong items on answering to the next question in a test of structure. Forty students selected through stratified random sampling are given 15 questions of a standardized test namely a TOEFL structure test in which questions number 7 and number 11 are wrong and their answers…

  13. Testing enhances both encoding and retrieval for both tested and untested items.

    Science.gov (United States)

    Cho, Kit W; Neely, James H; Crocco, Stephanie; Vitrano, Deana

    2017-07-01

    In forward testing effects, taking a test enhances memory for subsequently studied material. These effects have been observed for previously studied and tested items, a potentially item-specific testing effect, and newly studied untested items, a purely generalized testing effect. We directly compared item-specific and generalized forward testing effects using procedures to separate testing benefits due to encoding versus retrieval. Participants studied two lists of Swahili-English word pairs, with the second study list containing "new" pairs intermixed with the previously studied "old" pairs. Participants completed a review phase in which they took a cued-recall test on only the "old" pairs or restudied them. In Experiments 1a, 1b, and 2, the review phase was given either before or after the second study list. Testing benefited memory to the same degree for both "new" and "old" pairs, suggesting that there were no pair-specific benefits of testing. The larger benefit from testing when review was given before rather than after the second study list suggests that the memory enhancement was due to both testing-enhanced encoding and testing-enhanced retrieval. To better equate generalized testing effects for "new" and "old" pairs, Experiment 3 intermixed them in the review phase. A statistically significant pair-specific testing effect for "old" items was now observed. Overall, these results show that forward testing effects are due to both testing-enhanced encoding and retrieval effects and that direct, pair-specific forward testing benefits are considerably smaller than indirect, generalized forward testing benefits.

  14. Detection of person misfit in computerized adaptive tests with polytomous items

    NARCIS (Netherlands)

    van Krimpen-Stoop, Edith; Meijer, R.R.

    2000-01-01

    Item scores that do not fit an assumed item response theory model may cause the latent trait value to be estimated inaccurately. For computerized adaptive tests (CAT) with dichotomous items, several person-fit statistics for detecting nonfitting item score patterns have been proposed. Both for

  15. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    that 24 items could be included in a unidimensional IRT model. DIF did not seem to have any significant impact on the estimation of EF. Evaluations indicated that the CAT measure may reduce sample size requirements by up to 50% compared to the QLQ-C30 EF scale without reducing power. CONCLUSION...

  16. What factors make science test items especially difficult for students from minority groups?

    Directory of Open Access Journals (Sweden)

    Are Turmo

    2012-06-01

    Substantial gaps in science performance between majority and minority students are often found in standardized tests used in primary school. But at the item level, the gaps may vary significantly. The aims of this study are: (1) to identify features of the test items in science (grade 5 and grade 8 students) that can potentially explain group differences; and (2) to analyze what factors make test items especially difficult for minority students. Explanatory variables such as reading load, item difficulty, item writing load, and use of the multiple-choice format are found to be major factors. The analysis reveals no empirical relationships between performance gap and either item subject domain, item test location, or the number of illustrations used in the item. Subtle issues regarding the design of items may influence the size of the performance gap at item level over and above the main explanatory variables. The gap can be reduced significantly by choosing “minority friendly” items.

  17. Air Force Officer Qualifying Test Form T: Initial Item-, Test-, Factor-, and Composite-Level Analyses

    Science.gov (United States)

    2016-12-01

    Air Force Officer Qualifying Test, Form T1 AFOQT T2 Air Force Officer Qualifying Test, Form T2 AGFI Adjusted Goodness of Fit Index AI Air Force...Qualifying Test Electrical Maze subtest g General mental ability factor GFI Goodness of Fit Index GLS Generalized Least Squares GS Air Force...AFRL-RH-WP-TR-2016-0093 AIR FORCE OFFICER QUALIFYING TEST FORM T: INITIAL ITEM-, TEST-, FACTOR-, AND COMPOSITE-LEVEL ANALYSES

  18. The Effects of Violating Standard Item Writing Principles on Tests and Students: The Consequences of Using Flawed Test Items on Achievement Examinations in Medical Education

    Science.gov (United States)

    Downing, Steven M.

    2005-01-01

    The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either…

  19. Utility of critical items within the Recognition Memory Test and Word Choice Test.

    Science.gov (United States)

    Erdodi, Laszlo A; Tyson, Bradley T; Abeare, Christopher A; Zuccato, Brandon G; Rai, Jaspreet K; Seke, Kristian R; Sagar, Sanya; Roth, Robert M

    2017-03-17

    This study was designed to examine the clinical utility of critical items within the Recognition Memory Test (RMT) and the Word Choice Test (WCT). Archival data were collected from a mixed clinical sample of 202 patients clinically referred for neuropsychological testing (54.5% male; mean age = 45.3 years; mean level of education = 13.9 years). The credibility of a given response set was psychometrically defined using three separate composite measures, each of which was based on multiple independent performance validity indicators. Critical items improved the classification accuracy of both tests. They increased sensitivity by correctly identifying an additional 2-17% of the invalid response sets that passed the traditional cutoffs based on total score. They also increased specificity by providing additional evidence of noncredible performance in response sets that failed the total score cutoff. The combination of failing the traditional cutoff, but passing critical items was associated with increased risk of misclassifying the response set as invalid. Critical item analysis enhances the diagnostic power of both the RMT and WCT. Given that critical items require no additional test material or administration time, but help reduce both false positive and false negative errors, they represent a versatile, valuable, and time- and cost-effective supplement to performance validity assessment.

  20. Single-item measure for assessing quality of life in children with drug-resistant epilepsy.

    Science.gov (United States)

    Conway, Lauryn; Widjaja, Elysa; Smith, Mary Lou

    2018-03-01

    The current study investigated the psychometric properties of a single-item quality of life (QOL) measure, the Global Quality of Life in Childhood Epilepsy question (G-QOLCE), in children with drug-resistant epilepsy. Data came from the Impact of Pediatric Epilepsy Surgery on Health-Related Quality of Life Study (PESQOL), a multicenter prospective cohort study (n = 118) with observations collected at baseline and at 6 months of follow-up on children aged 4-18 years. QOL was measured with the QOLCE-76 and KIDSCREEN-27. The G-QOLCE was an overall QOL question derived from the QOLCE-76. Construct validity and reliability were assessed with Spearman's correlation and intraclass correlation coefficient (ICC). Responsiveness was examined through distribution-based and anchor-based methods. The G-QOLCE showed moderate (r ≥ 0.30) to strong (r ≥ 0.50) correlations with composite scores, and most subscales of the QOLCE-76 and KIDSCREEN-27 at baseline and 6-month follow-up. The G-QOLCE had moderate test-retest reliability (ICC range: 0.49-0.72) and was able to detect clinically important change in patients' QOL (standardized response mean: 0.38; probability of change: 0.65; Guyatt's responsiveness statistics: 0.62 and 0.78). Caregiver anxiety and family functioning contributed most strongly to G-QOLCE scores over time. Results offer promising preliminary evidence regarding the validity, reliability, and responsiveness of the proposed single-item QOL measure. The G-QOLCE is a potentially useful tool that can be feasibly administered in a busy clinical setting to evaluate clinical status and impact of treatment outcomes in pediatric epilepsy.

  1. Functional Distractors: Implications for Test-Item Writing and Test Design.

    Science.gov (United States)

    Haladyna, Thomas M.; Downing, Steven M.

    The proposition that the optimal number of options in a multiple choice test item is three was examined. The concept of functional distractor, a plausible wrong answer that is negatively discriminating when total test performance is the criterion, is discussed. Three distinct groups of achievers (high, middle, and low) on a national standardized…

  2. Self-Paced Physics, Documentation Report, Test Item Bank 5.3.

    Science.gov (United States)

    New York Inst. of Tech., Old Westbury.

    As a supplement to the principal reports, a compilation of criterion check items and diagnostic test items identified by terminal objectives is presented in this document relating to the U. S. Naval Academy Self-Paced Physics Course. Included are a progress check item bank, student terminal objective key sheets, quarterly diagnostic tests and…

  3. Origin bias of test items compromises the validity and fairness of curriculum comparisons

    NARCIS (Netherlands)

    Muijtjens, Arno M. M.; Schuwirth, Lambert W. T.; Cohen-Schotanus, Janke; van der Vleuten, Cees P. M.

    2007-01-01

    OBJECTIVE To determine whether items of progress tests used for inter-curriculum comparison favour students from the medical school where the items were produced (i.e. whether the origin bias of test items is a potential confounder in comparisons between curricula). METHODS We investigated scores of

  4. Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests

    Science.gov (United States)

    Bryant, William

    2017-01-01

    As large-scale standardized tests move from paper-based to computer-based delivery, opportunities arise for test developers to make use of items beyond traditional selected and constructed response types. Technology-enhanced items (TEIs) have the potential to provide advantages over conventional items, including broadening construct measurement,…

  5. Quality Multiple-Choice Test Questions: Item-Writing Guidelines and an Analysis of Auditing Testbanks.

    Science.gov (United States)

    Hansen, James D.; Dexter, Lee

    1997-01-01

    Analysis of test item banks in 10 auditing textbooks found that 75% of questions violated one or more guidelines for multiple-choice items. In comparison, 70% of a certified public accounting exam bank had no violations. (SK)

  6. Working Memory for Sequences of Temporal Durations Reveals a Volatile Single-Item Store.

    Science.gov (United States)

    Manohar, Sanjay G; Husain, Masud

    2016-01-01

    When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered shows recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritizing some items over others can be accounted for by a "focus of attention," maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones, of varying durations (200 ms to 2 s). Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e., the smallest memory set-size. Thus, when irrelevant information was present, the benefit of having only one item in memory was attenuated. Finally, we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were remembered

  7. Working memory for sequences of temporal durations reveals a volatile single-item store

    Directory of Open Access Journals (Sweden)

    Sanjay G Manohar

    2016-10-01

    When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered shows recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritising some items over others can be accounted for by a focus of attention, maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones, of varying durations (200 ms to 2 s). Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e. the smallest memory set size. Thus, when irrelevant information was present, the benefit of having only one item in memory was attenuated. Finally, we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were

  8. International Semiotics: Item Difficulty and the Complexity of Science Item Illustrations in the PISA-2009 International Test Comparison

    Science.gov (United States)

    Solano-Flores, Guillermo; Wang, Chao; Shade, Chelsey

    2016-01-01

    We examined multimodality (the representation of information in multiple semiotic modes) in the context of international test comparisons. Using Program of International Student Assessment (PISA)-2009 data, we examined the correlation of the difficulty of science items and the complexity of their illustrations. We observed statistically…

  9. A more general model for testing measurement invariance and differential item functioning.

    Science.gov (United States)

    Bauer, Daniel J

    2017-09-01

    The evaluation of measurement invariance is an important step in establishing the validity and comparability of measurements across individuals. Most commonly, measurement invariance has been examined using one of two primary latent variable modeling approaches: the multiple groups model or the multiple-indicator multiple-cause (MIMIC) model. Both approaches offer opportunities to detect differential item functioning within multi-item scales, and thereby to test measurement invariance, but both approaches also have significant limitations. The multiple groups model allows one to examine the invariance of all model parameters but only across levels of a single categorical individual difference variable (e.g., ethnicity). In contrast, the MIMIC model permits both categorical and continuous individual difference variables (e.g., sex and age) but permits only a subset of the model parameters to vary as a function of these characteristics. The current article argues that moderated nonlinear factor analysis (MNLFA) constitutes an alternative, more flexible model for evaluating measurement invariance and differential item functioning. We show that the MNLFA subsumes and combines the strengths of the multiple group and MIMIC models, allowing for a full and simultaneous assessment of measurement invariance and differential item functioning across multiple categorical and/or continuous individual difference variables. The relationships between the MNLFA model and the multiple groups and MIMIC models are shown mathematically and via an empirical demonstration.

  10. Fully Polynomial Approximation Schemes for Single-Item Capacitated Economic Lot-Sizing Problems

    NARCIS (Netherlands)

    C.P.M. van Hoesel; A.P.M. Wagelmans (Albert)

    1997-01-01

    NP-hard cases of the single-item capacitated lot-sizing problem have been the topic of extensive research and continue to receive considerable attention. However, surprisingly few theoretical results have been published on approximation methods for these problems. To the best of our

  11. Examination of Different Item Response Theory Models on Tests Composed of Testlets

    Science.gov (United States)

    Kogar, Esin Yilmaz; Kelecioglu, Hülya

    2017-01-01

    The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…

  12. Modeling Local Item Dependence Due to Common Test Format with a Multidimensional Rasch Model

    Science.gov (United States)

    Baghaei, Purya; Aryadoust, Vahid

    2015-01-01

    Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…

  13. Single item measures of emotional exhaustion and depersonalization are useful for assessing burnout in medical professionals.

    Science.gov (United States)

    West, Colin P; Dyrbye, Liselotte N; Sloan, Jeff A; Shanafelt, Tait D

    2009-12-01

    Burnout has negative effects on work performance and patient care. The current standard for burnout assessment is the Maslach Burnout Inventory (MBI), a well-validated instrument consisting of 22 items answered on a 7-point Likert scale. However, the length of the MBI can limit its utility in physician surveys. This study evaluated the performance of two questions relative to the full MBI for measuring burnout, using cross-sectional data from 2,248 medical students, 333 internal medicine residents, 465 internal medicine faculty, and 7,905 practicing surgeons. The single questions with the highest factor loading on the emotional exhaustion (EE) ("I feel burned out from my work") and depersonalization (DP) ("I have become more callous toward people since I took this job") domains of burnout were evaluated in these four large samples. Spearman correlations between the single EE question and the full EE domain score minus that question ranged from 0.76 to 0.83. Spearman correlations between the single DP question and the full DP domain score minus that question ranged from 0.61 to 0.72. Responses to the single-item measures of emotional exhaustion and depersonalization stratified risk of high burnout in the relevant domain on the full MBI, with consistent patterns across the four sampled groups. Single-item measures of emotional exhaustion and depersonalization provide meaningful information on burnout in medical professionals.
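
    The correlation between a single question and its domain score "minus that question" can be illustrated with a short sketch. The helper names and the response matrix below are hypothetical and are not taken from the study; Spearman's rho is computed here as the Pearson correlation of average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def item_rest_spearman(responses, item_index):
    """Correlate one item with the sum of the remaining items in its domain."""
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return spearman(item, rest)

# Hypothetical responses: rows are respondents, columns are items of one domain.
responses = [[0, 1, 1], [2, 3, 2], [4, 4, 5], [6, 6, 6]]
rho = item_rest_spearman(responses, 0)
```

    Removing the item from the total avoids inflating the correlation by correlating the item with itself.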

  14. An empirical comparison of Item Response Theory and Classical Test Theory

    Directory of Open Access Journals (Sweden)

    Špela Progar

    2008-11-01

    Based on nonlinear models between the measured latent variable and the item response, item response theory (IRT) enables independent estimation of item and person parameters and local estimation of measurement error. These properties of IRT are also the main theoretical advantages of IRT over classical test theory (CTT). Empirical evidence, however, often failed to discover consistent differences between IRT and CTT parameters and between invariance measures of CTT and IRT parameter estimates. In this empirical study a real data set from the Third International Mathematics and Science Study (TIMSS 1995) was used to address the following questions: (1) How comparable are CTT and IRT based item and person parameters? (2) How invariant are CTT and IRT based item parameters across different participant groups? (3) How invariant are CTT and IRT based item and person parameters across different item sets? The findings indicate that the CTT and the IRT item/person parameters are very comparable, that the CTT and the IRT item parameters show similar invariance property when estimated across different groups of participants, that the IRT person parameters are more invariant across different item sets, and that the CTT item parameters are at least as much invariant in different item sets as the IRT item parameters. The results furthermore demonstrate that, with regard to the invariance property, IRT item/person parameters are in general empirically superior to CTT parameters, but only if the appropriate IRT model is used for modelling the data.

  15. SINGLE HEATER TEST FINAL REPORT

    Energy Technology Data Exchange (ETDEWEB)

    J.B. Cho

    1999-05-01

    The Single Heater Test is the first of the in-situ thermal tests conducted by the U.S. Department of Energy as part of its program of characterizing Yucca Mountain in Nevada as the potential site for a proposed deep geologic repository for the disposal of spent nuclear fuel and high-level nuclear waste. The Site Characterization Plan (DOE 1988) contained an extensive plan of in-situ thermal tests aimed at understanding specific aspects of the response of the local rock-mass around the potential repository to the heat from the radioactive decay of the emplaced waste. With the refocusing of the Site Characterization Plan by the "Civilian Radioactive Waste Management Program Plan" (DOE 1994), a consolidated thermal testing program emerged by 1995 as documented in the reports "In-Situ Thermal Testing Program Strategy" (DOE 1995) and "Updated In-Situ Thermal Testing Program Strategy" (CRWMS M&O 1997a). The concept of the Single Heater Test took shape in the summer of 1995 and detailed planning and design of the test started at the beginning of fiscal year 1996. The overall objective of the Single Heater Test was to gain an understanding of the coupled thermal, mechanical, hydrological, and chemical processes that are anticipated to occur in the local rock-mass in the potential repository as a result of heat from radioactive decay of the emplaced waste. This included making a priori predictions of the test results using existing models and subsequently refining or modifying the models, on the basis of comparative and interpretive analyses of the measurements and predictions. A second, no less important, objective was to try out, in a full-scale field setting, the various instruments and equipment to be employed in the future on a much larger, more complex, thermal test of longer duration, such as the Drift Scale Test. This "shake down" or trial aspect of the Single Heater Test applied

  16. SINGLE HEATER TEST FINAL REPORT

    International Nuclear Information System (INIS)

    J.B. Cho

    1999-01-01

    The Single Heater Test is the first of the in-situ thermal tests conducted by the U.S. Department of Energy as part of its program of characterizing Yucca Mountain in Nevada as the potential site for a proposed deep geologic repository for the disposal of spent nuclear fuel and high-level nuclear waste. The Site Characterization Plan (DOE 1988) contained an extensive plan of in-situ thermal tests aimed at understanding specific aspects of the response of the local rock-mass around the potential repository to the heat from the radioactive decay of the emplaced waste. With the refocusing of the Site Characterization Plan by the "Civilian Radioactive Waste Management Program Plan" (DOE 1994), a consolidated thermal testing program emerged by 1995 as documented in the reports "In-Situ Thermal Testing Program Strategy" (DOE 1995) and "Updated In-Situ Thermal Testing Program Strategy" (CRWMS M&O 1997a). The concept of the Single Heater Test took shape in the summer of 1995 and detailed planning and design of the test started at the beginning of fiscal year 1996. The overall objective of the Single Heater Test was to gain an understanding of the coupled thermal, mechanical, hydrological, and chemical processes that are anticipated to occur in the local rock-mass in the potential repository as a result of heat from radioactive decay of the emplaced waste. This included making a priori predictions of the test results using existing models and subsequently refining or modifying the models, on the basis of comparative and interpretive analyses of the measurements and predictions. A second, no less important, objective was to try out, in a full-scale field setting, the various instruments and equipment to be employed in the future on a much larger, more complex, thermal test of longer duration, such as the Drift Scale Test. This "shake down" or trial aspect of the Single Heater Test applied not just to the hardware, but also to the teamwork and cooperation between

  17. Single event upset test programs

    International Nuclear Information System (INIS)

    Russen, L.C.

    1984-11-01

    It has been shown that the heavy ions in cosmic rays can give rise to single event upsets in VLSI random access memory devices (RAMs). Details are given of the programs written to test 1K, 4K, 16K and 64K memories during their irradiation with heavy charged ions, in order to simulate the effects of cosmic rays in space. The test equipment, which is used to load the memory device to be tested with a known bit pattern, and subsequently interrogate it for upsets, or "flips", is fully described. (author)

  18. [Difference analysis among majors in medical parasitology exam papers by test item bank proposition].

    Science.gov (United States)

    Jia, Lin-Zhi; Ya-Jun, Ma; Cao, Yi; Qian, Fen; Li, Xiang-Yu

    2012-04-30

    The quality indices of "Medical Parasitology" exam papers and the measured data for students in three majors from the university in 2010 were compared and analyzed. The exam papers were assembled from the test item bank. The alpha reliability coefficients of the three exam papers were above 0.70. The knowledge structure and capacity structure of the exam papers were basically balanced. However, the alpha reliability coefficient for the second major was the lowest, mainly owing to the quality of the test items in that exam paper and the failure to revise the index of the test item bank in time. This observation demonstrated that revising the test items and their index in the item bank according to the measured data can improve the quality of test item bank proposition and reduce the differences among exam papers.
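
    Assuming the "alpha reliability coefficient" mentioned above is Cronbach's alpha (the abstract does not say so explicitly), it can be sketched as follows. The score matrix is hypothetical and only illustrates the formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per examinee, one column per item."""
    n_items = len(scores[0])
    item_columns = [[row[j] for row in scores] for j in range(n_items)]
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical item scores for five examinees on four items (not data from the study).
scores = [
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
alpha = cronbach_alpha(scores)
```

    A value above 0.70, the threshold cited in the abstract, is a common rule of thumb for acceptable internal consistency.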

  19. Effects of Reducing the Cognitive Load of Mathematics Test Items on Student Performance

    Directory of Open Access Journals (Sweden)

    Susan C. Gillmor

    2015-01-01

    This study explores a new item-writing framework for improving the validity of math assessment items. The authors transfer insights from Cognitive Load Theory (CLT), traditionally used in instructional design, to educational measurement. Fifteen multiple-choice math assessment items were modified using research-based strategies for reducing extraneous cognitive load. An experimental design with 222 middle-school students tested the effects of the reduced cognitive load items on student performance and anxiety. Significant findings confirm the main research hypothesis that reducing the cognitive load of math assessment items improves student performance. Three load-reducing item modifications are identified as particularly effective for reducing item difficulty: signalling important information, aesthetic item organization, and removing extraneous content. Load reduction was not shown to impact student anxiety. Implications for classroom assessment and future research are discussed.

  20. The construction of parallel tests from IRT-based item banks

    NARCIS (Netherlands)

    Boekkooi-Timminga, Ellen

    1989-01-01

    The construction of parallel tests from item response theory (IRT) based item banks is discussed. Tests are considered parallel whenever their information functions are identical. After the methods for constructing parallel tests are considered, the computational complexity of 0-1 linear programming
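
    The parallelism criterion stated above (identical information functions) can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) model, which the truncated abstract does not specify; the item parameters and function names are hypothetical.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item with discrimination a and difficulty b."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def total_information(theta, items):
    """Information of a test at ability theta: the sum over its items."""
    return sum(item_information(theta, a, b) for a, b in items)

def max_information_gap(form_a, form_b, grid):
    """Largest absolute difference between two information functions on a grid."""
    return max(
        abs(total_information(t, form_a) - total_information(t, form_b)) for t in grid
    )

# Hypothetical (a, b) parameters for two candidate test forms.
form_a = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]
form_b = [(1.1, -0.4), (1.0, 0.1), (0.9, 0.4)]
grid = [x / 10 for x in range(-30, 31)]
gap = max_information_gap(form_a, form_b, grid)
```

    A gap near zero across the ability range would indicate that the two forms are approximately parallel in this sense.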

  1. Detection of Differential Item Functioning Using Lagrange Multiplier Tests. Research Report 96-02.

    Science.gov (United States)

    Glas, Cees A. W.

    In this paper it is shown that differential item functioning can be evaluated using the Lagrange multiplier test or C. R. Rao's efficient score test. The test is presented in the framework of a number of item response theory (IRT) models such as the Rasch model, the one-parameter logistic model, the two-parameter logistic model, the generalized…

  2. Application of item response theory to tests of substance-related associative memory.

    Science.gov (United States)

    Shono, Yusuke; Grenard, Jerry L; Ames, Susan L; Stacy, Alan W

    2014-09-01

    A substance-related word-association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WAT, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995).

  3. Above-Level Test Item Functioning across Examinee Age Groups

    Science.gov (United States)

    Warne, Russell T.; Doty, Kristine J.; Malbica, Anne Marie; Angeles, Victor R.; Innes, Scott; Hall, Jared; Masterson-Nixon, Kelli

    2016-01-01

    "Above-level testing" (also called "above-grade testing," "out-of-level testing," and "off-level testing") is the practice of administering to a child a test that is designed for an examinee population that is older or in a more advanced grade. Above-level testing is frequently used to help educators design…

  4. Development of a lack of appetite item bank for computer-adaptive testing (CAT).

    Science.gov (United States)

    Thamsborg, Lise Holst; Petersen, Morten Aa; Aaronson, Neil K; Chie, Wei-Chu; Costantini, Anna; Holzner, Bernhard; Verdonck-de Leeuw, Irma M; Young, Teresa; Groenvold, Mogens

    2015-06-01

    A significant proportion of oncological patients experiences lack of appetite. Precise measurement is relevant to improve the management of lack of appetite. The so-called computer-adaptive test (CAT) allows for adaptation of the questionnaire to the individual patient, thereby optimizing measurement precision. The EORTC Quality of Life Group is developing a CAT version of the widely used EORTC QLQ-C30 questionnaire. Here, we report on the development of the lack of appetite CAT. The EORTC approach to CAT development comprises four phases: literature search, operationalization, pre-testing, and field testing. Phases 1-3 are described in this paper. First, a list of items was retrieved from the literature. This was refined, deleting redundant and irrelevant items. Next, new items fitting the "QLQ-C30 item style" were created. These were evaluated by international samples of experts and cancer patients. The literature search generated a list of 146 items. After a comprehensive item selection procedure, the list was reduced to 24 items. These formed the basis for 21 new items fitting the QLQ-C30 item style. Expert evaluations (n = 10) and patient interviews (n = 49) reduced the list to 12 lack of appetite items. Phases 1-3 resulted in 12 lack of appetite candidate items. Based on field testing (phase 4), the psychometric characteristics of the items will be assessed and the final item bank will be generated. This CAT item bank is expected to provide precise and efficient measurement of lack of appetite while still being backward compatible with the original QLQ-C30 scale.

  5. Which single-item measures of overactive bladder symptom treatment correlate best with patient satisfaction?

    Science.gov (United States)

    Michel, Martin C; Oelke, Matthias; Vogel, Monika; de la Rosette, Jean J M C H

    2011-04-01

    While complex symptom scales are important research tools, simpler, preferably single-item scales may be more useful for routine clinical practice in the evaluation of patients with overactive bladder syndrome (OAB). This study aimed to compare multiple single-item scales at baseline and after treatment with patient-reported overall rating of treatment efficacy. In a pre-planned secondary analysis of a previously reported observational study, 4,450 patients were evaluated at baseline and after 12 weeks of open-label treatment with solifenacin. Apart from episode counting for classical OAB symptoms, the following single-item rating scales were applied: Indevus Urgency Severity Scale, Urgency Perception Scale, a Visual Analog Scale (VAS), quality of life question of the IPSS, and general health and bladder problem questions of the King's Health Questionnaire (KHQ). At baseline OAB symptoms correlated at best moderately with each other (r = 0.285-0.508) or with any of the rating scales (r = 0.060-0.399). Pair-wise correlations between treatment-associated symptom or scale improvements tended to be tighter (r = 0.225-0.588). When compared to patient-reported efficacy, the VAS (r = 0.487) and the bladder problem question of the KHQ (r = 0.452) showed the tightest correlation, whereas all symptom and rating scale improvements exhibited poor correlation with patient-reported tolerability (r ≤ 0.283). The VAS and the bladder problem question of the KHQ show the greatest promise as single-item scales to assess problem intensity in OAB patients.

  6. Robust Automated Test Assembly for Testlet-Based Tests: An Illustration with Analytical Reasoning Items

    Directory of Open Access Journals (Sweden)

    Bernard P. Veldkamp

    2017-12-01

    In many high-stakes testing programs, testlets are used to increase efficiency. Since responses to items belonging to the same testlet not only depend on the latent ability but also on correct reading, understanding, and interpretation of the stimulus, the assumption of local independence does not hold. Testlet response theory (TRT) models have been developed to deal with this dependency. For both logit and probit testlet models, a random testlet effect is added to the standard logit and probit item response theory (IRT) models. Even though this testlet effect might make the IRT models more realistic, application of these models in practice leads to new questions, for example, in automated test assembly (ATA). In many test assembly models, goals have been formulated for the amount of information the test should provide about the candidates. The amount of Fisher Information is often maximized or it has to meet a prespecified target. Since TRT models have a random testlet effect, Fisher Information contains a random effect as well. The question arises as to how this random effect in ATA should be dealt with. A method based on robust optimization techniques for dealing with uncertainty in test assembly due to random testlet effects is presented. The method is applied in the context of a high-stakes testing program, and the impact of this robust test assembly method is studied. Results are discussed, advantages of the use of robust test assembly are mentioned, and recommendations about the use of the new method are given.

  7. A Comparison of Multidimensional Item Selection Methods in Simple and Complex Test Designs

    Directory of Open Access Journals (Sweden)

    Eren Halil ÖZBERK

    2017-03-01

    In contrast with previous studies, this study employed various test designs (simple and complex), which allow the evaluation of overall ability score estimates across multiple real test conditions. Four factors were manipulated: the test design, the number of items per dimension, the correlation between dimensions, and the item selection method. Using the generated item and ability parameters, dichotomous item responses were generated using the M3PL compensatory multidimensional IRT model with specified correlations. MCAT composite ability score accuracy was evaluated using absolute bias (ABSBIAS), correlation, and the root mean square error (RMSE) between true and estimated ability scores. The results suggest that the multidimensional test structure, the number of items per dimension, and the correlation between dimensions had a significant effect on item selection methods for the overall score estimates. For the simple structure test design, V1 item selection had the lowest absolute bias for both long and short tests when estimating overall scores. As the model became more complex, the KL item selection method performed better than the other two item selection methods.
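
    The accuracy criteria named above can be sketched as follows. The abstract does not give formulas, so this assumes the common definitions (mean absolute difference and root mean squared difference between true and estimated scores); the function names and ability values are hypothetical.

```python
import math

def absolute_bias(true_scores, estimates):
    """Mean absolute difference between estimated and true ability scores."""
    return sum(abs(e - t) for t, e in zip(true_scores, estimates)) / len(true_scores)

def rmse(true_scores, estimates):
    """Root mean square error between estimated and true ability scores."""
    squared = [(e - t) ** 2 for t, e in zip(true_scores, estimates)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical true and estimated composite ability scores for four simulees.
true_theta = [0.0, 1.0, -1.0, 2.0]
estimated_theta = [0.5, 1.0, -2.0, 2.0]
bias = absolute_bias(true_theta, estimated_theta)
error = rmse(true_theta, estimated_theta)
```

    RMSE penalizes large individual errors more heavily than absolute bias, so the two criteria can rank item selection methods differently.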

  8. A review of methods for evaluating the fit of item score patterns on a test

    NARCIS (Netherlands)

    Meijer, R.R.; Sijtsma, Klaas

    1999-01-01

    Methods are discussed that can be used to investigate the fit of an item score pattern to a test model. Model-based tests and personality inventories are administered to more than 100 million people a year and, as a result, individual fit is of great concern. Item Response Theory (IRT) modeling and

  9. The Development and Management of Banks of Performance Based Test Items.

    Science.gov (United States)

    Curtis, H. A., Ed.

    Symposium papers presented at an Annual Meeting of the National Council on Measurement in Education (Chicago, 1972), all of which concern banks of test items for use in constructing criterion referenced tests, comprise this document. The first paper, "Locally Produced Item Banks" by Thomas J. Slocum, presents information on the…

  10. A Review of Methods for Detection of Test and Item Bias.

    Science.gov (United States)

    Breunig, Nancy A.

    Few issues have provoked as much controversy as the methods for detecting item and test bias. A recent illustration of the controversy surrounding this issue could be seen in the emotional reactions to the publication of "The Bell Curve." This paper reviews methods of evaluating both item and test bias. Small heuristic data sets are…

  11. Latent Trait Theory Applications to Test Item Bias Methodology. Research Memorandum No. 1.

    Science.gov (United States)

    Osterlind, Steven J.; Martois, John S.

    This study discusses latent trait theory applications to test item bias methodology. A real data set is used in describing the rationale and application of the Rasch probabilistic model item calibrations across various ethnic group populations. A high school graduation proficiency test covering reading comprehension, writing mechanics, and…

  12. Writing Multiple-Choice Test Items that Promote and Measure Critical Thinking.

    Science.gov (United States)

    Morrison, Susan; Free, Kathleen Walsh

    2001-01-01

    Presents guidelines for developing multiple-choice tests to measure critical thinking in nursing. Explains the rationale for test items and describes item criteria, including measurement of cognition at the application level and above, multilogical thinking, and high level of discrimination. (Contains 38 references.) (SK)

  13. Relationships among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models

    Science.gov (United States)

    Kohli, Nidhi; Koran, Jennifer; Henn, Lisa

    2015-01-01

    There are well-defined theoretical differences between the classical test theory (CTT) and item response theory (IRT) frameworks. It is understood that in the CTT framework, person and item statistics are test- and sample-dependent. This is not the perception with IRT. For this reason, the IRT framework is considered to be theoretically superior…

  14. Applications of NLP Techniques to Computer-Assisted Authoring of Test Items for Elementary Chinese

    Science.gov (United States)

    Liu, Chao-Lin; Lin, Jen-Hsiang; Wang, Yu-Chun

    2010-01-01

    The authors report an implemented environment for computer-assisted authoring of test items and provide a brief discussion about the applications of NLP techniques for computer assisted language learning. Test items can serve as a tool for language learners to examine their competence in the target language. The authors apply techniques for…

  15. Concurrent validity of single-item measures of emotional exhaustion and depersonalization in burnout assessment.

    Science.gov (United States)

    West, Colin P; Dyrbye, Liselotte N; Satele, Daniel V; Sloan, Jeff A; Shanafelt, Tait D

    2012-11-01

    Burnout is a common problem among physicians and physicians-in-training. The Maslach Burnout Inventory (MBI) is the gold standard for burnout assessment, but the length of this well-validated 22-item instrument can limit its feasibility for survey research. To evaluate the concurrent validity of two questions relative to the full MBI for measuring the association of burnout with published outcomes. DESIGN, PARTICIPANTS, AND MAIN MEASURES: The single questions "I feel burned out from my work" and "I have become more callous toward people since I took this job," representing the emotional exhaustion and depersonalization domains of burnout, respectively, were evaluated in published studies of medical students, internal medicine residents, and practicing surgeons. We compared predictive models for the association of each question, versus the full MBI, using longitudinal data on burnout and suicidality from 2006 and 2007 for 858 medical students at five United States medical schools, cross-sectional data on burnout and serious thoughts of dropping out of medical school from 2007 for 2222 medical students at seven United States medical schools, and cross-sectional data on burnout and unprofessional attitudes and behaviors from 2009 for 2566 medical students at seven United States medical schools. We also assessed results for longitudinal data on burnout and perceived major medical errors from 2003 to 2009 for 321 Mayo Clinic Rochester internal medicine residents and cross-sectional data on burnout and both perceived major medical errors and suicidality from 2008 for 7,905 respondents to a national survey of members of the American College of Surgeons. Point estimates of effect for models based on the single-item measures were uniformly consistent with those reported for models based on the full MBI. The single-item measures of emotional exhaustion and depersonalization exhibited strong associations with each published outcome (all p ≤ 0.008). No conclusion regarding

  16. Piecewise Polynomial Fitting with Trend Item Removal and Its Application in a Cab Vibration Test

    Directory of Open Access Journals (Sweden)

    Wu Ren

    2018-01-01

    The trend item of a long-term vibration signal is difficult to remove. This paper proposes a piecewise integration method to remove trend items. Examples of direct integration without trend item removal, global integration after piecewise polynomial fitting with trend item removal, and direct integration after piecewise polynomial fitting with trend item removal were simulated. The results showed that direct integration of the fitted piecewise polynomial provided greater acceleration and displacement precision than the other two integration methods. A vibration test was then performed on a special equipment cab. The results indicated that direct integration by piecewise polynomial fitting with trend item removal was highly consistent with the measured signal data. However, the direct integration method without trend item removal resulted in signal distortion. The proposed method can help with frequency domain analysis of vibration signals and modal parameter identification for such equipment.

  17. Postexamination analysis of objective tests using the three-parameter item response theory.

    Science.gov (United States)

    Tavakol, Mohsen; Rahimi-Madiseh, Mohammad; Dennick, Reg

    2014-01-01

    Although the importance of item response theory (IRT) has been emphasized in health and medical education, in practice, few psychometricians in nurse education have used these methods to create tests that discriminate well at any level of student ability. The purpose of this study is to evaluate the psychometric properties of a real objective test using three-parameter IRT. Three-parameter IRT was used to monitor and improve the quality of the test items. Item parameter indices, item characteristic curves (ICCs), test information functions, and test characteristic curves reveal aberrant items which do not assess the construct being measured. The results of this study provide useful information for educators to improve the quality of assessment, teaching strategies, and curricula.
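    As a minimal sketch of what a three-parameter item characteristic curve looks like, the following assumes the standard logistic 3PL form with discrimination a, difficulty b, and lower asymptote (guessing) c. The function name and the parameter values are hypothetical illustrations and are not taken from the study.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under an assumed logistic 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (chance of a correct answer by guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: four-option multiple choice, so c is set to 0.25.
a, b, c = 1.2, 0.0, 0.25

# Trace the curve at a few ability levels.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"theta={theta:+.1f}  P(correct)={p_correct_3pl(theta, a, b, c):.3f}")
```

    Under this form the curve flattens toward c for very low ability, and at theta = b the probability is (1 + c) / 2 rather than 0.5.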

  18. Development of an item bank for computerized adaptive test (CAT) measurement of pain.

    Science.gov (United States)

    Petersen, Morten Aa; Aaronson, Neil K; Chie, Wei-Chu; Conroy, Thierry; Costantini, Anna; Hammerlid, Eva; Hjermstad, Marianne J; Kaasa, Stein; Loge, Jon H; Velikova, Galina; Young, Teresa; Groenvold, Mogens

    2016-01-01

    Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured by the EORTC QLQ-C30 questionnaire. The development process consisted of four steps: (1) literature search, (2) formulation of new items and expert evaluations, (3) pretesting and (4) field-testing and psychometric analyses for the final selection of items. In step 1, we identified 337 pain items from the literature. In step 2, twenty-nine new items fitting the QLQ-C30 item style were formulated and then reduced to 26 items by expert evaluations. Based on interviews with 31 patients from Denmark, France and the UK, the list was further reduced to 21 items in step 3. In step 4, responses were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements by 15-25% compared to using the QLQ-C30 pain scale. We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ-C30 pain items. The EORTC pain CAT is currently available for "experimental" purposes.

  19. Using personality item characteristics to predict single-item reliability, retest reliability, and self-other agreement

    NARCIS (Netherlands)

    de Vries, Reinout Everhard; Realo, Anu; Allik, Jüri

    2016-01-01

    The use of reliability estimates is increasingly scrutinized as scholars become more aware that test–retest stability and self–other agreement provide a better approximation of the theoretical and practical usefulness of an instrument than its internal reliability. In this study, we investigate item

  20. Development of an item bank for computerized adaptive test (CAT) measurement of pain

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Aaronson, Neil K; Chie, Wei-Chu

    2016-01-01

    PURPOSE: Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured by the EORTC QLQ-C30 questionnaire. METHODS: The development process consisted of four steps: (1) literature search, (2) formulation of new items and expert evaluations, (3) pretesting and (4) field-testing and psychometric analyses for the final selection of items. RESULTS: In step 1, we identified 337 pain ... CONCLUSIONS: We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ ...

  1. A mathematical model for order splitting in a multiple supplier single-item inventory system

    DEFF Research Database (Denmark)

    Abginehchi, Soheil; Farahani, Reza Zanjirani; Rezapour, Shabnam

    2013-01-01

    The policy of simultaneously splitting replenishment orders among several suppliers has received considerable attention in the last few years and continues to attract the attention of researchers. In this paper, we develop a mathematical model which considers multiple-supplier single-item inventory systems. The item acquisition lead times of suppliers are random variables. Backorder is allowed and shortage cost is charged based on not only per unit in shortage but also per time unit. Continuous review (s,Q) policy has been assumed. When the inventory level depletes to a reorder level, the total ..., procurement cost, inventory holding cost, and shortage cost, is minimized. We also conduct extensive numerical experiments to show the advantages of our model compared with the models in the literature. According to our extensive experiments, the model developed in this paper is the best model ...

  2. Differential Item Functioning (DIF) among Spanish-Speaking English Language Learners (ELLs) in State Science Tests

    Science.gov (United States)

    Ilich, Maria O.

    Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets (groups of items referring to a scientific scenario) to measure knowledge of different science content or skills. Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the balance consisted of native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.

  3. Quantitative penetration testing with item response theory (extended version)

    NARCIS (Netherlands)

    Arnold, Florian; Pieters, Wolter; Stoelinga, Mariëlle Ida Antoinette

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Therefore, penetration testing has thus far been used as a qualitative research method. To enable quantitative approaches to security risk management,

  4. Discordancy tests for outlier detection in multi-item questionnaires

    NARCIS (Netherlands)

    Zijlstra, W.P.; van der Ark, L.A.; Sijtsma, K.

    2013-01-01

    The sensitivity and the specificity of four outlier scores were studied for four different discordancy tests. The outlier scores were the Mahalanobis distance, a robust version of the Mahalanobis distance, and two measures tailored to discrete data, known as O+ and G+. The discordancy tests were

  5. Quantitative Penetration Testing with Item Response Theory (extended version)

    NARCIS (Netherlands)

    Arnold, F.; Pieters, W.; Stoelinga, M.I.A.

    2013-01-01

    Existing penetration testing approaches assess the vulnerability of a system by determining whether certain attack paths are possible in practice. Thus, penetration testing has so far been used as a qualitative research method. To enable quantitative approaches to security risk management, including

  6. The effects of linguistic modification on ESL students' comprehension of nursing course test items.

    Science.gov (United States)

    Bosher, Susan; Bowles, Melissa

    2008-01-01

    Recent research has indicated that language may be a source of construct-irrelevant variance for non-native speakers of English, or English as a second language (ESL) students, when they take exams. As a result, exams may not accurately measure knowledge of nursing content. One accommodation often used to level the playing field for ESL students is linguistic modification, a process by which the reading load of test items is reduced while the content and integrity of the item are maintained. Research on the effects of linguistic modification has been conducted on examinees in the K-12 population, but is just beginning in other areas. This study describes the collaborative process by which items from a pathophysiology exam were linguistically modified and subsequently evaluated for comprehensibility by ESL students. Findings indicate that in a majority of cases, modification improved examinees' comprehension of test items. Implications for test item writing and future research are discussed.

  7. Effects of Calibration Sample Size and Item Bank Size on Ability Estimation in Computerized Adaptive Testing

    Science.gov (United States)

    Sahin, Alper; Weiss, David J.

    2015-01-01

    This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…

  8. Criterion-Referenced Test (CRT) Items for Air Conditioning, Heating and Refrigeration.

    Science.gov (United States)

    Davis, Diane, Ed.

    These criterion-referenced test (CRT) items for air conditioning, heating, and refrigeration are keyed to the Missouri Air Conditioning, Heating, and Refrigeration Competency Profile. The items are designed to work with both the Vocational Instructional Management System and Vocational Administrative Management System. For word processing and…

  9. The Prediction of Item Parameters Based on Classical Test Theory and Latent Trait Theory

    Science.gov (United States)

    Anil, Duygu

    2008-01-01

    In this study, the predictive power of experts' predictions of item characteristics, for conditions in which try-out practices cannot be applied, was examined for item characteristics computed according to classical test theory and the two-parameter logistic model of latent trait theory. The study was carried out on 9914 randomly selected students…

  10. Assessing the Performance of Classical Test Theory Item Discrimination Estimators in Monte Carlo Simulations

    Science.gov (United States)

    Bazaldua, Diego A. Luna; Lee, Young-Sun; Keller, Bryan; Fellers, Lauren

    2017-01-01

    The performance of various classical test theory (CTT) item discrimination estimators has been compared in the literature using both empirical and simulated data, resulting in mixed results regarding the preference of some discrimination estimators over others. This study analyzes the performance of various item discrimination estimators in CTT:…

  11. Development of an item bank and computer adaptive test for role functioning.

    Science.gov (United States)

    Anatchkova, Milena D; Rose, Matthias; Ware, John E; Bjorner, Jakob B

    2012-11-01

    Role functioning (RF) is a key component of health and well-being and an important outcome in health research. The aim of this study was to develop an item bank to measure impact of health on role functioning. A set of different instruments including 75 newly developed items asking about the impact of health on role functioning was completed by 2,500 participants. Established item response theory methods were used to develop an item bank based on the generalized partial credit model. Comparison of group mean bank scores of participants with different self-reported general health status and chronic conditions was used to test the external validity of the bank. After excluding items that did not meet established requirements, the final item bank consisted of a total of 64 items covering three areas of role functioning (family, social, and occupational). Slopes in the bank ranged between .93 and 4.37; the mean threshold range was -1.09 to -2.25. Item bank-based scores were significantly different for participants with and without chronic conditions and with different levels of self-reported general health. An item bank assessing health impact on RF across three content areas has been successfully developed. The bank can be used for development of short forms or computerized adaptive tests to be applied in the assessment of role functioning as one of the common denominators across applications of generic health assessment.

  12. A comparison of discriminant logistic regression and Item Response Theory Likelihood-Ratio Tests for Differential Item Functioning (IRTLRDIF) in polytomous short tests.

    Science.gov (United States)

    Hidalgo, María D; López-Martínez, María D; Gómez-Benito, Juana; Guilera, Georgina

    2016-01-01

    Short scales are typically used in the social, behavioural and health sciences. This is relevant since test length can influence whether items showing DIF are correctly flagged. This paper compares the relative effectiveness of discriminant logistic regression (DLR) and IRTLRDIF for detecting DIF in polytomous short tests. A simulation study was designed. Test length, sample size, DIF amount and number of item response categories were manipulated. Type I error and power were evaluated. IRTLRDIF and DLR yielded Type I error rates close to nominal level in no-DIF conditions. Under DIF conditions, Type I error rates were affected by test length, DIF amount, degree of test contamination, sample size and number of item response categories. DLR showed a higher Type I error rate than did IRTLRDIF. Power rates were affected by DIF amount and sample size, but not by test length. DLR achieved higher power rates than did IRTLRDIF in very short tests, although the high Type I error rate involved means that this result cannot be taken into account. Test length had an important impact on the Type I error rate. IRTLRDIF and DLR showed a low power rate in short tests and with small sample sizes.

  13. Modeling Information Accumulation in Psychological Tests Using Item Response Times

    Science.gov (United States)

    Ranger, Jochen; Kuhn, Jörg-Tobias

    2015-01-01

    In this article, a latent trait model is proposed for the response times in psychological tests. The latent trait model is based on the linear transformation model and subsumes popular models from survival analysis, like the proportional hazards model and the proportional odds model. The core of the model is the assumption that an unspecified monotone…

  14. The quadratic relationship between difficulty of intelligence test items and their correlations with working memory

    Directory of Open Access Journals (Sweden)

    Tomasz Smoleń

    2015-08-01

    Fluid intelligence (Gf) is a crucial cognitive ability that involves abstract reasoning in order to solve novel problems. Recent research demonstrated that Gf strongly depends on the individual effectiveness of working memory (WM). We investigated a popular claim that if the storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing number of items or rules (load) in a Gf test. As often no such link is observed, on that basis the storage-capacity account is rejected, and alternative accounts of Gf (e.g., related to executive control or processing speed) are proposed. Using both analytical inference and numerical simulations, we demonstrated that the load-dependent change in correlation is primarily a function of the amount of floor/ceiling effect for particular items. Thus, the item-wise WM correlation of a Gf test depends on its overall difficulty, and the difficulty distribution across its items. When the early test items yield huge ceiling, but the late items do not approach floor, that correlation will increase throughout the test. If the early items locate themselves between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For a hallmark Gf test, the Raven test, whose items span from ceiling to floor, the quadratic relationship is expected, and it was shown empirically using a large sample and two types of WMC tasks. In consequence, no changes in correlation due to varying WM/Gf load, or lack of them, can yield an argument for or against any theory of WM/Gf. Moreover, as the mathematical properties of the correlation formula make it relatively immune to ceiling/floor effects for overall moderate correlations, only minor changes (if any) in the WM-Gf correlation should be expected for many psychological tests.

  15. Ability or Access-Ability: Differential Item Functioning of Items on Alternate Performance-Based Assessment Tests for Students with Visual Impairments

    Science.gov (United States)

    Zebehazy, Kim T.; Zigmond, Naomi; Zimmerman, George J.

    2012-01-01

    Introduction: This study investigated differential item functioning (DIF) of test items on Pennsylvania's Alternate System of Assessment (PASA) for students with visual impairments and severe cognitive disabilities and what the reasons for the differences may be. Methods: The Wilcoxon signed ranks test was used to analyze differences in the scores…

  16. Assessment of Fatigue in Rheumatoid Arthritis: A Psychometric Comparison of Single-item, Multiitem, and Multidimensional Measures

    NARCIS (Netherlands)

    Oude Voshaar, M.A.H.; Klooster, P.M. ten; Bode, C.; Vonkeman, H.E.; Glas, C.A.; Jansen, T.L.Th.A.; Albada-Kuipers, I. van; Riel, P.L.C.M. van; Laar, M.A. van der

    2015-01-01

    OBJECTIVE: To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). METHODS: Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to

  17. Assessment of fatigue in rheumatoid arthritis: a psychometric comparison of single-item, multiitem, and multidimensional measures

    NARCIS (Netherlands)

    Oude Voshaar, Antonius H.; ten Klooster, Peter M.; Bode, Christina; Vonkeman, Harald Erwin; Glas, Cornelis A.W.; Jansen, Tim; van Albeda-Kuijpers, Iet; van Riel, Piet L.C.M.; van de Laar, Mart A F J

    2015-01-01

    OBJECTIVE: To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). METHODS: Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to

  18. Comparison of classical test theory and item response theory in individual change assessment

    NARCIS (Netherlands)

    Jabrayilov, R.; Emons, W.H.M.; Sijtsma, K.

    2016-01-01

    Clinical psychologists are advised to assess clinical and statistical significance when assessing change in individual patients. Individual change assessment can be conducted using either the methodologies of classical test theory (CTT) or item response theory (IRT). Researchers have been optimistic

  19. Comparison of Classical Test Theory and Item Response Theory in Individual Change Assessment

    NARCIS (Netherlands)

    Jabrayilov, Ruslan; Emons, Wilco H. M.; Sijtsma, Klaas

    2016-01-01

    Clinical psychologists are advised to assess clinical and statistical significance when assessing change in individual patients. Individual change assessment can be conducted using either the methodologies of classical test theory (CTT) or item response theory (IRT). Researchers have been optimistic

  20. Development of a lack of appetite item bank for computer-adaptive testing (CAT)

    DEFF Research Database (Denmark)

    Thamsborg, Lise Laurberg Holst; Petersen, Morten Aa; Aaronson, Neil K

    2015-01-01

    PURPOSE: A significant proportion of oncological patients experiences lack of appetite. Precise measurement is relevant to improve the management of lack of appetite. The so-called computer-adaptive test (CAT) allows for adaptation of the questionnaire to the individual patient, thereby optimizing measurement precision. The EORTC Quality of Life Group is developing a CAT version of the widely used EORTC QLQ-C30 questionnaire. Here, we report on the development of the lack of appetite CAT. METHODS: The EORTC approach to CAT development comprises four phases: literature search, operationalization, pre-testing, and field testing. Phases 1-3 are described in this paper. First, a list of items was retrieved from the literature. This was refined, deleting redundant and irrelevant items. Next, new items fitting the "QLQ-C30 item style" were created. These were evaluated by international samples of experts and cancer...

  1. Classification Accuracy of Mixed Format Tests: A Bi-Factor Item Response Theory Approach.

    Science.gov (United States)

    Wang, Wei; Drasgow, Fritz; Liu, Liwen

    2016-01-01

    Mixed format tests (e.g., a test consisting of multiple-choice [MC] items and constructed response [CR] items) have become increasingly popular. However, the latent structure of item pools consisting of the two formats is still equivocal. Moreover, the implications of this latent structure are unclear: For example, do constructed response items tap reasoning skills that cannot be assessed with multiple choice items? This study explored the dimensionality of mixed format tests by applying bi-factor models to 10 tests of various subjects from the College Board's Advanced Placement (AP) Program and compared the accuracy of scores based on the bi-factor analysis with scores derived from a unidimensional analysis. More importantly, this study focused on a practical and important question: classification accuracy of the overall grade on a mixed format test. Our findings revealed that the degree of multidimensionality resulting from the mixed item format varied from subject to subject, depending on the disattenuated correlation between scores from MC and CR subtests. Moreover, remarkably small decrements in classification accuracy were found for the unidimensional analysis when the disattenuated correlations exceeded 0.90.

  2. Item response theory analysis of cognitive tests in people with dementia: a systematic review.

    Science.gov (United States)

    McGrory, Sarah; Doherty, Jason M; Austin, Elizabeth J; Starr, John M; Shenkin, Susan D

    2014-02-19

    Performance on psychometric tests is key to diagnosis and monitoring treatment of dementia. Results are often reported as a total score, but there is additional information in individual items of tests which vary in their difficulty and discriminatory value. Item difficulty refers to an ability level at which the probability of responding correctly is 50%. Discrimination is an index of how well an item can differentiate between patients of varying levels of severity. Item response theory (IRT) analysis can use this information to examine and refine measures of cognitive functioning. This systematic review aimed to identify all published literature which had applied IRT to instruments assessing global cognitive function in people with dementia. A systematic review was carried out across Medline, Embase, PsycINFO and CINAHL articles. Search terms relating to IRT and dementia were combined to find all IRT analyses of global functioning scales of dementia. Of 384 articles identified, four studies met inclusion criteria, including a total of 2,920 people with dementia from six centers in two countries. These studies used three cognitive tests (MMSE, ADAS-Cog, BIMCT) and three IRT methods (Item Characteristic Curve analysis, Samejima's graded response model, the 2-Parameter Model). Memory items were most difficult. Naming the date in the MMSE and memory items, specifically word recall, of the ADAS-Cog were most discriminatory. Four published studies were identified which used IRT on global cognitive tests in people with dementia. This technique increased the interpretative power of the cognitive scales, and could be used to provide clinicians with key items from a larger test battery which would have high predictive value. There is need for further studies using IRT in a wider range of tests involving people with dementia of different etiology and severity.
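    The definitions of difficulty and discrimination given in this abstract can be illustrated with a minimal sketch. It assumes the common logistic two-parameter (2PL) form, and the function name and item parameters are hypothetical rather than drawn from any of the reviewed studies.

```python
import math


def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under an assumed logistic 2PL model.

    theta: ability level; a: discrimination; b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Two hypothetical items with the same difficulty but different discrimination.
b = 0.5
for a in (0.5, 2.0):
    below = p_correct_2pl(b - 1.0, a, b)   # ability one unit below difficulty
    at_b = p_correct_2pl(b, a, b)          # ability equal to difficulty
    above = p_correct_2pl(b + 1.0, a, b)   # ability one unit above difficulty
    print(f"a={a}: P(b-1)={below:.3f}  P(b)={at_b:.3f}  P(b+1)={above:.3f}")
```

    In this form the probability is exactly 0.5 when ability equals b, matching the definition of difficulty above, and a larger a produces a wider gap between the probabilities just below and just above b, which is what makes an item more discriminating.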

  3. Measuring single constructs by single items: Constructing an even shorter version of the “Short Five” personality inventory

    Science.gov (United States)

    Konstabel, Kenn; Lönnqvist, Jan-Erik; Leikas, Sointu; García Velázquez, Regina; Qin, Hiaying; Verkasalo, Markku; Walkowitz, Gari

    2017-01-01

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item “Short Five” (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the full version

  4. Measuring single constructs by single items: Constructing an even shorter version of the "Short Five" personality inventory.

    Directory of Open Access Journals (Sweden)

    Kenn Konstabel

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item "Short Five" (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the

  5. Measuring single constructs by single items: Constructing an even shorter version of the "Short Five" personality inventory.

    Science.gov (United States)

    Konstabel, Kenn; Lönnqvist, Jan-Erik; Leikas, Sointu; García Velázquez, Regina; Qin, Hiaying; Verkasalo, Markku; Walkowitz, Gari

    2017-01-01

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item "Short Five" (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the full version of

  6. A Comparison of Procedures for Content-Sensitive Item Selection in Computerized Adaptive Tests.

    Science.gov (United States)

    Kingsbury, G. Gage; Zara, Anthony R.

    1991-01-01

    This simulation investigated two procedures that reduce differences between paper-and-pencil testing and computerized adaptive testing (CAT) by making CAT content sensitive. Results indicate that the price in terms of additional test items of using constrained CAT for content balancing is much smaller than that of using testlets. (SLD)

  7. Test of Achievement in Quantitative Economics for Secondary Schools: Construction and Validation Using Item Response Theory

    Science.gov (United States)

    Eleje, Lydia I.; Esomonu, Nkechi P. M.

    2018-01-01

    A test to measure achievement in quantitative economics among secondary school students was developed and validated in this study. The test is made up of 20 multiple-choice test items constructed based on quantitative economics sub-skills. Six research questions guided the study. Preliminary validation was done by two experienced teachers in…

  8. An Integer-Programming Approach to Item Pool Design. Law School Admission Council Computerized Testing Report. LSAC Research Report Series.

    Science.gov (United States)

    van der Linden, Wim J.; Veldkamp, Bernard P.; Reese, Lynda M.

    Presented is an integer-programming approach to item pool design that can be used to calculate an optimal blueprint for an item pool to support an existing testing program. The results are optimal in the sense that they minimize the efforts involved in actually producing the items as revealed by current item writing patterns. Also presented is an…

  9. Analysis Test of Understanding of Vectors with the Three-Parameter Logistic Model of Item Response Theory and Item Response Curves Technique

    Science.gov (United States)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-01-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming…

  10. Development of abbreviated eight-item form of the Penn Verbal Reasoning Test.

    Science.gov (United States)

    Bilker, Warren B; Wierzbicki, Michael R; Brensinger, Colleen M; Gur, Raquel E; Gur, Ruben C

    2014-12-01

    The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance.

  11. Development of Abbreviated Eight-Item Form of the Penn Verbal Reasoning Test

    Science.gov (United States)

    Bilker, Warren B.; Wierzbicki, Michael R.; Brensinger, Colleen M.; Gur, Raquel E.; Gur, Ruben C.

    2014-01-01

    The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance. PMID:24577310
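
    As an illustration of the general idea in the record above (a short form chosen on one half of a sample and checked on the other half), the following Python sketch may help. It is not the authors' reduction procedure, which the abstract does not specify: the greedy forward selection, the simulated Rasch-type response data, and all function names (`pearson`, `subset_total_r`, `greedy_short_form`, `simulate`) are assumptions made for this example only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def subset_total_r(data, subset):
    """Correlation between the short-form score and the full total score."""
    totals = [sum(row) for row in data]
    short = [sum(row[j] for j in subset) for row in data]
    return pearson(short, totals)


def greedy_short_form(data, k):
    """Assumed selection rule: repeatedly add the item that most raises the correlation."""
    n_items = len(data[0])
    chosen = []
    while len(chosen) < k:
        candidates = [j for j in range(n_items) if j not in chosen]
        best = max(candidates, key=lambda j: subset_total_r(data, chosen + [j]))
        chosen.append(best)
    return chosen


def simulate(n_persons, n_items, rng):
    """Hypothetical 0/1 responses from a Rasch-type model (illustration only)."""
    difficulties = [rng.uniform(-2.0, 2.0) for _ in range(n_items)]
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        row = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(theta - b))) else 0
               for b in difficulties]
        data.append(row)
    return data


rng = random.Random(0)
data = simulate(400, 29, rng)
fit, val = data[:200], data[200:]          # split sample: fitting half, validation half

short_form = greedy_short_form(fit, 8)     # choose 8 of 29 items on the fitting half
r_fit = subset_total_r(fit, short_form)
r_val = subset_total_r(val, short_form)    # check the same items on the held-out half
print(sorted(short_form), round(r_fit, 3), round(r_val, 3))
```

    On real data, `fit` and `val` would be the two halves of the observed item-response matrix. The correlations printed here come from simulated data and say nothing about the PVRT itself.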

  12. Fuzzy Decision-Making Approach in Geometric Programming for a Single Item EOQ Model

    Directory of Open Access Journals (Sweden)

    Monalisha Pattnaik

    2015-06-01

    Background and methods: A fuzzy decision-making approach is applied in geometric programming to a single-item EOQ model with dynamic ordering cost and demand-dependent unit cost. The setup cost varies with the quantity produced or purchased, and the modification of the objective function with storage area in the presence of imprecisely estimated parameters is investigated. The model incorporates the concepts of a fuzzy arithmetic approach, and the quantity ordered and the demand per unit are compared between the fuzzy geometric programming technique and other models for linear membership functions. Results and conclusions: Investigating the properties of an optimal solution allows an algorithm to be developed; its validity is illustrated through an example problem, and the results are discussed. Sensitivity analysis of the optimal solution with respect to changes in different parameter values is also studied.

  13. Item Response Theory Model Empat Parameter Logistik Pada Computerized Adaptive Test [Four-Parameter Logistic Item Response Theory Model in Computerized Adaptive Testing]

    Directory of Open Access Journals (Sweden)

    Aslam Fatkhudin

    2016-01-01

    One form of computer-based testing is the Computerized Adaptive Test (CAT), a system in which the items given to participants are adapted to their ability. The assessment method usually applied in CAT is Item Response Theory (IRT). The IRT model most commonly used today is the three-parameter logistic (3PL) model, whose parameters are discrimination, difficulty, and guessing. However, 3PL IRT models do not yet provide sufficiently objective information about participants' ability; the participants' opinions of the tested items should also be considered. This study therefore combined CAT with a 4PL IRT model, developing a CAT that uses four parameters: discrimination, difficulty, guessing, and questionnaires. The questions used were from the UAS 1 English subject. To estimate the item parameters, samples were taken from the answers of the 30 students with the best scores out of a total of 172 students spread across 6 classes. The CAT application with the 4PL IRT model was then tested against the CAT with the 3PL IRT model. The results show that the CAT application combined with the 4PL IRT model can measure test takers' ability in less time, and that the probability of participants answering the test items correctly tends to be better than with the 3PL IRT model. Keywords: Ability; CAT; IRT; 3PL; 4PL; Probability; Test

  14. Cognitive Testing of Tobacco Use Items for Administration to Cancer Patients and Survivors in Clinical Research

    Science.gov (United States)

    Land, Stephanie R.; Warren, Graham W.; Crafts, Jennifer; Hatsukami, Dorothy; Ostroff, Jamie S.; Willis, Gordon; Chollette, Veronica; Mitchell, Sandra A.; Folz, Jasmine; Gulley, James L.; Szabo, Eva; Brandon, Thomas H.; Duffy, Sonia; Toll, Benjamin

    2017-01-01

    Background There are currently no standardized measures of tobacco use and secondhand smoke exposure in patients diagnosed with cancer, and this gap hinders the conduct of studies examining the impact of tobacco on cancer treatment outcomes. Our objective was to evaluate and refine questionnaire items proposed by an expert task force to assess tobacco use. Methods Trained interviewers conducted cognitive testing with cancer patients age 21 or older with a history of tobacco use and cancer diagnosis of any stage and organ site, recruited at the National Institutes of Health Clinical Center (Bethesda, MD). Iterative rounds of testing and item modification were conducted to identify and resolve cognitive issues (comprehension, memory retrieval, decision/judgment, response mapping) and instrument navigation issues until no items warranted further significant modification. Results Thirty participants (6 current cigarette smokers, 1 current cigar smoker, 23 former cigarette smokers) were enrolled from September 2014 to February 2015. Most items functioned well. However, qualitative testing identified wording ambiguities related to cancer diagnosis and treatment trajectory, such as “treatment” and “surgery”; difficulties with lifetime recall; errors in estimating quantities; and difficulties with instrument navigation. Revisions to item wording, format, order, response options, and instructions resulted in a questionnaire that demonstrated navigational ease as well as good question comprehension and response accuracy. Conclusions The NCI-AACR Cancer Patient Tobacco Use Questionnaire (C-TUQ) can be utilized as a standardized item set to accelerate investigation of tobacco use in the cancer setting. PMID:27019325

  15. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    Directory of Open Access Journals (Sweden)

    Suttida Rakkapao

    2016-10-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
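
    The three-parameter logistic model named in the record above has a standard form, P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), with discrimination a, difficulty b, and guessing parameter c. A minimal Python sketch follows. The item parameter values are hypothetical and are not taken from the TUV analysis, and some presentations include a scaling constant D = 1.7 in the exponent, which is omitted here.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability; a: discrimination; b: difficulty; c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item (values chosen for illustration only).
item = {"a": 1.2, "b": 0.0, "c": 0.2}

# At theta == b the probability is c + (1 - c) / 2, which is 0.6 for this item.
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_3pl(theta, **item), 3))
```

    Evaluating the function over a grid of ability values traces the item characteristic curve: it rises from the guessing floor c toward 1, and is steepest near θ = b.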

  16. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  17. Test-retest reliability of Eurofit Physical Fitness items for children with visual impairments

    NARCIS (Netherlands)

    Houwen, Suzanne; Visscher, Chris; Hartman, Esther; Lemmink, Koen A. P. M.

    The purpose of this study was to examine the test-retest reliability of physical fitness items from the European Test of Physical Fitness (Eurofit) for children with visual impairments. A sample of 21 children, ages 6-12 years, that were recruited from a special school for children with visual

  18. Using Item Response Theory to Evaluate Measurement Precision of Selection Tests at the French Pilot Training

    NARCIS (Netherlands)

    Veldhuis, M.; Matton, N.; Vautier, S.

    2012-01-01

    In pilot selection settings, decisions are often based on cutoff scores. In item response theory the measurement precision of a test score can be evaluated by its degree of information. We investigated whether the maximum of test information corresponded to the cutoff zone for 10 cognitive ability

  19. Influences of Item Content and Format on the Dimensionality of Tests Combining Multiple-Choice and Open-Response Items: An Application of the Poly-DIMTEST Procedure.

    Science.gov (United States)

    Perkhounkova, Yelena; Dunbar, Stephen B.

    The DIMTEST statistical procedure was used in a confirmatory manner to explore the dimensionality structures of three kinds of achievement tests: multiple-choice tests, constructed-response tests, and tests combining both formats. The DIMTEST procedure is based on estimating conditional covariances of the responses to the item pairs. The analysis…

  20. Model choice and sample size in item response theory analysis of aphasia tests.

    Science.gov (United States)

    Hula, William D; Fergadiotis, Gerasimos; Martin, Nadine

    2012-05-01

    The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models. Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from individuals with aphasia were analyzed, and the resulting item and person estimates were used to develop simulated test data for 3 sample size conditions. The simulated data were analyzed using a standard 1-parameter logistic (1-PL) model and 3 models that accounted for the influence of guessing: augmented 1-PL and 2-PL models and a 3-PL model. The model estimates obtained from the simulated data were compared to their known true values. With small and medium sample sizes, an augmented 1-PL model was the most accurate at recovering the known item and person parameters; however, no model performed well at any sample size. Follow-up simulations confirmed that the large influence of guessing and the extreme easiness of the items contributed substantially to the poor estimation of item difficulty and person ability. Incorporating the assumption of guessing into IRT models improves parameter estimation accuracy, even for small samples. However, caution should be exercised in interpreting scores obtained from easy 2-choice tests, regardless of whether IRT modeling or percentage correct scoring is used.

  1. Fostering a student's skill for analyzing test items through an authentic task

    Science.gov (United States)

    Setiawan, Beni; Sabtiawan, Wahyu Budi

    2017-08-01

    Analyzing test items is a skill that prospective teachers must master in order to determine the quality of the test questions they have written. The main aim of this research was to describe the effectiveness of an authentic task in fostering students' skill in analyzing test items, covering validity, reliability, item discrimination index, level of difficulty, and distractor functioning. The participants were students of the science education study program, science and mathematics faculty, Universitas Negeri Surabaya, enrolled in an assessment course. The research used a one-group posttest design. As the treatment, the students were given an authentic task in which they developed test items and then analyzed the items like a professional assessor, using Microsoft Excel and Anates software. The data were analyzed descriptively: the students' skill data were displayed and then related to theories or previous empirical studies. The research showed that the task helped the students acquire the skills. Thirty-one students obtained a perfect score for the analysis, five students achieved 97% mastery, two students achieved 92% mastery, and another two students achieved 89% and 79% mastery. The finding implies that students who are given authentic tasks requiring them to perform like professionals are more likely to have achieved professional skills by the end of learning.

  2. Examining the Impact of Drifted Polytomous Anchor Items on Test Characteristic Curve (TCC) Linking and IRT True Score Equating. Research Report. ETS RR-12-09

    Science.gov (United States)

    Li, Yanmei

    2012-01-01

    In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…

  3. Using response-time constraints in item selection to control for differential speededness in computerized adaptive testing

    NARCIS (Netherlands)

    van der Linden, Willem J.; Scrams, David J.; Schnipke, Deborah L.

    2003-01-01

    This paper proposes an item selection algorithm that can be used to neutralize the effect of time limits in computer adaptive testing. The method is based on a statistical model for the response-time distributions of the test takers on the items in the pool that is updated each time a new item has

  4. A single-item global job satisfaction measure is associated with quantitative blood immune indices in white-collar employees.

    Science.gov (United States)

    Nakata, Akinori; Irie, Masahiro; Takahashi, Masaya

    2013-01-01

    Although a single-item job satisfaction measure has been shown to be reliable and inclusive as multiple-item scales in relation to health, studies including immunological data are few. The purpose of this study was to evaluate the validity of single-item job and family life satisfaction based on its association with immune indices. A total of 189 white-collar employees (70% men) underwent a blood draw for the measurement of natural killer (NK), total T, and B cell counts as well as plasma immunoglobulin (Ig) G concentrations and completed single-item job and family life satisfaction measures, respectively. The response options for satisfaction measures were 'dissatisfied' (coded 1) to 'satisfied' (coded 4). Spearman's partial correlations controlling for cofactors revealed that increased job satisfaction was positively associated with NK cells (rsp=0.201, p=0.007) and IgG (rsp=0.178, p=0.018), while family life satisfaction was unrelated to immune indices. Those who reported a combination of low job/low family life satisfaction had significantly lower NK and higher B cell counts than those with a high job/high family life satisfaction. Our study suggests that the single-item summary measure of job satisfaction, but not family life satisfaction, may be a valid tool to evaluate immune status in healthy white-collar employees.

  5. Measuring social health in the patient-reported outcomes measurement information system (PROMIS): item bank development and testing.

    Science.gov (United States)

    Hahn, Elizabeth A; Devellis, Robert F; Bode, Rita K; Garcia, Sofia F; Castel, Liana D; Eisen, Susan V; Bosworth, Hayden B; Heinemann, Allen W; Rothrock, Nan; Cella, David

    2010-09-01

    To develop a social health measurement framework, to test items in diverse populations and to develop item response theory (IRT) item banks. A literature review guided framework development of Social Function and Social Relationships sub-domains. Items were revised based on patient feedback, and Social Function items were field-tested. Analyses included exploratory factor analysis (EFA), confirmatory factor analysis (CFA), two-parameter IRT modeling and evaluation of differential item functioning (DIF). The analytic sample included 956 general population respondents who answered 56 Ability to Participate and 56 Satisfaction with Participation items. EFA and CFA identified three Ability to Participate sub-domains. However, because of positive and negative wording, and content redundancy, many items did not fit the IRT model, so item banks do not yet exist. EFA, CFA and IRT identified two preliminary Satisfaction item banks. One item exhibited trivial age DIF. After extensive item preparation and review, EFA-, CFA- and IRT-guided item banks help provide increased measurement precision and flexibility. Two Satisfaction short forms are available for use in research and clinical practice. This initial validation study resulted in revised item pools that are currently undergoing testing in new clinical samples and populations.

  6. Science Literacy: How do High School Students Solve PISA Test Items?

    Science.gov (United States)

    Wati, F.; Sinaga, P.; Priyandoko, D.

    2017-09-01

    The Programme for International Student Assessment (PISA) assesses students' science literacy in real-life contexts and a wide variety of situations. Its results therefore do not give teachers adequate information for exploring students' science literacy, because the range of material taught at schools depends on the curriculum used. This study aims to investigate how junior high school students in Indonesia solve PISA test items. Data were collected by administering the PISA test items of the greenhouse unit to 36 ninth-grade students. The students' answers were analyzed qualitatively for each item, based on the competence tested in the problem. The way students answer a problem exhibits their ability in a particular competence, which is influenced by a number of factors: unfamiliarity with the test construction, low reading performance, difficulty connecting the available information with the question, and limited ability to express their ideas effectively and readably. As a remedy, selected PISA test items can be used in line with the topic being taught to familiarize students with science literacy.

  7. The effect of heightened awareness of observation on consumption of a multi-item laboratory test meal in females.

    Science.gov (United States)

    Robinson, Eric; Proctor, Michael; Oldham, Melissa; Masic, Una

    2016-09-01

    Human eating behaviour is often studied in the laboratory, but whether the extent to which a participant believes that their food intake is being measured influences consumption of different meal items is unclear. Our main objective was to examine whether heightened awareness of observation of food intake affects consumption of different food items during a lunchtime meal. One hundred and fourteen female participants were randomly assigned to an experimental condition designed to heighten participant awareness of observation or a condition in which awareness of observation was lower, before consuming an ad libitum multi-item lunchtime meal in a single session study. Under conditions of heightened awareness, participants tended to eat less of an energy dense snack food (cookies) in comparison to the less aware condition. Consumption of other meal items and total energy intake were similar in the heightened awareness vs. less aware condition. Exploratory secondary analyses suggested that the effect heightened awareness had on reduced cookie consumption was dependent on weight status, as well as trait measures of dietary restraint and disinhibition, whereby only participants with overweight/obesity, high disinhibition or low restraint reduced their cookie consumption. Heightened awareness of observation may cause females to reduce their consumption of an energy dense snack food during a test meal in the laboratory and this effect may be moderated by participant individual differences.

  8. The validity of the Satisfaction with Life Scale in adolescents and a comparison with single-item life satisfaction measures: a preliminary study.

    Science.gov (United States)

    Jovanović, Veljko

    2016-12-01

    The validity of the life satisfaction measures commonly used among adults has been rarely examined in adolescent samples. The present research had two main goals: (1) to evaluate the structural validity of the Satisfaction with Life Scale (SWLS) among adolescents and to test measurement invariance across gender; (2) to compare the criterion and convergent validity of the SWLS and single-item life satisfaction measures among adolescents. Three samples of Serbian adolescents were recruited for the present research. Study 1 (N = 481, M_age = 17.01 years) examined the structure of the SWLS via confirmatory factor analysis (CFA) and evaluated measurement invariance of the SWLS across gender by a multi-group CFA. Study 2 (N = 283, M_age = 17.34 years) and Study 3 (N = 220, M_age = 16.73 years) compared the convergent validity of the SWLS and single-item life satisfaction measures. The results of Study 1 supported the original one-factor model of the SWLS among adolescents and provided evidence for strong measurement invariance of the SWLS across gender. The findings of Study 2 and Study 3 showed that the SWLS and single-item measures were equally valid and strongly associated (r = .734 in Study 2 and r = .668 in Study 3). No substantial differences in correlations with school success and well-being indicators were found between the SWLS and single-item measures. Our findings support the use of the SWLS among adolescents and indicate that single-item life satisfaction measures perform as well as the SWLS in adolescent samples.

  9. Dual-Objective Item Selection Criteria in Cognitive Diagnostic Computerized Adaptive Testing

    Science.gov (United States)

    Kang, Hyeon-Ah; Zhang, Susu; Chang, Hua-Hua

    2017-01-01

    The development of cognitive diagnostic-computerized adaptive testing (CD-CAT) has provided a new perspective for gaining information about examinees' mastery on a set of cognitive attributes. This study proposes a new item selection method within the framework of dual-objective CD-CAT that simultaneously addresses examinees' attribute mastery…

  10. Multiple Imputation of Item Scores in Test and Questionnaire Data, and Influence on Psychometric Results

    Science.gov (United States)

    van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas

    2007-01-01

    The performance of five simple multiple imputation methods for dealing with missing data were compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmark, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…

  11. Examples of Item Banks to Support Local Test Development: Two Case Studies With Reactions.

    Science.gov (United States)

    Estes, Gary D., Ed.

    This report and compilation of papers summarizes information collected by an Assessment Development and Use Project, initiated by the Northwest Regional Educational Laboratory (NWREL) to assist test development efforts by state and local agencies. Specific item banking applications are reported in two case studies, selected because they represent…

  12. Reading ability and print exposure: item response theory analysis of the author recognition test.

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.
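    The scoring rule mentioned in this abstract (names selected minus foils selected, optionally with a heavier penalty for foils) can be sketched as follows. This is only an illustration: the function name, the placeholder names, and the penalty value of 2.0 are assumptions, not the weights derived in the study.

```python
def art_score(selected, authors, foils, foil_penalty=1.0):
    """Score one participant's Author Recognition Test checklist.

    selected:     names the participant marked as authors
    authors:      real author names on the checklist
    foils:        non-author names on the checklist
    foil_penalty: points deducted per selected foil
                  (1.0 reproduces the standard "names minus foils" rule)
    """
    selected = set(selected)
    hits = len(selected & set(authors))         # real authors recognized
    false_alarms = len(selected & set(foils))   # foils wrongly selected
    return hits - foil_penalty * false_alarms


# Placeholder checklist; the names are hypothetical.
authors = {"Author A", "Author B", "Author C"}
foils = {"Foil X", "Foil Y"}
marked = ["Author A", "Author C", "Foil X"]

standard = art_score(marked, authors, foils)        # 2 hits - 1 foil
stricter = art_score(marked, authors, foils, 2.0)   # 2 hits - 2 * 1 foil
```

    Raising `foil_penalty` above 1.0 lowers the score of participants who guess, which is the direction of the adjustment the abstract describes.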

  13. Reading Ability and Print Exposure: Item Response Theory Analysis of the Author Recognition Test

    Science.gov (United States)

    Moore, Mariah; Gordon, Peter C.

    2015-01-01

    In the Author Recognition Test (ART) participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, with this predictive ability generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. This large-scale study (1012 college student participants) used Item Response Theory (IRT) to analyze item (author) characteristics to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and to optimize scoring of the ART. Factor analysis suggests a potential two factor structure of the ART differentiating between literary vs. popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of time spent encoding words as measured using eye-tracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Further, they show that frequency data can be used to select items of appropriate difficulty and that frequency data from corpora based on particular time periods and types of text may allow test adaptation for different populations. PMID:25410405

  14. A hierarchical framework for modeling speed and accuracy on test items

    NARCIS (Netherlands)

    van der Linden, Willem J.

    2005-01-01

    Current modeling of response times on test items has been influenced by the experimental paradigm of reaction-time research in psychology. For instance, some of the models have a parameter structure that was chosen to represent a speed-accuracy tradeoff, while others equate speed directly with

  15. Age-related differential item functioning in tests of face and car recognition ability.

    Science.gov (United States)

    Sunday, Mackenzie A; Lee, Woo-Yeol; Gauthier, Isabel

    2018-01-01

    The presence of differential item functioning (DIF) in a test suggests bias that could disadvantage members of a certain group. Previous work with tests of visual learning abilities found significant DIF related to age groups in a car test (Lee, Cho, McGugin, Van Gulick, & Gauthier, 2015), but not in a face test (Cho et al., 2015). The presence of age DIF is a threat to the validity of the test even for studies where aging is not of interest. Here, we assessed whether this pattern of age DIF for cars and not faces would also apply to new tests targeting the same abilities with a new matching task that uses two studied items per trial. We found evidence for DIF in matching tests for faces and for cars, though with encouragingly small effect sizes. Even though the age DIF was small enough at the test level to be acceptable for most uses, we also asked whether the specific format of our matching tasks may induce some age-related DIF regardless of domain. We decomposed the face matching task into its components, and using new data from subjects performing these simpler tasks, found evidence that the age DIF was driven by the similarity of the two faces presented at study on each trial. Overall, our results suggest that using a matching format, especially for cars, reduces age-related DIF, and that a simpler matching task with only one study item per trial could reduce age DIF further.

  16. Effects of Item Parameter Drift on Vertical Scaling with the Nonequivalent Groups with Anchor Test (NEAT) Design

    Science.gov (United States)

    Ye, Meng; Xin, Tao

    2014-01-01

    The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…

  17. ITEM ANALYSIS IN MULTIPLE-CHOICE LISTENING TESTS FROM CILS CERTIFICATE IN ITALIAN AS A FOREIGN LANGUAGE

    Directory of Open Access Journals (Sweden)

    Paulo Torresan

    2015-12-01

    This paper analyses three multiple-choice listening tests from CILS certificate in Italian as a Foreign Language (level B1, summer session 2009 and 2012). Item Analysis involves examining the behavior of each individual item based on statistical data on answers from a sample. It offers answers to questions such as: do the items allow for sufficient discrimination between candidates of different skill levels? Do the keys and distractors work appropriately? Our investigation reveals certain issues of undercalibration, non-correspondence between items and information present in the text, and item distribution. In one case, the item’s construction risks misleading the test taker (item #2, summer session 2012). As well as providing an example of item analysis, this study allows the reader to gain awareness of how difficult it is to design an exercise widely used in both testing and teaching centers, that is, the multiple-choice question.

  18. Item response theory: applications of modern test theory in medical education.

    Science.gov (United States)

    Downing, Steven M

    2003-08-01

    Item response theory (IRT) measurement models are discussed in the context of their potential usefulness in various medical education settings such as assessment of achievement and evaluation of clinical performance. The purpose of this article is to compare and contrast IRT measurement with the more familiar classical measurement theory (CMT) and to explore the benefits of IRT applications in typical medical education settings. CMT, the more common measurement model used in medical education, is straightforward and intuitive. Its limitation is that it is sample-dependent, in that all statistics are confounded with the particular sample of examinees who completed the assessment. Examinee scores from IRT are independent of the particular sample of test questions or assessment stimuli. Also, item characteristics, such as item difficulty, are independent of the particular sample of examinees. The IRT characteristic of invariance permits easy equating of examination scores, which places scores on a constant measurement scale and permits the legitimate comparison of student ability change over time. Three common IRT models and their statistical assumptions are discussed. IRT applications in computer-adaptive testing and as a method useful for adjusting rater error in clinical performance assessments are overviewed. IRT measurement is a powerful tool used to solve a major problem of CMT, that is, the confounding of examinee ability with item characteristics. IRT measurement addresses important issues in medical education, such as eliminating rater error from performance assessments.
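    The abstract does not name the three models it discusses. As an assumption, the sketch below uses the logistic family most often presented in introductions to IRT (one-, two-, and three-parameter models), written as a single function whose defaults reduce the 3PL to the simpler cases. The parameter names `theta`, `a`, `b`, and `c` follow common textbook notation and are not taken from the article.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the 3PL logistic model.

    theta: examinee ability
    b:     item difficulty
    a:     item discrimination (a = 1.0 and c = 0.0 give the 1PL form)
    c:     lower asymptote, often read as a guessing parameter
           (c = 0.0 gives the 2PL form)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# An examinee whose ability equals the item difficulty:
p_1pl = irt_probability(theta=0.0, b=0.0)                 # halfway point
p_3pl = irt_probability(theta=0.0, b=0.0, a=1.5, c=0.2)   # shifted up by guessing
```

    Because `b` is expressed on the same scale as `theta`, item difficulty can be stated without reference to a particular examinee sample, which is the invariance property the abstract emphasizes.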

  19. Prediction of true test scores from observed item scores and ancillary data.

    Science.gov (United States)

    Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

    2015-05-01

    In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.

  20. Robustness of two single-item self-esteem measures: cross-validation with a measure of stigma in a sample of psychiatric patients.

    Science.gov (United States)

    Bagley, Christopher

    2005-08-01

    Robins' Single-item Self-esteem Inventory was compared with a single item from the Coopersmith Self-esteem. Although a new scoring format was used, there was good evidence of cross-validation in 83 current and former psychiatric patients who completed Harvey's adapted measure of stigma felt and experienced by users of mental health services. Scores on the two single-item self-esteem measures correlated .76 (p self-esteem in users of mental health services.

  1. Why Students Answer TIMSS Science Test Items the Way They Do

    Science.gov (United States)

    Harlow, Ann; Jones, Alister

    2004-04-01

    The purpose of this study was to explore how Year 8 students answered Third International Mathematics and Science Study (TIMSS) questions and whether the test questions represented the scientific understanding of these students. One hundred and seventy-seven students were tested using written test questions taken from the science test used in the Third International Mathematics and Science Study. The degree to which a sample of 38 children represented their understanding of the topics in a written test compared to the level of understanding that could be elicited by an interview is presented in this paper. In exploring student responses in the interview situation this study hoped to gain some insight into the science knowledge that students held and whether or not the test items had been able to elicit this knowledge successfully. We question the usefulness and quality of data from large-scale summative assessments on their own to represent student scientific understanding and conclude that large-scale written test items, such as TIMSS, on their own are not a valid way of exploring students' understanding of scientific concepts. Considerable caution is therefore needed in exploiting the outcomes of international achievement testing when considering educational policy changes or using TIMSS data on their own to represent student understanding.

  2. A six-item scale for overall, emotional and social loneliness: Confirmatory tests on survey

    OpenAIRE

    de Jong-Gierveld, J.; van Tilburg, T.G.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for overall, emotional, and social loneliness, although its length has sometimes rendered it difficult to use in large surveys. In this study, the authors empirically tested a shortened version of the...

  3. A 6-item scale for overall, emotional, and social loneliness: Confirmatory tests on survey data

    OpenAIRE

    de Jong-Gierveld, J.; van Tilburg, T.G.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for overall, emotional, and social loneliness, although its length has sometimes rendered it difficult to use in large surveys. In this study, the authors empirically tested a shortened version of the...

  4. Limited-information goodness-of-fit testing of hierarchical item factor models.

    Science.gov (United States)

    Cai, Li; Hansen, Mark

    2013-05-01

    In applications of item response theory, assessment of model fit is a critical issue. Recently, limited-information goodness-of-fit testing has received increased attention in the psychometrics literature. In contrast to full-information test statistics such as Pearson's X² or the likelihood ratio G², these limited-information tests utilize lower-order marginal tables rather than the full contingency table. A notable example is Maydeu-Olivares and colleagues' M2 family of statistics based on univariate and bivariate margins. When the contingency table is sparse, tests based on M2 retain better Type I error rate control than the full-information tests and can be more powerful. While in principle the M2 statistic can be extended to test hierarchical multidimensional item factor models (e.g., bifactor and testlet models), the computation is non-trivial. To obtain M2, a researcher often has to obtain (many thousands of) marginal probabilities, derivatives, and weights. Each of these must be approximated with high-dimensional numerical integration. We propose a dimension reduction method that can take advantage of the hierarchical factor structure so that the integrals can be approximated far more efficiently. We also propose a new test statistic that can be substantially better calibrated and more powerful than the original M2 statistic when the test is long and the items are polytomous. We use simulations to demonstrate the performance of our new methods and illustrate their effectiveness with applications to real data. © 2012 The British Psychological Society.

  5. Differential functioning of mini-mental test items according to disease.

    Science.gov (United States)

    Prieto, G; Delgado, A R; Perea, M V; Ladera, V

    2011-10-01

    Comparing the height of males and females would be impossible if the measuring device did not have the same properties for both populations. In a similar way, the cognitive level of diverse groups of patients should not be compared if the test has different measurement properties for these groups. Lack of Differential Item Functioning (DIF) is a condition for measurement invariance between populations. The most internationally used screening test for dementia, the MMSE (or Mini-mental State Examination), has been analysed using an advanced psychometric technique, the Rasch Model. The objective was to determine the invariance of mini-mental measurements from diverse groups: Parkinson's disease patients, Alzheimer's type dementia and normal subjects. The hypothesis was that the scores would not show DIF against any of these groups. The total sample was composed of 400 subjects. Significant differences between groups were found. However, the quantitative comparison only makes sense if no evidence against measurement invariance was found: given the kind of items showing DIF against Parkinson's disease patients, the MMSE seems to underestimate the cognitive level of these patients. Despite the extended use of this test, 11 items out of 30 show DIF and consequently score comparisons between groups are not justified. Copyright © 2010 Sociedad Española de Neurología. Published by Elsevier España. All rights reserved.

  6. Differential item and test functioning methodology indicated that item response bias was not a substantial cause of country differences in mental well-being.

    Science.gov (United States)

    Forero, Carlos G; Adroher, Núria D; Stewart-Brown, Sarah; Castellví, Pere; Codony, Miquel; Vilagut, Gemma; Mompart, Anna; Tresseres, Ricard; Colom, Joan; Castro, José I; Alonso, Jordi

    2014-12-01

    Establishing the cross-cultural equivalence of the mental well-being construct, as measured with the Warwick-Edinburgh Mental Well-being Scale (WEMWBS), by studying potential construct validity biases in two countries with previously reported score differences. We compared the WEMWBS total scores and item responses in Scotland (N = 779) and Catalonia (N = 1,900) general population samples. To assess whether the questionnaire spuriously favored higher scores in Catalonia, we tested for differential item functioning (DIF) by applying ordinal logistic regression on Item Response Theory scores. DIF was tested with likelihood ratio tests and standard effect measures (McFadden pseudo-R², >0.13; relative parameter change, >5%), and differential test functioning (DTF) was tested by plotting differences between full-test and purified (i.e., without DIF items) score estimates. Catalonia showed higher levels of mental well-being than Scotland (Cohen d = 0.84). Three of 14 WEMWBS items showed small amounts of DIF. DIF did not accrue to DTF, as shown by intraclass correlation coefficient (ICC, 0.999) and case-by-case differences (maximum, 0.12 SD) between total and purified scores. Population differences remained mainly constant across sociodemographics and health outcomes. The WEMWBS measures a distinct well-being construct that is stable across countries, implying that Scotland and Catalonia populations are effectively different in the distribution of mental well-being. This result adds to previous psychometric information and supports WEMWBS as a valid, unbiased measure for individual and cross-cultural comparisons. Copyright © 2014 Elsevier Inc. All rights reserved.
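    The group difference in this abstract is reported as Cohen's d. A minimal sketch of that effect size is given below, assuming the usual pooled-standard-deviation definition for two independent samples. The abstract does not state which variant the authors computed, and the scores in the example are made up for illustration.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical total scores for two small groups (not data from the study).
d = cohens_d([2, 4, 6], [1, 3, 5])
```

    A positive value means the first group scores higher on average, expressed in pooled standard deviation units.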

  7. Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

    Science.gov (United States)

    Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.

    2013-01-01

    The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R² and…

  8. Investigation of Specific Learning Disability and Testing Accommodations Based Differential Item Functioning Using a Multilevel Multidimensional Mixture Item Response Theory Model

    Science.gov (United States)

    Finch, W. Holmes; Hernández Finch, Maria E.

    2013-01-01

    The assessment of test data for the presence of differential item functioning (DIF) is a key component of instrument development and validation. Among the many methods that have been used successfully in such analyses is the mixture modeling approach. Using this approach to identify the presence of DIF has been touted as potentially superior for…

  9. Redefining diagnostic symptoms of depression using Rasch analysis: testing an item bank suitable for DSM-V and computer adaptive testing.

    Science.gov (United States)

    Mitchell, Alex J; Smith, Adam B; Al-salihy, Zerak; Rahim, Twana A; Mahmud, Mahmud Q; Muhyaldin, Asma S

    2011-10-01

    We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V. Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects) living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank, and differential item functioning (DIF) was explored for known subgroups. A symptom-level Rasch analysis reduced the original 45 items to 24 after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items accuracy was still high (AUC was still 0.976; sensitivity 93%, specificity 96%). An item bank of 17 items may be useful in computer adaptive testing and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.

  10. The impact of parameter estimation on computerized adaptive testing with item cloning

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    2005-01-01

    Item cloning techniques can greatly reduce the cost of item writing and enhance the flexibility of item presentation. An important consequence of cloning is that it may cause variability in the item parameters. Recently, Glas and van der Linden (in press, 2005) proposed a multilevel item response

  11. The impact of parameter estimation on computerized adaptive testing with item cloning

    NARCIS (Netherlands)

    Glas, Cornelis A.W.

    Item cloning techniques can greatly reduce the cost of item writing and enhance the flexibility of item presentation. An important consequence of cloning is that it may cause variability in the item parameters. Recently, Glas and van der Linden (in press, 2005) proposed a multilevel item response

  12. On the Relationship between Classical Test Theory and Item Response Theory: From One to the Other and Back

    Science.gov (United States)

    Raykov, Tenko; Marcoulides, George A.

    2016-01-01

    The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…

  13. Item analysis of single-peaked response data : the psychometric evaluation of bipolar measurement scales

    NARCIS (Netherlands)

    Polak, Maaike Geertruida

    2011-01-01

    The thesis explains the fundamental difference between unipolar and bipolar measurement scales for psychological characteristics. We explore the use of correspondence analysis (CA), a technique that is similar to principal component analysis and is available in SAS and SPSS, to select items that

  14. Setting Passing Scores on Passage-Based Tests: A Comparison of Traditional and Single-Passage Bookmark Methods

    Science.gov (United States)

    Skaggs, Gary; Hein, Serge F.; Awuor, Risper

    2007-01-01

    In this study, a variation of the bookmark standard setting procedure for passage-based tests is proposed in which separate ordered item booklets are created for the items associated with each passage. This variation is compared to the traditional bookmark procedure for a fifth-grade reading test. The results showed that the single-passage…

  15. RT-based memory detection : Item saliency effects in the single-probe and the multiple-probe protocol

    NARCIS (Netherlands)

    Verschuere, B.; Kleinberg, B.; Theocharidou, K.

    RT-based memory detection may provide an efficient means to assess recognition of concealed information. There is, however, considerable heterogeneity in detection rates, and we explored two potential moderators: item saliency and test protocol. Participants tried to conceal low salient (e.g.,

  16. Science Library of Test Items. Volume Ten. Mastery Testing Programme. [Mastery Tests Series 2.] Tests M14-M26.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As part of a series of tests to measure mastery of specific skills in the natural sciences, copies of tests 14 through 26 include: (14) calculating an average; (15) identifying parts of the scientific method; (16) reading a geological map; (17) identifying elements, mixtures and compounds; (18) using Ohm's law in calculation; (19) interpreting…

  17. Utility and validity of a single-item visual analog scale for measuring dental anxiety in clinical practice.

    Science.gov (United States)

    Appukuttan, Devapriya; Vinayagavel, Mythreyi; Tadepalli, Anupama

    2014-06-01

    We evaluated whether a visual analog scale (VAS) was comparable to the multi-item Modified Dental Anxiety Scale (MDAS) in assessing dental anxiety in clinical practice. In total, 200 consecutive patients aged 20-70 years who presented at the dental outpatient department of SRM Dental College, Chennai were enrolled. The test-retest value for the VAS was 0.968. The Spearman rank correlations between the VAS and MDAS items and total score were significant (P dental visit and the VAS also showed a strong correlation (r = 0.473, P dental phobia. The weighted kappa was 69% for agreement between MDAS and the VAS in identifying patients with and without dental anxiety at cut-offs of 13 and 4.75, respectively. The VAS was found to be a valid measure and was comparable to the multi-item MDAS.

  18. Testing Three-Item Versions for Seven of Young's Maladaptive Schema

    Science.gov (United States)

    Blau, Gary; DiMino, John; Sheridan, Natalie; Pred, Robert S.; Beverly, Clyde; Chessler, Marcy

    2015-01-01

    The Young Schema Questionnaire (YSQ) in either long-form (205-item) or short-form (75-item or 90-item) versions has demonstrated its clinical usefulness for assessing early maladaptive schemas. However, even a 75- or 90-item "short form", particularly when combined with other measures, can represent a lengthy…

  19. Test-retest reliability of selected items of Health Behaviour in School-aged Children (HBSC) survey questionnaire in Beijing, China

    Directory of Open Access Journals (Sweden)

    Liu Yang

    2010-08-01

    Background: Children's health and health behaviour are essential for their development and it is important to obtain abundant and accurate information to understand young people's health and health behaviour. The Health Behaviour in School-aged Children (HBSC) study is among the first large-scale international surveys on adolescent health through self-report questionnaires. So far, more than 40 countries in Europe and North America have been involved in the HBSC study. The purpose of this study is to assess the test-retest reliability of selected items in the Chinese version of the HBSC survey questionnaire in a sample of adolescents in Beijing, China. Methods: A sample of 95 male and female students aged 11 or 15 years old participated in a test and retest with a three-week interval. Student identity numbers of respondents were utilized to permit matching of test-retest questionnaires. 23 items concerning physical activity, sedentary behaviour, sleep and substance use were evaluated by using the percentage of response shifts and the single measure Intraclass Correlation Coefficients (ICC) with 95% confidence interval (CI) for all respondents and stratified by gender and age. Items on substance use were only evaluated for school children aged 15 years old. Results: The percentage of no response shift between test and retest varied from 32% for the item on computer use at weekends to 92% for the three items on smoking. Of all the 23 items evaluated, 6 items (26%) showed a moderate reliability, 12 items (52%) displayed a substantial reliability and 4 items (17%) indicated almost perfect reliability. No gender or age group differences in test-retest reliability were found except for a few items on sedentary behaviour. Conclusions: The overall findings of this study suggest that most selected indicators in the HBSC survey questionnaire have satisfactory test-retest reliability for the students in Beijing. Further test-retest studies in a large

  20. Sleep Can Reduce the Testing Effect: It Enhances Recall of Restudied Items but Can Leave Recall of Retrieved Items Unaffected

    Science.gov (United States)

    Bäuml, Karl-Heinz T.; Holterman, Christoph; Abel, Magdalena

    2014-01-01

    The testing effect refers to the finding that retrieval practice in comparison to restudy of previously encoded contents can improve memory performance and reduce time-dependent forgetting. Naturally, long retention intervals include both wake and sleep delay, which can influence memory contents differently. In fact, sleep immediately after…

  1. Validity of a single item food security questionnaire in Arctic Canada.

    Science.gov (United States)

    Urke, Helga Bjørnøy; Cao, Zhirong R; Egeland, Grace M

    2014-06-01

    Assess sensitivity and specificity of each of the 18 US Department of Agriculture (USDA) Household Food Security Scale Module (HFSSM) questionnaire items to determine whether a rapid assessment of child and adult food insecurity is feasible in an Inuit population. Food insecurity prevalence was assessed by the 18-item USDA HFSSM in a randomized sample of Inuit households participating in the Inuit Health Survey and the Nunavut Inuit Child Health Survey. Questions were evaluated for sensitivity, specificity, predictive value (+/−), and total percent accuracy for adult and child food insecurity (yes/no). Child food security items were evaluated for both surveys. For children, the question “In the last 12 months, were there times when it was not possible to feed the children a healthy meal because there was not enough money?” had the best performance in both samples with a sensitivity and specificity of 92.3% and 97.3%, respectively, for the Inuit Health Survey, and 88.5% and 95.4% for the Nunavut Inuit Child Health Survey. For adults, the question “In the last 12 months, were there times when the food for you and your family just did not last and there was no money to buy more?” demonstrated a sensitivity of 93.0% and a specificity of 93.4%. Rapid assessment of child and adult food insecurity is feasible and may be a useful tool for health care and social service providers. However, as prevalence and severity of food insecurity change over time, rapid assessment techniques should not replace periodic screening by using the full USDA HFSSM questionnaire.
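The measures named in this abstract (sensitivity, specificity, positive and negative predictive value, total percent accuracy) all derive from a 2x2 table that compares a single item's yes/no answer with the full-scale classification. The following is a minimal sketch of those standard definitions; the function name `screening_metrics` and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy of one yes/no item against a reference classification.

    tp/fn: reference-positive cases the item flags / misses
    tn/fp: reference-negative cases the item clears / wrongly flags
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,   # total proportion classified correctly
    }


# Hypothetical counts for illustration only (not data from the surveys).
metrics = screening_metrics(tp=48, fp=3, fn=4, tn=145)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on prevalence in the sample, which fits the abstract's caution that a single-item screen should not replace periodic use of the full questionnaire as prevalence changes.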

  2. Applications of Multidimensional Item Response Theory Models with Covariates to Longitudinal Test Data. Research Report. ETS RR-16-21

    Science.gov (United States)

    Fu, Jianbin

    2016-01-01

    The multidimensional item response theory (MIRT) models with covariates proposed by Haberman and implemented in the "mirt" program provide a flexible way to analyze data based on item response theory. In this report, we discuss applications of the MIRT models with covariates to longitudinal test data to measure skill differences at the…

  3. Evaluating the validity of the Work Role Functioning Questionnaire (Canadian French version) using classical test theory and item response theory.

    Science.gov (United States)

    Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal

    2017-01-01

    The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. A four-factor model and a three-factor model were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were common with CTT. This study tested different models, with fewer problematic items found in a three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.

  4. Two Test Items to Explore High School Students' Beliefs of Sample Size When Sampling from Large Populations

    Science.gov (United States)

    Bill, Anthony; Henderson, Sally; Penman, John

    2010-01-01

    Two test items that examined high school students' beliefs of sample size for large populations using the context of opinion polls conducted prior to national and state elections were developed. A trial of the two items with 21 male and 33 female Year 9 students examined their naive understanding of sample size: over half of students chose a…

  5. Development of the four-item Letter and Shape Drawing test (LSD-4): A brief bedside test of visuospatial function.

    Science.gov (United States)

    Williams, Olugbenga Alaba; O'Connell, Henry; Leonard, Maeve; Awan, Fahad; White, Debbie; McKenna, Frank; Hannigan, Ailish; Cullen, Walter; Exton, Chris; Enudi, Walter; Dunne, Colum; Adamis, Dimitrios; Meagher, David

    2017-01-01

    Conventional bedside tests of visuospatial function such as the Clock Drawing (CDT) and Intersecting Pentagons (IPT) lack consistency in delivery and interpretation. We compared performance on a novel test of visuospatial ability - the LSD - with the IPT, CDT and MMSE in 180 acute elderly medical inpatients [mean age 79.7±7.1 (range 62-96); 91 females (50.6%)]. 124 (69%) scored ≤23 on the MMSE; 60 with mild (score 18-23) and 64 with severe (score ≤17) impairment. 78 (43%) scored ≥6 on the CDT, while for the IPT, 87 (47%) scored ≥4. The CDT and IPT agreed on the classification of 138 patients (77%) with modest-strong agreement with the MMSE categories. Correlation between the LSD and visuospatial tests was high. A four-item version of the LSD incorporating items 1,10,12,15 had high correlation with the LSD-15 and strong association with MMSE categories. The LSD-4 provides a brief and easily interpreted bedside test of visuospatial function that has high coverage of elderly patients with neurocognitive impairment, good agreement with conventional tests of visuospatial ability and favourable ability to identify significant cognitive impairment.

  6. A unified factor-analytic approach to the detection of item and test bias: Illustration with the effect of providing calculators to students with dyscalculia

    Directory of Open Access Journals (Sweden)

    Lee, M. K.

    2016-01-01

    An absence of measurement bias against distinct groups is a prerequisite for the use of a given psychological instrument in scientific research or high-stakes assessment. Factor analysis is the framework explicitly adopted for the identification of such bias when the instrument consists of a multi-test battery, whereas item response theory is employed when the focus narrows to a single test composed of discrete items. Item response theory can be treated as a mild nonlinearization of the standard factor model, and thus the essential unity of bias detection at the two levels merits greater recognition. Here we illustrate the benefits of a unified approach with a real-data example, which comes from a statewide test of mathematics achievement where examinees diagnosed with dyscalculia were accommodated with calculators. We found that items that can be solved by explicit arithmetical computation became easier for the accommodated examinees, but the quantitative magnitude of this differential item functioning (measurement bias) was small.

  7. Adaptation and validation into Portuguese language of the six-item cognitive impairment test (6CIT).

    Science.gov (United States)

    Apóstolo, João Luís Alves; Paiva, Diana Dos Santos; Silva, Rosa Carla Gomes da; Santos, Eduardo José Ferreira Dos; Schultz, Timothy John

    2017-07-25

    The six-item cognitive impairment test (6CIT) is a brief cognitive screening tool that can be administered to older people in 2-3 min. To adapt the 6CIT for the European Portuguese and determine its psychometric properties based on a sample recruited from several contexts (nursing homes; universities for older people; day centres; primary health care units). The original 6CIT was translated into Portuguese and the draft Portuguese version (6CIT-P) was back-translated and piloted. The accuracy of the 6CIT-P was assessed by comparison with the Portuguese Mini-Mental State Examination (MMSE). A convenience sample of 550 older people from various geographical locations in the north and centre of the country was used. The test-retest reliability coefficient was high (r = 0.95). The 6CIT-P also showed good internal consistency (α = 0.88) and corrected item-total correlations ranged between 0.32 and 0.90. Total 6CIT-P and MMSE scores were strongly correlated. The proposed 6CIT-P threshold for cognitive impairment is ≥10 in the Portuguese population, which gives sensitivity of 82.78% and specificity of 84.84%. The accuracy of 6CIT-P, as measured by area under the ROC curve, was 0.91. The 6CIT-P has high reliability and validity and is accurate when used to screen for cognitive impairment.

  8. Development of an item bank for the EORTC Role Functioning Computer Adaptive Test (EORTC RF-CAT)

    DEFF Research Database (Denmark)

    Gamper, Eva-Maria; Petersen, Morten Aa.; Aaronson, Neil

    2016-01-01

    BACKGROUND: Role functioning (RF) as a core construct of health-related quality of life (HRQOL) comprises aspects of occupational and social roles relevant for patients in all treatment phases as well as for survivors. The objective of the current study was to improve its assessment by developing......, and evaluation of the psychometric performance of the RF-CAT. RESULTS: Phases I-III yielded a list of 12 items eligible for phase IV field-testing. The field-testing sample included 1,023 patients from Austria, Denmark, Italy, and the UK. Psychometric evaluation and item response theory analyses yielded 10 items...... with good psychometric properties. The resulting item bank exhibits excellent reliability (mean reliability = 0.85, median = 0.95). Using the RF-CAT may allow sample size savings from 11 % up to 50 % compared to using the QLQ-C30 RF scale. CONCLUSIONS: The RF-CAT item bank improves the precision...

  9. Analyzing Item Generation with Natural Language Processing Tools for the "TOEIC"® Listening Test. Research Report. ETS RR-17-52

    Science.gov (United States)

    Yoon, Su-Youn; Lee, Chong Min; Houghton, Patrick; Lopez, Melissa; Sakano, Jennifer; Loukina, Anastasia; Krovetz, Bob; Lu, Chi; Madani, Nitin

    2017-01-01

    In this study, we developed assistive tools and resources to support TOEIC® Listening test item generation. There has recently been an increased need for a large pool of items for these tests. This need has, in turn, inspired efforts to increase the efficiency of item generation while maintaining the quality of the created items. We aimed to…

  10. Analysis of reagent lot-to-lot comparability tests in five immunoassay items.

    Science.gov (United States)

    Kim, Hyun Soo; Kang, Hee Jung; Whang, Dong Hee; Lee, Seong Gyu; Park, Min Jeong; Park, Ji-Young; Lee, Kyu Man

    2012-01-01

    We investigated the degree of lot-to-lot reagent variation for 5 common immunoassay items. We measured the commercial as well as in-house controls for α-fetoprotein (AFP), ferritin, CA19-9, quantitative hepatitis B surface antigen (HBsAg), and hepatitis B surface antibody (anti-HBs) 10 times each by using both the old and the new lot of reagents whenever a reagent lot was changed, over a period of 10 months. The differences in the mean control values, the percent difference (% difference), and the difference to between-run standard deviation ratio (D:SD ratio) between successive lots were calculated. The % difference in mean control values between 2 reagent lots ranged from 0.1 to 17.5% for AFP, 1.0 to 18.6% for ferritin, 0.6 to 14.3% for CA19-9, 0.6 to 16.2% for HBsAg, and 0.1 to 17.7% for anti-HBs except negative controls of HBsAg and anti-HBs. The maximum D:SD ratios between 2 lots were 4.37 for AFP, 4.39 for ferritin, 2.43 for CA19-9, 1.64 for HBsAg, and 4.16 for anti-HBs. Thus, we have experienced extensive variability in lot-to-lot reagent variation for 5 immunoassay items, indicating that reagent lot-to-lot comparability tests should be continuously performed and that laboratories should determine their own acceptance criteria for each item.
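The two comparison statistics named in this abstract can be sketched in a few lines. The abstract does not give exact formulas, so the definitions below are assumptions: % difference is taken as the absolute difference in mean control values relative to the old-lot mean, and the D:SD ratio as that absolute difference divided by a between-run standard deviation supplied from the laboratory's own quality-control records. The function name `lot_comparison` and all numbers are hypothetical.

```python
from statistics import mean


def lot_comparison(old_lot, new_lot, between_run_sd):
    """Compare control measurements from two successive reagent lots.

    Assumed definitions (not stated in the abstract):
      % difference = |mean_new - mean_old| / mean_old * 100
      D:SD ratio   = |mean_new - mean_old| / between-run SD
    """
    diff = mean(new_lot) - mean(old_lot)
    pct_difference = abs(diff) / mean(old_lot) * 100
    d_sd_ratio = abs(diff) / between_run_sd
    return diff, pct_difference, d_sd_ratio


# Hypothetical control values, 10 replicates per lot as in the study design.
old = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7, 20.0, 19.9]
new = [21.0, 21.3, 20.9, 21.2, 21.1, 20.8, 21.4, 21.0, 21.1, 21.2]
diff, pct, ratio = lot_comparison(old, new, between_run_sd=0.6)
print(f"difference={diff:.2f}, % difference={pct:.1f}, D:SD ratio={ratio:.2f}")
```

A laboratory setting its own acceptance criteria, as the abstract recommends, would compare `pct` or `ratio` against a limit chosen for each analyte.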

  11. Performance on large-scale science tests: Item attributes that may impact achievement scores

    Science.gov (United States)

    Gordon, Janet Victoria

    , characteristics of test items themselves and/or opportunities to learn. Suggestions for future research are made.

  12. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    Science.gov (United States)

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  13. Laboratory tests for single-event effects

    International Nuclear Information System (INIS)

    Buchner, S.; McMorrow, D.; Melinger, J.; Campbell, A.B.

    1996-01-01

    Integrated circuits are currently tested at accelerators for their susceptibility to single-event effects (SEE's). However, because of the cost and limited accessibility associated with accelerator testing, there is considerable interest in developing alternate testing methods. Two laboratory techniques for measuring SEE, one involving a pulsed laser and the other ²⁵²Cf, are described in detail in this paper. The pulsed laser provides information on the spatial and temporal dependence of SEE, information that has proven invaluable in understanding and mitigating SEE in spite of the differences in the physical mechanisms responsible for SEE induced by light and by ions. Considerable effort has been expended on developing ²⁵²Cf as a laboratory test for SEE, but the technique has not found wide use because it is severely limited by the low energy and short range of the emitted ions that are unable to reach junctions either covered with dielectric layers or deep below the surface. In fact, there are documented cases where single-event latchup (SEL) testing with ²⁵²Cf gave significantly different results from accelerator testing. A detailed comparison of laboratory and accelerator SEE data is presented in this review in order to establish the limits of each technique.

  14. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    Science.gov (United States)

    Lynch, Mervin D.; Chaves, John

    Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs, idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  15. Conditioning factors of test-taking engagement in PIAAC: an exploratory IRT modelling approach considering person and item characteristics

    Directory of Open Access Journals (Sweden)

    Frank Goldhammer

    2017-11-01

    Background: A potential problem of low-stakes large-scale assessments such as the Programme for the International Assessment of Adult Competencies (PIAAC) is low test-taking engagement. The present study pursued two goals in order to better understand conditioning factors of test-taking disengagement: First, a model-based approach was used to investigate whether item indicators of disengagement constitute a continuous latent person variable by domain. Second, the effects of person and item characteristics were jointly tested using explanatory item response models. Methods: Analyses were based on the Canadian sample of Round 1 of the PIAAC, with N = 26,683 participants completing test items in the domains of literacy, numeracy, and problem solving. Binary item disengagement indicators were created by means of item response time thresholds. Results: The results showed that disengagement indicators define a latent dimension by domain. Disengagement increased with lower educational attainment, lower cognitive skills, and when the test language was not the participant’s native language. Gender did not exert any effect on disengagement, while age had a positive effect for problem solving only. An item’s location in the second of two assessment modules was positively related to disengagement, as was item difficulty. The latter effect was negatively moderated by cognitive skill, suggesting that poor test-takers are especially likely to disengage with more difficult items. Conclusions: The negative effect of cognitive skill, the positive effect of item difficulty, and their negative interaction effect support the assumption that disengagement is the outcome of individual expectations about success (informed disengagement).

  16. NEXT Single String Integration Test Results

    Science.gov (United States)

    Soulas, George C.; Patterson, Michael J.; Pinero, Luis; Herman, Daniel A.; Snyder, Steven John

    2010-01-01

    As a critical part of NASA's Evolutionary Xenon Thruster (NEXT) test validation process, a single string integration test was performed on the NEXT ion propulsion system. The objectives of this test were to verify that an integrated system of major NEXT ion propulsion system elements meets project requirements, to demonstrate that the integrated system is functional across the entire power processor and xenon propellant management system input ranges, and to demonstrate to potential users that the NEXT propulsion system is ready for transition to flight. Propulsion system elements included in this system integration test were an engineering model ion thruster, an engineering model propellant management system, an engineering model power processor unit, and a digital control interface unit simulator that acted as a test console. Project requirements that were verified during this system integration test included individual element requirements, integrated system requirements, and fault handling. This paper will present the results of these tests, which include: integrated ion propulsion system demonstrations of performance, functionality and fault handling; a thruster re-performance acceptance test to establish baseline performance; a risk-reduction PMS-thruster integration test; and propellant management system calibration checks.

  17. Randomization and Data-Analysis Items in Quality Standards for Single-Case Experimental Studies

    Science.gov (United States)

    Heyvaert, Mieke; Wendt, Oliver; Van den Noortgate, Wim; Onghena, Patrick

    2015-01-01

    Reporting standards and critical appraisal tools serve as beacons for researchers, reviewers, and research consumers. Parallel to existing guidelines for researchers to report and evaluate group-comparison studies, single-case experimental (SCE) researchers are in need of guidelines for reporting and evaluating SCE studies. A systematic search was…

  18. The six-item Clock Drawing Test – reliability and validity in mild Alzheimer’s disease

    DEFF Research Database (Denmark)

    Jørgensen, Kasper; Kristensen, Maria K; Waldemar, Gunhild

    2015-01-01

    This study presents a reliable, short and practical version of the Clock Drawing Test (CDT) for clinical use and examines its diagnostic accuracy in mild Alzheimer's disease versus elderly nonpatients. Clock drawings from 231 participants were scored independently by four clinical neuropsychologi...... reduced SN slightly. Classification accuracy associated with a score of four or less out of six was very high....... neuropsychologists blind to diagnostic classification. The interrater agreement of individual scoring criteria was analyzed and items with poor or moderate reliability were excluded. The classification accuracy of the resulting scoring system - the six-item CDT - was examined. We explored the effect of further...... reducing the number of scoring items on classification accuracy and estimated classification accuracy associated with performances deviating from the optimal cutoff score. At a cutoff of 5/6, the six-item CDT had a sensitivity (SN) of 0.65 and a specificity of 0.80. Stepwise removal of up to three items...

  19. Determination of radionuclides in environmental test items at CPHR: Traceability and uncertainty calculation

    International Nuclear Information System (INIS)

    Carrazana Gonzalez, J.; Fernandez, I.M.; Capote Ferrera, E.; Rodriguez Castro, G.

    2008-01-01

    Information about how the laboratory of Centro de Proteccion e Higiene de las Radiaciones (CPHR), Cuba establishes its traceability to the International System of Units for the measurement of radionuclides in environmental test items is presented. A comparison among different methodologies of uncertainty calculation, including an analysis of the feasibility of using the Kragten-spreadsheet approach, is shown. In the specific case of the gamma spectrometric assay, the influence of each parameter, and the identification of the major contributor, in the relative difference between the methods of uncertainty calculation (Kragten and partial derivative) is described. The reliability of the uncertainty calculation results reported by the commercial software Gamma 2000 from Silena is analyzed

  20. Determination of radionuclides in environmental test items at CPHR: traceability and uncertainty calculation.

    Science.gov (United States)

    Carrazana González, J; Fernández, I M; Capote Ferrera, E; Rodríguez Castro, G

    2008-11-01

    Information about how the laboratory of Centro de Protección e Higiene de las Radiaciones (CPHR), Cuba establishes its traceability to the International System of Units for the measurement of radionuclides in environmental test items is presented. A comparison among different methodologies of uncertainty calculation, including an analysis of the feasibility of using the Kragten-spreadsheet approach, is shown. In the specific case of the gamma spectrometric assay, the influence of each parameter, and the identification of the major contributor, in the relative difference between the methods of uncertainty calculation (Kragten and partial derivative) is described. The reliability of the uncertainty calculation results reported by the commercial software Gamma 2000 from Silena is analyzed.
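The comparison described in this abstract, between the Kragten spreadsheet approach and propagation by partial derivatives, can be illustrated with a small sketch. Everything specific below is an assumption rather than the laboratory's procedure: the measurement model (net counts divided by efficiency, emission probability, counting time and sample mass), the function names, and the input values are all hypothetical. The Kragten approach shifts each input by its standard uncertainty, recomputes the result, and combines the resulting changes in quadrature; for a pure product/quotient model the first-order partial-derivative result reduces to relative uncertainties added in quadrature.

```python
import math


def activity(n, eff, p_gamma, t, m):
    # Hypothetical model: activity concentration from net counts n,
    # detection efficiency, emission probability, counting time t, mass m.
    return n / (eff * p_gamma * t * m)


def kragten_uncertainty(f, values, uncertainties):
    """Kragten approach: perturb one input at a time by its uncertainty."""
    base = f(*values)
    total = 0.0
    for i, u in enumerate(uncertainties):
        shifted = list(values)
        shifted[i] += u                      # x_i -> x_i + u(x_i)
        total += (f(*shifted) - base) ** 2   # squared contribution of input i
    return math.sqrt(total)


def quadrature_uncertainty(f, values, uncertainties):
    """First-order partial-derivative result for a product/quotient model."""
    rel = math.sqrt(sum((u / v) ** 2 for v, u in zip(values, uncertainties)))
    return abs(f(*values)) * rel


# Hypothetical inputs and standard uncertainties (illustration only).
values = (10000.0, 0.05, 0.85, 3600.0, 0.5)
uncerts = (100.0, 0.001, 0.0, 0.0, 0.001)

u_kragten = kragten_uncertainty(activity, values, uncerts)
u_partial = quadrature_uncertainty(activity, values, uncerts)
print(f"Kragten: {u_kragten:.4f}  partial derivatives: {u_partial:.4f}")
```

The two estimates differ slightly because the Kragten differences capture the curvature of the model in the inputs that appear in the denominator, while the partial-derivative formula is strictly first order; the input with the largest relative uncertainty dominates both the combined uncertainty and that difference.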

  1. Assessing the discriminating power of item and test scores in the linear factor-analysis model

    Directory of Open Access Journals (Sweden)

    Pere J. Ferrando

    2012-01-01

    Rigorous, model-based proposals for studying the imprecise concept of "discriminating power" are scarce and generally limited to nonlinear models for binary items. This article proposes a general framework for assessing the discriminating power of item and test scores that are calibrated with the common one-factor model. The proposal is organized around three criteria: (a) type of score, (b) range of discrimination and (c) the specific aspect that is assessed. Within the proposed framework: (a) the relations among 16 measures, 6 of which appear to be new, are discussed, and (b) the relations among them are studied. The usefulness of the proposal in psychometric applications that use the factor model is illustrated with an empirical example.

  2. Development of coordination system model on single-supplier multi-buyer for multi-item supply chain with probabilistic demand

    Science.gov (United States)

    Olivia, G.; Santoso, A.; Prayogo, D. N.

    2017-11-01

    Competition between supply chains is intensifying, and good coordination among supply chain members is crucial in addressing it. This paper develops a model of a coordination system between a single supplier and multiple buyers in a supply chain. The proposed optimization model determines the optimal number of deliveries from the supplier to the buyers in order to minimize the total cost over a planning horizon. The total supply chain cost consists of transportation costs, handling costs of the supplier and buyers, and stock-out costs. In the proposed model, the supplier can supply various types of items to retailers whose item demand patterns are probabilistic. A sensitivity analysis was conducted to test the effect of changes in transport costs, handling costs and the production capacity of the supplier. The results showed that changes in transportation costs, handling costs and production capacity significantly influence the optimal number of deliveries of each item to the buyers.

  3. The 15-item version of the Boston Naming Test as an index of English proficiency.

    Science.gov (United States)

    Erdodi, Laszlo A; Jongsma, Katherine A; Issa, Meriam

    2017-01-01

    The present study was designed to examine the potential of the Boston Naming Test - Short Form (BNT-15) to provide an objective estimate of English proficiency. A secondary goal was to examine the effect of limited English proficiency (LEP) on neuropsychological test performance. A brief battery of neuropsychological tests was administered to 79 bilingual participants (40.5% male, M_Age = 26.9, M_Education = 14.2). The majority (n = 56) were English dominant (EN), and the rest were Arabic dominant (AR). The BNT-15 was further reduced to 10 items that best discriminated between EN and AR (BNT-10). Participants were divided into low, intermediate, and high English proficiency subsamples based on BNT-10 scores (≤6, 7-8, and ≥9). Performance across groups was compared on neuropsychological tests with high and low verbal mediation. The BNT-15 and BNT-10 respectively correctly identified 89 and 90% of EN and AR participants. Level of English proficiency had a large effect (partial η² = .12-.34; Cohen's d = .67-1.59) on tests with high verbal mediation (animal fluency, sentence comprehension, word reading), but no effect on tests with low verbal mediation (auditory consonant trigrams, clock drawing, digit-symbol substitution). The BNT-15 and BNT-10 can function as indices of English proficiency and predict the deleterious effect of LEP on neuropsychological tests with high verbal mediation. Interpreting low scores on such measures as evidence of impairment in examinees with LEP would likely overestimate deficits.

  4. 48 CFR 245.7101-3 - DD Form 1348-1, DoD Single Line Item Release/Receipt Document.

    Science.gov (United States)

    2010-10-01

    ... PROPERTY Plant Clearance Forms 245.7101-3 DD Form 1348-1, DoD Single Line Item Release/Receipt Document. Use for shipments of excess industrial plant equipment and contractor inventory redistribution system...

  5. Multiple Choice English Grammar Test Items That Aid English Grammar Learning for Students of English as a Foreign Language

    OpenAIRE

    Adisutrisno, D. Wagiman

    2008-01-01

    In the teaching of English as a foreign language in Indonesia, the teaching and testing of English grammar are indispensable. To test English grammar mastery, the multiple choice test must be used due to its merit of guaranteeing the fulfillment of the content validity of achievement tests. Unfortunately, the construction of many multiple choice test items has not been based on a very important consideration to aid learning processes. This paper discusses the need to use multiple choice test ...

  6. Developing a Numerical Ability Test for Students of Education in Jordan: An Application of Item Response Theory

    Science.gov (United States)

    Abed, Eman Rasmi; Al-Absi, Mohammad Mustafa; Abu shindi, Yousef Abdelqader

    2016-01-01

    The purpose of the present study is to develop a test to measure the numerical ability of students of education. The sample of the study consisted of 504 students from 8 universities in Jordan. The final draft of the test contains 45 items distributed among 5 dimensions. The results revealed that acceptable psychometric properties of the test;…

  7. Diagnostic accuracy of a two-item Drug Abuse Screening Test (DAST-2).

    Science.gov (United States)

    Tiet, Quyen Q; Leyva, Yani E; Moos, Rudolf H; Smith, Brandy

    2017-11-01

    Drug use is prevalent and costly to society, but individuals with drug use disorders (DUDs) are under-diagnosed and under-treated, particularly in primary care (PC) settings. Drug screening instruments have been developed to identify patients with DUDs and facilitate treatment. The Drug Abuse Screening Test (DAST) is one of the most well-known drug screening instruments. However, similar to many such instruments, it is too long for routine use in busy PC settings. This study developed and validated a briefer and more practical DAST for busy PC settings. We recruited 1300 PC patients in two Department of Veterans Affairs (VA) clinics. Participants responded to a structured diagnostic interview. We randomly selected half of the sample to develop and the other half to validate the new instrument. We employed signal detection techniques to select the best DAST items to identify DUDs (based on the MINI) and negative consequences of drug use (measured by the Inventory of Drug Use Consequences). Performance indicators were calculated. The two-item DAST (DAST-2) was 97% sensitive and 91% specific for DUDs in the development sample and 95% sensitive and 89% specific in the validation sample. It was highly sensitive and specific for DUD and negative consequences of drug use in subgroups of patients, including gender, age, race/ethnicity, marital status, educational level, and posttraumatic stress disorder status. The DAST-2 is an appropriate drug screening instrument for routine use in PC settings in the VA and may be applicable in broader range of PC clinics.

  8. Branched Adaptive Testing with a Rasch-Model-Calibrated Test: Analysing Item Presentation's Sequence Effects Using the Rasch-Model-Based LLTM

    Science.gov (United States)

    Kubinger, Klaus D.; Reif, Manuel; Yanagida, Takuya

    2011-01-01

    Item position effects provoke serious problems within adaptive testing. This is because different testees are necessarily presented with the same item at different presentation positions, as a consequence of which comparing their ability parameter estimations in the case of such effects would not at all be fair. In this article, a specific…

  9. Factor analysis and cut-off score of the 26-item eating attitudes test in a Greek sample

    OpenAIRE

    ANGELIKI DOUKA; EIRINI GRAMMATOPOULOU; EMMANOUIL SKORDILIS; DIMITRA KOUTSOUKI

    2009-01-01

    Objective: The study examined the cross-cultural validity of the Eating Attitudes Test (EAT-26) in Greece, with 26 items under three subscales ('Dieting', 'Bulimia and Food Preoccupation', 'Oral Control'). Method: A total of 167 Greek undergraduate students (19 to 23 years old), and 20 female patients with Eating Disorders (13 to 42 years old) were examined with exploratory and confirmatory factor analysis. Results: The factor analysis of the EAT-26 revealed a 13 items EAT model, with the thr...

  10. Developing multiple-choices test items as tools for measuring the scientific-generic skills on solar system

    Science.gov (United States)

    Bhakti, Satria Seto; Samsudin, Achmad; Chandra, Didi Teguh; Siahaan, Parsaoran

    2017-05-01

    The aim of this research is to develop multiple-choice test items as tools for measuring scientific generic skills on the solar system. To achieve this aim, the researchers used the ADDIE model, consisting of Analysis, Design, Development, Implementation, and Evaluation, as the research method. The scientific generic skills were limited to five indicators: (1) indirect observation, (2) awareness of scale, (3) logical inference, (4) causal relation, and (5) mathematical modeling. The participants were 32 students at a junior high school in Bandung. The results show that the constructed multiple-choice test items were declared valid by the expert validator, and the tests show that the developed multiple-choice test items are able to measure scientific generic skills on the solar system.

  11. Developing and testing items for the South African Personality Inventory (SAPI)

    NARCIS (Netherlands)

    Hill, C.; Nel, J.A.; van de Vijver, F.J.R.; Meiring, D.; Valchev, V.H.; Adams, B.G.; de Bruin, G.P.

    2013-01-01

    Orientation: A multicultural country like South Africa needs fair cross-cultural psychometric instruments. Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for

  12. Developing and testing items for the South African Personality Inventory (SAPI)

    Directory of Open Access Journals (Sweden)

    Carin Hill

    2013-11-01

    Research purpose: This article reports on the process of identifying items for, and provides a quantitative evaluation of, the South African Personality Inventory (SAPI) items. Motivation for the study: The study intended to develop an indigenous and psychometrically sound personality instrument that adheres to the requirements of South African legislation and excludes cultural bias. Research design, approach and method: The authors used a cross-sectional design. They measured the nine SAPI clusters identified in the qualitative stage of the SAPI project in 11 separate quantitative studies. Convenience sampling yielded 6735 participants. Statistical analysis focused on the construct validity and reliability of items. The authors eliminated items that showed poor performance, based on common psychometric criteria, and selected the best performing items to form part of the final version of the SAPI. Main findings: The authors developed 2573 items from the nine SAPI clusters. Of these, 2268 items were valid and reliable representations of the SAPI facets. Practical/managerial implications: The authors developed a large item pool. It measures personality in South Africa. Researchers can refine it for the SAPI. Furthermore, the project illustrates an approach that researchers can use in projects that aim to develop culturally-informed psychological measures. Contribution/value-add: Personality assessment is important for recruiting, selecting and developing employees. This study contributes to the current knowledge about the early processes researchers follow when they develop a personality instrument that measures personality fairly in different cultural groups, as the SAPI does.

  13. De item-reeks van de cognitieve screening test vergeleken met die van de mini-mental state examination

    NARCIS (Netherlands)

    Schmand, B.; Deelman, B. G.; Hooijer, C.; Jonker, C.; Lindeboom, J.

    1996-01-01

    The items of the 'mini-mental state examination' (MMSE) and a Dutch dementia screening instrument, the 'cognitive screening test' (CST), as well as the 'geriatric mental status schedule' (GMS) and the 'Dutch adult reading test' (DART), were administered to 4051 elderly people aged 65 to 84 years.

  14. Validity and usefulness of a single-item measure of patient-reported bother from side effects of cancer therapy.

    Science.gov (United States)

    Pearman, Timothy P; Beaumont, Jennifer L; Mroczek, Daniel; O'Connor, Mary; Cella, David

    2018-03-01

    The improving efficacy of cancer treatment has resulted in an increasing array of treatment-related symptoms and associated burdens imposed on individuals undergoing aggressive treatment of their disease. Often, clinical trials compare therapies that have different types, and severities, of adverse effects. Whether rated by clinicians or patients themselves, it can be difficult to know which side effect profile is more disruptive or bothersome to patients. A simple summary index of bother can help to adjudicate the variability in adverse effects across treatments being compared with each other. Across 4 studies, a total of 5765 patients enrolled in cooperative group studies and industry-sponsored clinical trials were the subjects of the current study. Patients were diagnosed with a range of primary cancer sites, including bladder, brain, breast, colon/rectum, head/neck, hepatobiliary, kidney, lung, ovary, pancreas, and prostate as well as leukemia and lymphoma. All patients were administered the Functional Assessment of Cancer Therapy-General version (FACT-G). The single item "I am bothered by side effects of treatment" (GP5), rated on a 5-point Likert scale, is part of the FACT-G. To determine its validity as a useful summary measure from the patient perspective, it was correlated with individual and aggregated clinician-rated adverse events and patient reports of their general ability to enjoy life. Analyses of pharmaceutical trials demonstrated that mean GP5 scores ("I am bothered by side effects of treatment") significantly differed by maximum adverse event grade (P...). Effect sizes ranged from 0.13 to 0.46. Analyses of cooperative group trials demonstrated a significant correlation between GP5 and item GF3 ("I am able to enjoy life") in the predicted direction. The single FACT-G item "I am bothered by side effects of treatment" is significantly associated with clinician-reported adverse events and with patients' ability to enjoy their lives. It has promise as an

  15. Improving Single Event Effects Testing Through Software

    Science.gov (United States)

    Banker, M. W.

    2011-01-01

    Radiation encountered in space environments can be damaging to microelectronics and potentially cause spacecraft failure. Single event effects (SEE) are a type of radiation effect that occur when an ion strikes a device. Single event gate rupture (SEGR) is a type of SEE that can cause failure in power transistors. Unlike other SEE rates in which a constant linear energy transfer (LET) can be used, SEGR rates sometimes require a non-uniform LET to be used to be accurate. A recent analysis shows that SEGR rates are most easily calculated when the environment is described as a stopping rate per unit volume for each ion species. Stopping rates in silicon for pertinent ions were calculated using the Stopping and Range of Ions in Matter (SRIM) software and CREME-MC software. A reference table was generated and can be used by others to calculate SEGR rates for a candidate device. Additionally, lasers can be used to simulate SEEs, providing more control and information at lower cost than heavy ion testing. The electron/hole pair generation rate from a laser pulse in a semiconductor can be related to the LET of an ion. MATLAB was used to generate a plot to easily make this comparison.

  16. Testing Procedure for the Single Fiber Fragmentation Test

    DEFF Research Database (Denmark)

    Feih, Stefanie; Wonsyld, Karen; Minzari, Daniel

    This report describes the details of the single fiber fragmentation test as conducted at the materials research department (AFM) at Risø. The equipment and specimen manufacture is described in detail. Furthermore, examples of results interpretation are given. For the experiments in this report, specimens with one E-glass fiber placed inside an epoxy or polyester matrix were used. Elongating the specimens with a mini tensile tester, which was placed under a microscope, leads to fiber fragmentations. Different bonding strengths between fiber and matrix result in differences in the critical fracture length for the fiber and fracture characteristics.

  17. Development of an item bank and computer adaptive test for role functioning

    DEFF Research Database (Denmark)

    Anatchkova, Milena D; Rose, Matthias; Ware, John E

    2012-01-01

    Role functioning (RF) is a key component of health and well-being and an important outcome in health research. The aim of this study was to develop an item bank to measure impact of health on role functioning....

  18. The psychometric properties of the "Reading the Mind in the Eyes" Test: an item response theory (IRT) analysis.

    Science.gov (United States)

    Preti, Antonio; Vellante, Marcello; Petretto, Donatella R

    2017-05-01

    The "Reading the Mind in the Eyes" Test (hereafter: Eyes Test) is considered an advanced task of the Theory of Mind aimed at assessing the performance of the participant in perspective-taking, that is, the ability to sense or understand other people's cognitive and emotional states. In this study, the item response theory analysis was applied to the adult version of the Eyes Test. The Italian version of the Eyes Test was administered to 200 undergraduate students of both genders (males = 46%). Modified parallel analysis (MPA) was used to test unidimensionality. Marginal maximum likelihood estimation was used to fit the 1-, 2-, and 3-parameter logistic (PL) model to the data. Differential Item Functioning (DIF) due to gender was explored with five independent methods. MPA provided evidence in favour of unidimensionality. The Rasch model (1-PL) was superior to the other two models in explaining participants' responses to the Eyes Test. There was no robust evidence of gender-related DIF in the Eyes Test, although some differences may exist for some items as a reflection of real differences by group. The study results support a one-factor model of the Eyes Test. Performance on the Eyes Test is defined by the participant's ability in perspective-taking. Researchers should cease using arbitrarily selected subscores in assessing the performance of participants to the Eyes Test. Lack of gender-related DIF favours the use of the Eyes Test in the investigation of gender differences concerning empathy and social cognition.
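
    The 1-, 2-, and 3-parameter logistic models named in this abstract share one response function. The sketch below uses the standard textbook form, not the authors' code; the function name `irt_probability` and the parameter names (`theta` for ability, `b` for difficulty, `a` for discrimination, `c` for the guessing floor) are illustrative assumptions. Setting `a = 1` and `c = 0` gives the Rasch (1-PL) case.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the 3-parameter logistic model.

    theta: examinee ability
    b:     item difficulty
    a:     item discrimination (fixed at 1.0 in the Rasch / 1-PL model)
    c:     lower asymptote, i.e. guessing (fixed at 0.0 in the 1-PL and 2-PL models)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Rasch model: an examinee whose ability equals the item difficulty
rasch = irt_probability(theta=0.0, b=0.0)

# 3-PL model: same examinee and item, but with a guessing floor of 0.2
three_pl = irt_probability(theta=0.0, b=0.0, a=1.5, c=0.2)
```

    When ability equals difficulty the logistic term is one half, so the Rasch probability is 0.5 and the 3-PL probability lies halfway between the guessing floor and 1.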

  19. Item and Test Analysis to Identify Quality Multiple Choice Questions (MCQs) from an Assessment of Medical Students of Ahmedabad, Gujarat.

    Science.gov (United States)

    Gajjar, Sanju; Sharma, Rashmi; Kumar, Pradeep; Rana, Manish

    2014-01-01

    Multiple choice questions (MCQs) are frequently used to assess students in different educational streams for their objectivity and wide reach of coverage in less time. However, the MCQs to be used must be of quality which depends upon its difficulty index (DIF I), discrimination index (DI) and distracter efficiency (DE). To evaluate MCQs or items and develop a pool of valid items by assessing with DIF I, DI and DE and also to revise/ store or discard items based on obtained results. Study was conducted in a medical school of Ahmedabad. An internal examination in Community Medicine was conducted after 40 hours teaching during 1st MBBS which was attended by 148 out of 150 students. Total 50 MCQs or items and 150 distractors were analyzed. Data was entered and analyzed in MS Excel 2007 and simple proportions, mean, standard deviations, coefficient of variation were calculated and unpaired t test was applied. Out of 50 items, 24 had "good to excellent" DIF I (31 - 60%) and 15 had "good to excellent" DI (> 0.25). Mean DE was 88.6% considered as ideal/ acceptable and non functional distractors (NFD) were only 11.4%. Mean DI was 0.14. Poor DI (... students and some issues with framing of at least some of the MCQs. Increased proportion of NFDs (incorrect alternatives selected by students) in an item decrease DE and makes it easier. There were 15 items with 17 NFDs, while rest items did not have any NFD with mean DE of 100%. Study emphasizes the selection of quality MCQs which truly assess the knowledge and are able to differentiate the students of different abilities in correct manner.
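
    The three indices named in this abstract can be illustrated with a short sketch. The abstract does not give its formulas, so the code assumes a common textbook convention: difficulty and discrimination are computed from equally sized high- and low-scoring groups, and a distractor counts as non-functional when fewer than 5% of students choose it. The function names, the 5% threshold, and the example counts are all hypothetical.

```python
def item_statistics(upper_correct, lower_correct, group_size):
    """Return (difficulty index in percent, discrimination index) for one item.

    upper_correct / lower_correct: number of correct answers in the
    high-scoring and low-scoring groups, each containing group_size students.
    """
    total = 2 * group_size
    difficulty = 100.0 * (upper_correct + lower_correct) / total
    discrimination = (upper_correct - lower_correct) / group_size
    return difficulty, discrimination


def distractor_efficiency(distractor_counts, n_students, threshold=0.05):
    """Percentage of an item's distractors that are functional.

    A distractor is treated as non-functional (an assumed convention) when
    it is chosen by fewer than `threshold` of all students.
    """
    non_functional = sum(
        1 for count in distractor_counts if count / n_students < threshold
    )
    functional = len(distractor_counts) - non_functional
    return 100.0 * functional / len(distractor_counts)


# Hypothetical item: 30 of 40 high scorers and 18 of 40 low scorers answer correctly
difficulty, discrimination = item_statistics(30, 18, 40)

# Hypothetical item with three distractors chosen by 40, 5 and 3 of 148 students
efficiency = distractor_efficiency([40, 5, 3], 148)
```

    Under these assumptions the first item has a difficulty index of 60% and a discrimination index of 0.3, and the second item has two non-functional distractors out of three.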

  20. The development and validation of a novel outcome measure to quantify mobility in the dysvascular lower extremity amputee: the amputee single item mobility measure

    Science.gov (United States)

    Norvell, Daniel C; Williams, Rhonda M; Turner, Aaron P; Czerniecki, Joseph M

    2016-01-01

    Objective: This study describes the development and psychometric evaluation of a novel patient-reported single-item mobility measure. Design: Prospective cohort study. Setting: Four Veteran’s Administration Medical Centers. Subjects: Individuals undergoing their first major unilateral lower extremity amputation; 198 met inclusion criteria; of these, 113 (57%) enrolled. Interventions: None. Main measures: The Amputee Single Item Mobility Measure, a single item measure with scores ranging from 0 to 6, was developed by an expert panel, and concurrently administered with the Locomotor Capabilities Index-5 (LCI-5) and other outcome measures at six weeks, four months, and 12 months post-amputation. Criterion and construct validity, responsiveness, and floor/ceiling effects were evaluated. Responsiveness was assessed using the standardized response mean. Results: The overall mean 12-month Amputee Single Item Mobility Measure score was 3.39 ±1.4. Scores for transmetatarsal, transtibial, and transfemoral amputees were 4.2 (±1.3), 3.2 (±1.5), and 2.9 (±1.1), respectively. Amputee Single Item Mobility Measure scores demonstrated “large” and statistically significant correlations with the LCI-5 scores at six weeks (r = 0.72), four months (r = 0.81), and 12 months (r = 0.86). At four months and 12 months, the correlation between Amputee Single Item Mobility Measure scores and hours of prosthetic use were r = 0.69 and r = 0.66, respectively, and between Amputee Single Item Mobility Measure scores and Trinity Amputation and Prosthesis Experience Scales functional restriction scores were r = 0.45 and r = 0.67, respectively. Amputee Single Item Mobility Measure scores increased significantly from six weeks to 12 months post-amputation. Minimal floor/ceiling effects were demonstrated. Conclusions: In the unilateral dysvascular amputee, the Amputee Single Item Mobility Measure has strong criterion and construct validity, excellent

  1. Practical Consequences of Item Response Theory Model Misfit in the Context of Test Equating with Mixed-Format Test Data.

    Science.gov (United States)

    Zhao, Yue; Hambleton, Ronald K

    2017-01-01

    In item response theory (IRT) models, assessing model-data fit is an essential step in IRT calibration. While no general agreement has ever been reached on the best methods or approaches to use for detecting misfit, perhaps the more important comment based upon the research findings is that rarely does the research evaluate IRT misfit by focusing on the practical consequences of misfit. The study investigated the practical consequences of IRT model misfit in examining the equating performance and the classification of examinees into performance categories in a simulation study that mimics a typical large-scale statewide assessment program with mixed-format test data. The simulation study was implemented by varying three factors, including choice of IRT model, amount of growth/change of examinees' abilities between two adjacent administration years, and choice of IRT scaling methods. Findings indicated that the extent of significant consequences of model misfit varied over the choice of model and IRT scaling methods. In comparison with mean/sigma (MS) and Stocking and Lord characteristic curve (SL) methods, separate calibration with linking and fixed common item parameter (FCIP) procedure was more sensitive to model misfit and more robust against various amounts of ability shifts between two adjacent administrations regardless of model fit. SL was generally the least sensitive to model misfit in recovering equating conversion and MS was the least robust against ability shifts in recovering the equating conversion when a substantial degree of misfit was present. The key messages from the study are that practical ways are available to study model fit, and, model fit or misfit can have consequences that should be considered when choosing an IRT model. Not only does the study address the consequences of IRT model misfit, but also it is our hope to help researchers and practitioners find practical ways to study model fit and to investigate the validity of particular IRT

  2. Differential Item Functioning in While-Listening Performance Tests: The Case of the International English Language Testing System (IELTS) Listening Module

    Science.gov (United States)

    Aryadoust, Vahid

    2012-01-01

    This article investigates a version of the International English Language Testing System (IELTS) listening test for evidence of differential item functioning (DIF) based on gender, nationality, age, and degree of previous exposure to the test. Overall, the listening construct was found to be underrepresented, which is probably an important cause…

  3. An evaluation of computerized adaptive testing for general psychological distress: combining GHQ-12 and Affectometer-2 in an item bank for public mental health research.

    Science.gov (United States)

    Stochl, Jan; Böhnke, Jan R; Pickett, Kate E; Croudace, Tim J

    2016-05-20

    Recent developments in psychometric modeling and technology allow pooling well-validated items from existing instruments into larger item banks and their deployment through methods of computerized adaptive testing (CAT). Use of item response theory-based bifactor methods and integrative data analysis overcomes barriers in cross-instrument comparison. This paper presents the joint calibration of an item bank for researchers keen to investigate population variations in general psychological distress (GPD). Multidimensional item response theory was used on existing health survey data from the Scottish Health Education Population Survey (n = 766) to calibrate an item bank consisting of pooled items from the short common mental disorder screen (GHQ-12) and the Affectometer-2 (a measure of "general happiness"). Computer simulation was used to evaluate usefulness and efficacy of its adaptive administration. A bifactor model capturing variation across a continuum of population distress (while controlling for artefacts due to item wording) was supported. The numbers of items for different required reliabilities in adaptive administration demonstrated promising efficacy of the proposed item bank. Psychometric modeling of the common dimension captured by more than one instrument offers the potential of adaptive testing for GPD using individually sequenced combinations of existing survey items. The potential for linking other item sets with alternative candidate measures of positive mental health is discussed since an optimal item bank may require even more items than these.

  4. Assessment of chromium(VI) release from 848 jewellery items by use of a diphenylcarbazide spot test

    DEFF Research Database (Denmark)

    Bregnbak, David; Johansen, Jeanne D.; Hamann, Dathan

    2016-01-01

    We recently evaluated and validated a diphenylcarbazide(DPC)-based screening spot test that can detect the release of chromium(VI) ions (≥0.5 ppm) from various metallic items and leather goods (1). We then screened a selection of metal screws, leather shoes, and gloves, as well as 50 earrings...

  5. Biological Science: An Ecological Approach. BSCS Green Version. Teacher's Resource Book and Test Item Bank. Sixth Edition.

    Science.gov (United States)

    Biological Sciences Curriculum Study, Colorado Springs.

    This book consists of four sections: (1) "Supplemental Materials"; (2) "Supplemental Investigations"; (3) "Test Item Bank"; and (4) "Blackline Masters." The first section provides additional background material related to selected chapters and investigations in the student book. Included are a periodic table of the elements, genetics problems and…

  6. Differential Item Functioning Assessment in Cognitive Diagnostic Modeling: Application of the Wald Test to Investigate DIF in the DINA Model

    Science.gov (United States)

    Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna

    2014-01-01

    Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…

  7. Probabilistic Approaches to Examining Linguistic Features of Test Items and Their Effect on the Performance of English Language Learners

    Science.gov (United States)

    Solano-Flores, Guillermo

    2014-01-01

    This article addresses validity and fairness in the testing of English language learners (ELLs)--students in the United States who are developing English as a second language. It discusses limitations of current approaches to examining the linguistic features of items and their effect on the performance of ELL students. The article submits that…

  8. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    Science.gov (United States)

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  9. Cognitive testing of tobacco use items for administration to patients with cancer and cancer survivors in clinical research.

    Science.gov (United States)

    Land, Stephanie R; Warren, Graham W; Crafts, Jennifer L; Hatsukami, Dorothy K; Ostroff, Jamie S; Willis, Gordon B; Chollette, Veronica Y; Mitchell, Sandra A; Folz, Jasmine N M; Gulley, James L; Szabo, Eva; Brandon, Thomas H; Duffy, Sonia A; Toll, Benjamin A

    2016-06-01

    To the authors' knowledge, there are currently no standardized measures of tobacco use and secondhand smoke exposure in patients diagnosed with cancer, and this gap hinders the conduct of studies examining the impact of tobacco on cancer treatment outcomes. The objective of the current study was to evaluate and refine questionnaire items proposed by an expert task force to assess tobacco use. Trained interviewers conducted cognitive testing with cancer patients aged ≥21 years with a history of tobacco use and a cancer diagnosis of any stage and organ site who were recruited at the National Institutes of Health Clinical Center in Bethesda, Maryland. Iterative rounds of testing and item modification were conducted to identify and resolve cognitive issues (comprehension, memory retrieval, decision/judgment, and response mapping) and instrument navigation issues until no items warranted further significant modification. Thirty participants (6 current cigarette smokers, 1 current cigar smoker, and 23 former cigarette smokers) were enrolled from September 2014 to February 2015. The majority of items functioned well. However, qualitative testing identified wording ambiguities related to cancer diagnosis and treatment trajectory, such as "treatment" and "surgery"; difficulties with lifetime recall; errors in estimating quantities; and difficulties with instrument navigation. Revisions to item wording, format, order, response options, and instructions resulted in a questionnaire that demonstrated navigational ease as well as good question comprehension and response accuracy. The Cancer Patient Tobacco Use Questionnaire (C-TUQ) can be used as a standardized item set to accelerate the investigation of tobacco use in the cancer setting. Cancer 2016;122:1728-34. © 2016 American Cancer Society. © 2016 American Cancer Society.

  10. 49 CFR 238.311 - Single car test.

    Science.gov (United States)

    2010-10-01

    49 CFR, Requirements for Tier I Passenger Equipment, § 238.311 Single car test. (a) Except for self-propelled passenger cars, single car tests of all passenger cars and all unpowered vehicles used in passenger trains shall...

  11. Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items

    Science.gov (United States)

    Michaelides, Michalis P.; Haertel, Edward H.

    2014-01-01

    The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…

  12. Effect of Item Response Theory (IRT) Model Selection on Testlet-Based Test Equating. Research Report. ETS RR-14-19

    Science.gov (United States)

    Cao, Yi; Lu, Ru; Tao, Wei

    2014-01-01

    The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…

  13. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients

    DEFF Research Database (Denmark)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J. B.

    2017-01-01

    on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). METHODS: In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients...

  14. Dynamic Testing of Analogical Reasoning in 5- to 6-Year-Olds: Multiple-Choice versus Constructed-Response Training Items

    Science.gov (United States)

    Stevenson, Claire E.; Heiser, Willem J.; Resing, Wilma C. M.

    2016-01-01

    Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC items leads to differences in the strategy…

  15. Threats to Validity When Using Open-Ended Items in International Achievement Studies: Coding Responses to the PISA 2012 Problem-Solving Test in Finland

    Science.gov (United States)

    Arffman, Inga

    2016-01-01

    Open-ended (OE) items are widely used to gather data on student performance in international achievement studies. However, several factors may threaten validity when using such items. This study examined Finnish coders' opinions about threats to validity when coding responses to OE items in the PISA 2012 problem-solving test. A total of 6…

  16. Re-Examining Test Item Issues in the TIMSS Mathematics and Science Assessments

    Science.gov (United States)

    Wang, Jianjun

    2011-01-01

    As the largest international study ever taken in history, the Trend in Mathematics and Science Study (TIMSS) has been held as a benchmark to measure U.S. student performance in the global context. In-depth analyses of the TIMSS project are conducted in this study to examine key issues of the comparative investigation: (1) item flaws in mathematics…

  17. A 6-item scale for overall, emotional and social loneliness: confirmatory tests on survey data

    NARCIS (Netherlands)

    de Jong Gierveld, J.; van Tilburg, T.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for

  18. Industrial Arts Test Development Book 2: Resource Items for Ceramics, Graphic Arts, Metals, Plastics.

    Science.gov (United States)

    New York State Education Dept., Albany. Bureau of Industrial Arts Education.

    This publication encompasses questions for Ceramics, Graphic Arts, Metals, and Plastics for the second of a series. The use of this publication and the previously published (1973) book containing resource items for Drawing, Electricity/Electronics, Power Mechanics, and Woods (ED 109 457) will provide complete coverage of the basic series courses…

  19. A six-item scale for overall, emotional and social loneliness: Confirmatory tests on survey

    NARCIS (Netherlands)

    de Jong-Gierveld, J.; van Tilburg, T.G.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for

  20. A 6-item scale for overall, emotional, and social loneliness: Confirmatory tests on survey data

    NARCIS (Netherlands)

    de Jong-Gierveld, J.; van Tilburg, T.G.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for

  1. Evaluating the Wald Test for Item-Level Comparison of Saturated and Reduced Models in Cognitive Diagnosis

    Science.gov (United States)

    de la Torre, Jimmy; Lee, Young-Sun

    2013-01-01

    This article used the Wald test to evaluate the item-level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G-DINA model. Results show that when the sample size is small and a…

  2. Using existing questionnaires in latent class analysis: should we use summary scores or single items as input? A methodological study using a cohort of patients with low back pain

    Directory of Open Access Journals (Sweden)

    Nielsen AM

    2016-04-01

    of more subgroups and more distinct clinical characteristics. Conclusion: In these data, application of both the summary-score strategy and the single-item strategy in the LCA subgrouping resulted in clinically interpretable subgroups, but the single-item strategy generally revealed more distinguishing characteristics. These results (1) warrant further analyses in other data sets to determine the consistency of this finding, and (2) warrant investigation in longitudinal data to test whether the finer detail provided by the single-item strategy results in improved prediction of outcomes and treatment response. Keywords: classification, data mining, subgrouping, clinical interpretability, questionnaire, low back pain

  3. Test Equating of the Medical Licensing Examination in 2003 and 2004 Based on the Item Response Theory

    Directory of Open Access Journals (Sweden)

    Mi Kyoung Yim

    2006-07-01

    The passing rate of the Medical Licensing Examination has been variable, which probably originated from the difference in the difficulty of items and/or difference in the ability level of examinees. We tried to explain the origin of the difference using the test equating method based on the item response theory. The numbers of items and examinees were 500 and 3,647 in 2003 and 550 and 3,879 in 2004. Common item nonequivalent group design was used for 30 common items. Item and ability parameters were calculated by three-parameter logistic models using ICL. Scale transformation and true score equating were executed using ST and PIE. The mean of difficulty index of the year 2003 was ??.957 (SD 2.628) and that of 2004 after equating was ??.456 (SD 3.399). The mean of discrimination index of year 2003 was 0.487 (SD 0.242) and that of 2004 was 0.363 (SD 0.193). The mean of ability parameter of year 2003 was 0.00617 (SD 0.96605) and that of year 2004 was 0.94636 (SD 1.32960). The difference of the equated true score at the same ability level was high at the range of score of 200??50. The reason for the difference in passing rates over two consecutive years was due to the fact that the Examination in 2004 was easier and the abilities of the examinees in 2004 were higher. In addition, the passing rates of examinees with score of 270??94 in 2003, and those with 322??43 in 2004, were affected by the examination year.

  4. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

    Science.gov (United States)

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

    2017-11-01

    The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim to enhance measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes with about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

  5. Open Single Item of Perceived Risk Factors (OSIPRF) toward Cardiovascular Diseases Is an Appropriate Instrument for Evaluating Psychological Symptoms

    Directory of Open Access Journals (Sweden)

    Mozhgan Saeidi

    2016-12-01

    Psychological symptoms are considered one of the aspects and consequences of cardiovascular diseases (CVDs), management of which can hasten and facilitate the process of recovery. Evaluation of the psychological symptoms can increase the treatment team's awareness of patients’ mental health, which can be beneficial for designing treatment programs (1). However, the time-consuming process of interviews and assessment by questionnaires leads to fatigue and lack of patient cooperation, which may be problematic for healthcare evaluators. Therefore, the use of brief and suitable alternatives is always recommended. The use of practical and easy-to-implement instruments is constantly emphasized. A practical method for assessing patients' psychological status is examining causal beliefs and attitudes about the disease. The causal beliefs and perceived risk factors by patients, which are significantly related to the actual risk factors for CVDs (2), are not only related to psychological adjustment and mental health but also have an impact on patients’ compliance with treatment recommendations (3). It seems that several factors are at play regarding the perceived risk factors for CVDs, such as gender (4), age (5), and most importantly, patients’ psychological status (3). Accordingly, evaluation of causal beliefs and perceived risk factors by patients could probably be a shortcut method for evaluation of patients’ psychological health. In recent years, Saeidi and Komasi (5) proposed a question and investigated the perceived risk factors with an open single item: “What do you think is the main cause of your illness?”. According to the authors, the perceived risk factors are recorded in five categories including biological (age, gender, and family history), environmental (dust, smoke, passive smoking, toxic substances, and effects of war), physiological (diabetes, hypertension, hyperlipidemia, and obesity), behavioral (lack of exercise, nutrition

  6. Construct validity and responsiveness of the single-item presenteeism question in patients with lower back pain for the measurement of presenteeism.

    Science.gov (United States)

    Kigozi, Jesse; Lewis, Martyn; Jowett, Sue; Barton, Pelham; Coast, Joanna

    2014-03-01

    Validity and responsiveness study using a randomized clinical trial and prospective cohort study of patients with low back pain (LBP). To provide evidence for construct validity and responsiveness to change of a single-item presenteeism question (SIPQ) in patients with LBP. The SIPQ is a simple, easy to administer tool that has been used to measure the impact of back pain on reduced productivity at work (presenteeism) as a standalone measure. Evidence supporting the validity and responsiveness of the SIPQ among patients with back pain is, however, lacking. The SIPQ was administered to patients consulting for back pain in a randomized controlled trial (N = 851) and a cohort intervention study (N = 922). Construct validity was assessed using convergent, divergent, and known-group validity. The validity investigation included assessing associations between the SIPQ and pain, disability, psychological, health status, and quality of life measures. Responsiveness was assessed using external indicators of change as comparators, evaluating correlation of clinical change scores and effect size statistics. Moderate to strong correlations were found between presenteeism and pain (r: 0.44-0.77), disability (r: 0.53-0.70), and 12-Item Short Form Health Survey physical dimensions (r: -0.66 to -0.55). Presenteeism was strongly associated with disease-specific pain and disability scales. The SIPQ was responsive to changes in productivity; presenteeism change scores indicated strong correlation with change scores, and high responsiveness in distribution- and anchor-based testing. The SIPQ is a potentially valid and responsive tool for assessing the impact of back pain on presenteeism. This SIPQ could, with relative ease, facilitate further research on the estimation of presenteeism within economic evaluation studies of musculoskeletal conditions, thus providing policymakers with estimates of economic impact of musculoskeletal disease. Further evidence is, however, merited to assess

  7. Single Event Testing on Complex Devices: Test Like You Fly versus Test-Specific Design Structures

    Science.gov (United States)

    Berg, Melanie; LaBel, Kenneth A.

    2014-01-01

    We present a framework for evaluating complex digital systems targeted for harsh radiation environments such as space. Focus is limited to analyzing the single event upset (SEU) susceptibility of designs implemented inside Field Programmable Gate Array (FPGA) devices. Tradeoffs are provided between application-specific versus test-specific test structures.

  8. 'Do you think you suffer from depression?' Reevaluating the use of a single item question for the screening of depression in older primary care patients

    DEFF Research Database (Denmark)

    Ayalon, Liat; Goldfracht, Margalit; Bech, Per

    2010-01-01

    OBJECTIVES: The majority of older adults seek depression treatment in primary care. Despite impressive efforts to integrate depression treatment into primary care, depression often remains undetected. The overall goal of the present study was to compare a single item screening for depression to existing depression screening tools. METHODS: A cross sectional sample of 153 older primary care patients. Participants completed several depression-screening measures (e.g. a single depression screen, Patient Health Questionnaire-9, Major Depression Inventory, Visual Analogue Scale). Measures were......: An easy way to detect depression in older primary care patients would be asking the single question, 'do you think you suffer from depression?'...

  9. Item Response Theory to analyze a statistics test in university students

    OpenAIRE

    Vendramini, Claudette Maria Medeiros; Dias, Anelise Silva

    2005-01-01

    This study aimed to apply Item Response Theory to the analysis of the 15 multiple-choice questions of a statistics test presented in the form of graphs or statistical tables. The participants were 413 university students, selected by convenience, from two private higher education institutions, predominantly from the Psychology program (91.5%). The students were 80% female and in daytime classes (69.8%), aged 16 to 53 years, mean 24.4 and...

  10. Tests of the validity of a model relating frequency of contaminated items and increasing radiation dose

    International Nuclear Information System (INIS)

    Tallentire, A.; Khan, A.A.

    1975-01-01

    The ⁶⁰Co radiation response of Bacillus pumilus E601 spores has been characterized when present in a laboratory test system. The suitability of test vessels to act as both containers for irradiation and culture vessels in sterility testing has been checked. Tests have been done with these spores to verify assumptions basic to the general model described in a previous paper. First measurements indicate that the model holds with this laboratory test system. (author)

  11. The differential item functioning and structural equivalence of a nonverbal cognitive ability test for five language groups

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2011-10-01

    Research purpose: The aim of the study was to determine the differential item functioning (DIF) and structural equivalence of a nonverbal cognitive ability test (the PiB/SpEEx Observance test [401]) for five South African language groups. Motivation for study: Cultural and language group sensitive tests can lead to unfair discrimination, which is a contentious workplace issue in South Africa today. Misconceptions about psychometric testing in industry can cause tests to lose credibility if industries do not use a scientifically sound test-by-test evaluation approach. Research design, approach and method: The researcher used a quasi-experimental design and factor analytic and logistic regression techniques to meet the research aims. The study used a convenience sample drawn from industry and an educational institution. Main findings: The main findings of the study show structural equivalence of the test at a holistic level and nonsignificant DIF effect sizes for most of the comparisons that the researcher made. Practical/managerial implications: This research shows that the PIB/SpEEx Observance Test (401) is not completely language insensitive. One should see it rather as a language-reduced test when people from different language groups need testing. Contribution/value-add: The findings provide supporting evidence that nonverbal cognitive tests are plausible alternatives to verbal tests when one compares people from different language groups.

  12. Towards single screening tests for brucellosis

    DEFF Research Database (Denmark)

    Nielsen, K.; Smith, P.; Yu, W.

    2005-01-01

    This paper describes an indirect enzyme-linked immunosorbent assay (I-ELISA) and a fluorescence polarisation assay (FPA), each capable of detecting antibody in several species of hosts to smooth and rough members of the genus Brucella. The I-ELISA uses a mixture of smooth lipopolysaccharide (SLPS...... than did I-ELISA procedures using each individual antigen separately. Similarly, the assay using combined antigens detected antibody in slightly fewer animals not exposed to Brucella sp. When a universal cutoff of 10% positivity was used (relative to strongly positive control sera of each species......-ELISA and the FPA with combined antigens were suitable as screening tests for all species of Brucella in the animal species tested....

  13. Developing Testing Accommodations for English Language Learners: Illustrations as Visual Supports for Item Accessibility

    Science.gov (United States)

    Solano-Flores, Guillermo; Wang, Chao; Kachchaf, Rachel; Soltero-Gonzalez, Lucinda; Nguyen-Le, Khanh

    2014-01-01

    We address valid testing for English language learners (ELLs)--students in the United States who are schooled in English while they are still acquiring English as a second language. Also, we address the need for procedures for systematically developing ELL testing accommodations--changes in tests intended to support ELLs to gain access to the…

  14. On the issue of item selection in computerized adaptive testing with response times

    NARCIS (Netherlands)

    Veldkamp, Bernard P.

    2016-01-01

    Many standardized tests are now administered via computer rather than paper-and-pencil format. The computer-based delivery mode brings with it certain advantages. One advantage is the ability to adapt the difficulty level of the test to the ability level of the test taker in what has been termed

  15. Evaluation of the single-item self-rating adherence scale for use in routine clinical care of people living with HIV.

    Science.gov (United States)

    Feldman, B J; Fredericksen, R J; Crane, P K; Safren, S A; Mugavero, M J; Willig, James H; Simoni, J M; Wilson, I B; Saag, M S; Kitahata, M M; Crane, H M

    2013-01-01

    The self-rating scale item (SRSI) is a single-item self-report adherence measure that uses adjectives in a 5-point Likert scale, from "very poor" to "excellent," to describe medication adherence over the past 4 weeks. This study investigated the SRSI in 2,399 HIV-infected patients in routine care at two outpatient primary HIV clinics. Correlations between the SRSI and four commonly used adherence items ranged from 0.37 to 0.64. Correlations of adherence barriers, such as depression and substance use, were comparable across all adherence items. General estimating equations suggested the SRSI is as good as or better than other adherence items (p's <0.001 vs. <0.001-0.99) at predicting adherence-related clinical outcomes, such as HIV viral load and CD4(+) cell count. These results and the SRSI's low patient burden suggest its routine use could be helpful for assessing adherence in clinical care and should be more widespread, particularly where more complex instruments may be impractical.

  16. Testing the discriminant validity of Schwartz' Portrait Value Questionnaire items – A replication and extension of Knoppen and Saris (2009)

    Directory of Open Access Journals (Sweden)

    Constanze Beierlein

    2012-04-01

    Schwartz' theory of ten basic human values has stimulated numerous studies using a variety of instruments. Confirmatory factor analyses (CFA) of the properties of some of the instruments have revealed that three pairs of values were excessively highly correlated. This led Davidov et al. (2008) to propose unifying values. To overcome the problems of loss of precision due to unifying distinct values, Knoppen and Saris (2009a,b) investigated the factorial structure of each of the ten values measured with the PVQ (Schwartz et al. 2001). They identified both cross-loadings and distinct sub-dimensions for the pairs of nondiscriminated values in two German student samples. They concluded that the original strategy for selecting items, maximizing theoretical coverage at the expense of item homogeneity, produced the poor discrimination between values. Our Study 1 examines whether the Knoppen and Saris findings generalize to a representative sample of the German population. With some notable exceptions, our findings replicate theirs. Study 2 uses 33 items from an experimental version of the PVQ to operationalize and test a full model of the 11 basic values. Following Knoppen and Saris, we included only one sub-dimension of each of the 11 values. This CFA model yielded a satisfactory fit with no estimation problems. We conclude that available indicators permit measuring the distinct values without the need to collapse factors. Limitations and implications of the research are discussed.

  17. Performance of Accounting students on the Enade/2012 test: an application of the Item-Response Theory

    Directory of Open Access Journals (Sweden)

    Raphael Vinicius Weigert Camargo

    2016-08-01

    The objective of this study was to measure Accounting students’ performance (proficiency) on the Enade test using Item Response Theory (IRT). The students’ performance was measured using the three-parameter logistic model (3PL), based on data related to the Enade test/2012, taken from the website of the National Institute for Educational Studies and Research Anísio Teixeira (Inep), concerning 47,098 students. Through the scale, three levels of student performance could be distinguished. Level 1 students master the reading and interpretation of texts and quantitative reasoning. In addition, Level 2 students should present logical reasoning and a systemic and holistic perspective. Furthermore, at Level 3, students should present interdisciplinary knowledge, covering accounting contents, critical-analytic skills and practical application of the content mastered. The results also indicated that the items of the Enade test were very difficult for the group that took the test. Independently of the student characteristics analyzed, overall, the proficiency scores were very low. This result suggests that higher education institutions (HEI) need to take action and that public policies are needed that can contribute to improving the students’ performance.

  18. Results of wholesomeness test on basic plan of research and development of food irradiation (7 items)

    International Nuclear Information System (INIS)

    Furuya, Tsuyoshi

    1989-01-01

    Twenty years have elapsed since general research on food irradiation began in Japan as a new technology for food preservation, and research on the wholesomeness of irradiated foods has been carried out over a wide range together with research on irradiation effects, irradiation techniques and economic efficiency. The wholesomeness of irradiated foods covers chronic toxicity, including carcinogenicity, under continuous intake over a long period; effects on reproductive function over many generations and the possibility of hereditary injury to cells; the nutritional adequacy required to sustain life and improve health; and microbiological safety. In Japan, research on food irradiation was designated as an atomic energy specific general research project, and the objects of research selected were potato and onion for the prevention of germination, rice and wheat for protection from noxious insects, and fish paste products, wienerwurst and mandarin orange for sterilization. Co-60 gamma rays were used for irradiation, except for mandarin orange, for which an electron beam was used. The research on all 7 items was finished, and the irradiation of potato was permitted. (K.I.)

  19. Test Equating under the NEAT Design: A Necessary Condition for Anchor Items

    Science.gov (United States)

    Raykov, Tenko

    2010-01-01

    Mroch, Suh, Kane, & Ripkey (2009); Suh, Mroch, Kane, & Ripkey (2009); and Kane, Mroch, Suh, & Ripkey (2009) provided elucidating discussions on critical properties of linear equating methods under the nonequivalent groups with anchor test (NEAT) design. In this popular equating design, two test forms are administered to different…

  20. Explanatory Item Response Modeling of Children's Change on a Dynamic Test of Analogical Reasoning

    Science.gov (United States)

    Stevenson, Claire E.; Hickendorff, Marian; Resing, Wilma C. M.; Heiser, Willem J.; de Boeck, Paul A. L.

    2013-01-01

    Dynamic testing is an assessment method in which training is incorporated into the procedure with the aim of gauging cognitive potential. Large individual differences are present in children's ability to profit from training in analogical reasoning. The aim of this experiment was to investigate sources of these differences on a dynamic test of…

  1. The differential item functioning and structural equivalence of a nonverbal cognitive ability test for five language groups

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2011-03-01

    Orientation: For a number of years, eliminating a language component in testing by using nonverbal cognitive tests has been proposed as a possible solution to the effect of groups’ languages (mother tongues or first languages) on test performance. This is particularly relevant in South Africa with its 11 official languages. Research purpose: The aim of the study was to determine the differential item functioning (DIF) and structural equivalence of a nonverbal cognitive ability test (the PiB/SpEEx Observance test [401]) for five South African language groups. Motivation for study: Cultural and language group sensitive tests can lead to unfair discrimination, which is a contentious workplace issue in South Africa today. Misconceptions about psychometric testing in industry can cause tests to lose credibility if industries do not use a scientifically sound test-by-test evaluation approach. Research design, approach and method: The researcher used a quasi-experimental design and factor analytic and logistic regression techniques to meet the research aims. The study used a convenience sample drawn from industry and an educational institution. Main findings: The main findings of the study show structural equivalence of the test at a holistic level and nonsignificant DIF effect sizes for most of the comparisons that the researcher made. Practical/managerial implications: This research shows that the PIB/SpEEx Observance Test (401) is not completely language insensitive. One should see it rather as a language-reduced test when people from different language groups need testing. Contribution/value-add: The findings provide supporting evidence that nonverbal cognitive tests are plausible alternatives to verbal tests when one compares people from different language groups.

  2. Testing the robustness of deterministic models of optimal dynamic pricing and lot-sizing for deteriorating items under stochastic conditions

    DEFF Research Database (Denmark)

    Ghoreishi, Maryam

    2018-01-01

    Many models in the field of optimal dynamic pricing and lot-sizing for deteriorating items assume everything is deterministic and develop a differential equation as the core of analysis. Two prominent examples are the papers by Rajan et al. (Manag Sci 38:240–262, 1992) and Abad (Manag Sci 42:1093–1104, 1996). To our knowledge, nobody has ever tested whether the optimal solutions obtained in those papers are valid if the real system is exposed to randomness: with regard to demand process as well as with regard to the deterioration process. The motivation is that although the real...

  3. A test of the International Personality Item Pool representation of the Revised NEO Personality Inventory and development of a 120-item IPIP-based measure of the five-factor model.

    Science.gov (United States)

    Maples, Jessica L; Guan, Li; Carter, Nathan T; Miller, Joshua D

    2014-12-01

    There has been a substantial increase in the use of personality assessment measures constructed using items from the International Personality Item Pool (IPIP) such as the 300-item IPIP-NEO (Goldberg, 1999), a representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992). The IPIP-NEO is free to use and can be modified to accommodate its users' needs. Despite the substantial interest in this measure, there is still a dearth of data demonstrating its convergence with the NEO PI-R. The present study represents an investigation of the reliability and validity of scores on the IPIP-NEO. Additionally, we used item response theory (IRT) methodology to create a 120-item version of the IPIP-NEO. Using an undergraduate sample (n = 359), we examined the reliability, as well as the convergent and criterion validity, of scores from the 300-item IPIP-NEO, a previously constructed 120-item version of the IPIP-NEO (Johnson, 2011), and the newly created IRT-based IPIP-120 in comparison to the NEO PI-R across a range of outcomes. Scores from all 3 IPIP measures demonstrated strong reliability and convergence with the NEO PI-R and a high degree of similarity with regard to their correlational profiles across the criterion variables (rICC = .983, .972, and .976, respectively). The replicability of these findings was then tested in a community sample (n = 757), and the results closely mirrored the findings from Sample 1. These results provide support for the use of the IPIP-NEO and both 120-item IPIP-NEO measures as assessment tools for measurement of the five-factor model. (c) 2014 APA, all rights reserved.

  4. Dynamic Testing of Analogical Reasoning in 5- to 6-Year-Olds : Multiple-Choice Versus Constructed-Response Training Items

    NARCIS (Netherlands)

    Stevenson, C.E.; Heiser, W.J.; Resing, W.C.M.

    2016-01-01

    Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC

  5. Barriers and benefits to desired behaviors for single use plastic items in northeast Ohio's Lake Erie basin.

    Science.gov (United States)

    Bartolotta, Jill F; Hardy, Scott D

    2018-02-01

    Given the growing saliency of plastic marine debris, and the impact of plastics on beaches and aquatic environments in the Laurentian Great Lakes, applied research is needed to support municipal and nongovernmental campaigns to prevent debris from reaching the water's edge. This study addresses this need by examining the barriers and benefits to positive behavior for two plastic debris items in northeast Ohio's Lake Erie basin: plastic bags and plastic water bottles. An online survey is employed to gather data on the use and disposal of these plastic items and to solicit recommendations on how to positively change behavior to reduce improper disposal. Results support a ban on plastic bags and plastic water bottles, with more enthusiasm for a bag ban. Financial incentives are also seen as an effective way to influence behavior change, as are location-specific solutions focused on education and outreach. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement.

    Science.gov (United States)

    McInnes, Matthew D F; Moher, David; Thombs, Brett D; McGrath, Trevor A; Bossuyt, Patrick M; Clifford, Tammy; Cohen, Jérémie F; Deeks, Jonathan J; Gatsonis, Constantine; Hooft, Lotty; Hunt, Harriet A; Hyde, Christopher J; Korevaar, Daniël A; Leeflang, Mariska M G; Macaskill, Petra; Reitsma, Johannes B; Rodin, Rachel; Rutjes, Anne W S; Salameh, Jean-Paul; Stevens, Adrienne; Takwoingi, Yemisi; Tonelli, Marcello; Weeks, Laura; Whiting, Penny; Willis, Brian H

    2018-01-23

    Systematic reviews of diagnostic test accuracy synthesize data from primary diagnostic studies that have evaluated the accuracy of 1 or more index tests against a reference standard, provide estimates of test performance, allow comparisons of the accuracy of different tests, and facilitate the identification of sources of variability in test accuracy. To develop the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagnostic test accuracy guideline as a stand-alone extension of the PRISMA statement. Modifications to the PRISMA statement reflect the specific requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies and the abstracts for these reviews. Established standards from the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network were followed for the development of the guideline. The original PRISMA statement was used as a framework on which to modify and add items. A group of 24 multidisciplinary experts used a systematic review of articles on existing reporting guidelines and methods, a 3-round Delphi process, a consensus meeting, pilot testing, and iterative refinement to develop the PRISMA diagnostic test accuracy guideline. The final version of the PRISMA diagnostic test accuracy guideline checklist was approved by the group. The systematic review (produced 64 items) and the Delphi process (provided feedback on 7 proposed items; 1 item was later split into 2 items) identified 71 potentially relevant items for consideration. The Delphi process reduced these to 60 items that were discussed at the consensus meeting. Following the meeting, pilot testing and iterative feedback were used to generate the 27-item PRISMA diagnostic test accuracy checklist. To reflect specific or optimal contemporary systematic review methods for diagnostic test accuracy, 8 of the 27 original PRISMA items were left unchanged, 17 were modified, 2 were added, and 2 were omitted. 
    The 27-item...

  7. Explosive Classification Testing of Experimental Colored Smoke Compositions and End Items

    Science.gov (United States)

    1976-10-01

    ...inch) long lead cylinder ... perpendicular to and in contact with the top surface of the sample. A ... cm (2-inch) wood cylinder with a hole... sawdust which was ... with an electrically initiated match head igniter. This test was conducted a minimum of two times. The ignition and unconfined-burning...tested were covered with additional combustible material sufficient to sustain a hot fire. The entire mass was then saturated with approximately 50

  8. Algorithms for the Construction of Parallel Tests by Zero-One Programming. Project Psychometric Aspects of Item Banking No. 7. Research Report 86-7.

    Science.gov (United States)

    Boekkooi-Timminga, Ellen

    Nine methods for automated test construction are described. All are based on the concepts of information from item response theory. Two general kinds of methods for the construction of parallel tests are presented: (1) sequential test design; and (2) simultaneous test design. Sequential design implies that the tests are constructed one after the…
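The sequential design idea can be illustrated with a small sketch. It is not one of the nine methods of the report: exhaustive search over item subsets stands in for a real zero-one programming solver, the 2PL information function is an assumed choice, and the item pool, target value and function names are all hypothetical. Each test is the fixed-length subset of the remaining items whose information at a chosen ability is closest to a target, and chosen items are removed before the next test is built.

```python
import math
from itertools import combinations

def info_2pl(theta, a, b):
    """Fisher information of a two-parameter logistic item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def form_information(pool, item_ids, theta):
    """Information of a set of items at theta (item informations add up)."""
    return sum(info_2pl(theta, *pool[i]) for i in item_ids)

def closest_to_target(pool, available, length, theta, target):
    """Enumerate every 0/1 selection of `length` items and keep the best one.

    This brute force replaces the zero-one programming solver and is only
    feasible for very small pools.
    """
    candidates = combinations(sorted(available), length)
    return min(candidates,
               key=lambda ids: abs(form_information(pool, ids, theta) - target))

def sequential_design(pool, n_tests, length, theta, target):
    """Build tests one after another, removing chosen items from the pool."""
    available = set(pool)
    tests = []
    for _ in range(n_tests):
        chosen = closest_to_target(pool, available, length, theta, target)
        tests.append(chosen)
        available -= set(chosen)
    return tests

# Hypothetical pool: item id -> (discrimination a, difficulty b).
POOL = {1: (1.0, 0.0), 2: (1.0, 0.0), 3: (0.5, 0.0), 4: (0.5, 0.0)}
TESTS = sequential_design(POOL, n_tests=2, length=2, theta=0.0, target=0.3125)

if __name__ == "__main__":
    print(TESTS)
```

In a sequential design the later tests can only draw on what earlier tests left behind, so their fit to the target may be worse. A simultaneous design would instead assign items to all tests in one optimization.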

  9. Improving measurement in health education and health behavior research using item response modeling: comparison with the classical test theory approach.

    Science.gov (United States)

    Wilson, Mark; Allen, Diane D; Li, Jun Corser

    2006-12-01

    This paper compares the approach and resultant outcomes of item response models (IRMs) and classical test theory (CTT). First, it reviews basic ideas of CTT, and compares them to the ideas about using IRMs introduced in an earlier paper. It then applies a comparison scheme based on the AERA/APA/NCME 'Standards for Educational and Psychological Tests' to compare the two approaches under three general headings: (i) choosing a model; (ii) evidence for reliability--incorporating reliability coefficients and measurement error--and (iii) evidence for validity--including evidence based on instrument content, response processes, internal structure, other variables and consequences. An example analysis of a self-efficacy (SE) scale for exercise is used to illustrate these comparisons. The investigation found that there were (i) aspects of the techniques and outcomes that were similar between the two approaches, (ii) aspects where the item response modeling approach contributes to instrument construction and evaluation beyond the classical approach and (iii) aspects of the analysis where the measurement models had little to do with the analysis or outcomes. There were no aspects where the classical approach contributed to instrument construction or evaluation beyond what could be done with the IRM approach. Finally, properties of the SE scale are summarized and recommendations made.

  10. Language Games and Meaning as Used in Student Encounters with Scientific Literacy Test Items

    Science.gov (United States)

    Serder, Margareta; Jakobsson, Anders

    2016-01-01

    Previous research in science education has suggested that difficulties among students learning science relate to challenges in framing its discourse. This article examines the role that language plays in a scientific literacy test for which everyday life is an augmented aspect. Video-recorded data was collected in four ninth-grade science classes…

  11. Psychometric evaluation of the EORTC computerized adaptive test (CAT) fatigue item pool

    DEFF Research Database (Denmark)

    Petersen, Morten Aa; Giesinger, Johannes M; Holzner, Bernhard

    2013-01-01

    Fatigue is one of the most common symptoms associated with cancer and its treatment. To obtain a more precise and flexible measure of fatigue, the EORTC Quality of Life Group has developed a computerized adaptive test (CAT) measure of fatigue. This is part of an ongoing project developing a CAT v...

  12. The 40-item Monell Extended Sniffin' Sticks Identification Test (MONEX-40)

    NARCIS (Netherlands)

    Freiherr, J.; Gordon, A.R.; Alden, E.C.; Ponting, A.L.; Hernandez, M.; Boesveldt, S.; Lundstrom, J.N.

    2012-01-01

    Background: Most existing olfactory identification (ID) tests have the primary aim of diagnosing clinical olfactory dysfunction, thereby rendering them sub-optimal for experimental settings where the aim is to detect differences in healthy subjects’ odor ID abilities. Materials and methods: We have

  13. Explanatory item response modeling of children's change on a dynamic test of analogical reasoning

    NARCIS (Netherlands)

    Stevenson, C.E.; Hickendorff, M.; Resing, W.C.M.; Heiser, W.J.; de Boeck, P.A.L.

    Dynamic testing is an assessment method in which training is incorporated into the procedure with the aim of gauging cognitive potential. Large individual differences are present in children's ability to profit from training in analogical reasoning. The aim of this experiment was to investigate

  14. Evaluation of the box and blocks test, stereognosis and item banks of activity and upper extremity function in youths with brachial plexus birth palsy.

    Science.gov (United States)

    Mulcahey, Mary Jane; Kozin, Scott; Merenda, Lisa; Gaughan, John; Tian, Feng; Gogola, Gloria; James, Michelle A; Ni, Pengsheng

    2012-09-01

    One of the greatest limitations to measuring outcomes in pediatric orthopaedics is the lack of effective instruments. Computer adaptive testing, which uses large item banks, selects only items that are relevant to a child's function based on a previous response and filters items that are too easy or too hard or simply not relevant to the child. In this way, computer adaptive testing provides for a meaningful, efficient, and precise method to evaluate patient-reported outcomes. Banks of items that assess activity and upper extremity (UE) function have been developed for children with cerebral palsy and have enabled computer adaptive tests that showed strong reliability, strong validity, and broader content range when compared with traditional instruments. Because of the void in instruments for children with brachial plexus birth palsy (BPBP) and the importance of having an UE and activity scale, we were interested in how well these items worked in this population. A cross-sectional, multicenter study involving 200 children with BPBP was conducted. The box and block test (BBT) and Stereognosis tests were administered and patient reports of UE function and activity were obtained with the cerebral palsy item banks. Differential item functioning (DIF) was examined. Predictive ability of the BBT and stereognosis was evaluated with a proportional odds logistic regression model. Spearman correlation coefficients (rs) were calculated to examine correlation between stereognosis and the BBT and between individual stereognosis items and the total stereognosis score. Six of the 86 items showed DIF, indicating that the activity and UE item banks may be useful for computer adaptive tests for children with BPBP. The penny and the button were the strongest predictors of impairment level (odds ratio=0.34 to 0.40). There was a good positive relationship between total stereognosis and BBT scores (rs=0.60). The BBT had a good negative (rs=-0.55) and good positive (rs=0.55) relationship with
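
    The item-selection idea described in this abstract can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it assumes a Rasch-type item bank in which each item has a single difficulty value, and it uses the fact that under a Rasch model an item is most informative when its difficulty equals the respondent's ability. The item names, difficulty values, and the function `pick_next_item` are hypothetical and are not taken from the study.

```python
def pick_next_item(ability, item_bank, administered):
    """Return the name of the unused item whose difficulty is closest to ability.

    Assumption: a Rasch-type bank, where the closest difficulty is also the
    most informative item. Items far from the ability estimate (too easy or
    too hard) are never chosen while a closer item remains.
    """
    candidates = {
        name: difficulty
        for name, difficulty in item_bank.items()
        if name not in administered
    }
    return min(candidates, key=lambda name: abs(candidates[name] - ability))


# Hypothetical three-item bank (difficulties on a logit scale).
item_bank = {"easy": -2.0, "medium": 0.0, "hard": 2.0}

first_item = pick_next_item(0.0, item_bank, administered=set())
# After the first response the ability estimate is assumed to have moved to 0.3.
second_item = pick_next_item(0.3, item_bank, administered={first_item})
```

    A real adaptive test would also re-estimate the ability after each response and apply a stopping rule; both steps are omitted here.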

  15. Using Automated Processes to Generate Test Items And Their Associated Solutions and Rationales to Support Formative Feedback

    Directory of Open Access Journals (Sweden)

    Mark Gierl

    2015-08-01

    Automatic item generation is the process of using item models to produce assessment tasks using computer technology. An item model is similar to a template that highlights the elements in the task that must be manipulated to produce new items. The purpose of our study is to describe an innovative method for generating large numbers of diverse and heterogeneous items along with their solutions and associated rationales to support formative feedback. We demonstrate the method by generating items in two diverse content areas, mathematics and nonverbal reasoning

  16. Sex Differences on the Mental Rotation Test: An Analysis of Item Types

    Science.gov (United States)

    Bors, Douglas A.; Vigneau, Francois

    2011-01-01

    Replicating a finding now common in the literature, the present study revealed a significant difference between the performance of men (M = 19.66; SD = 5.34; SK = 0.52) and the performance of women (M = 14.85; SD = 6.06; SK = -0.38, Cohen's d = 0.90) on the Mental Rotation Test (Vandenberg & Kuse, 1978). In an attempt to identify determinants of…

  17. BRCA Testing by Single-Molecule Molecular Inversion Probes

    NARCIS (Netherlands)

    Neveling, K.; Mensenkamp, A.R.; Derks, R; Kwint, M.P.; Ouchene, H.; Steehouwer, M.; Lier, L.A. van; Bosgoed, E.A.J.; Rikken, A.; Tychon, M.W.J.; Zafeiropoulou, D.; Castelein, S.; Hehir-Kwa, J.Y.; Thung, G.W.; Hofste, T.; Lelieveld, S.H.; Bertens, S.M.; Adan, I.B.; Eijkelenboom, A.; Tops, B.B.J.; Yntema, H.G.; Stokowy, T.; Knappskog, P.M.; Hoberg-Vetti, H.; Steen, V.M.; Boyle, E.; Martin, B.; Ligtenberg, M.J.L.; Shendure, J.; Nelen, M.R.; Hoischen, A.

    2017-01-01

    BACKGROUND: Despite advances in next generation DNA sequencing (NGS), NGS-based single gene tests for diagnostic purposes require improvements in terms of completeness, quality, speed, and cost. Single-molecule molecular inversion probes (smMIPs) are a technology with unrealized potential in the

  18. Single-shell tank riser resistance to ground test plan

    International Nuclear Information System (INIS)

    Kiewert, L.R.

    1996-01-01

    This Test Procedure provides the general directions for conducting Single-Shell Tank Riser to Earth Measurements which will be used by engineering as a step towards providing closure for the Lightning Hazard Issue

  19. The Effect of Trier Social Stress Test (TSST) on Item and Associative Recognition of Words and Pictures in Healthy Participants

    Science.gov (United States)

    Guez, Jonathan; Saar-Ashkenazy, Rotem; Keha, Eldad; Tiferet-Dweck, Chen

    2016-01-01

    Psychological stress, induced by the Trier Social Stress Test (TSST), has repeatedly been shown to alter memory performance. Although factors influencing memory performance such as stimulus nature (verbal/pictorial) and emotional valence have been extensively studied, results whether stress impairs or improves memory are still inconsistent. This study aimed at exploring the effect of TSST on item versus associative memory for neutral, verbal, and pictorial stimuli. 48 healthy subjects were recruited, 24 participants were randomly assigned to the TSST group and the remaining 24 participants were assigned to the control group. Stress reactivity was measured by psychological (subjective state anxiety ratings) and physiological (Galvanic skin response recording) measurements. Subjects performed an item-association memory task for both stimulus types (words, pictures) simultaneously, before, and after the stress/non-stress manipulation. The results showed that memory recognition for pictorial stimuli was higher than for verbal stimuli. Memory for both words and pictures was impaired following TSST; while the source for this impairment was specific to associative recognition in pictures, a more general deficit was observed for verbal material, as expressed in decreased recognition for both items and associations following TSST. Response latency analysis indicated that the TSST manipulation decreased response time but at the cost of memory accuracy. We conclude that stress does not uniformly affect memory; rather it interacts with the task’s cognitive load and stimulus type. Applying the current study results to patients diagnosed with disorders associated with traumatic stress, our findings in healthy subjects under acute stress provide further support for our assertion that patients’ impaired memory originates in poor recollection processing following depletion of attentional resources. PMID:27148117

  20. The effect of Trier Social Stress Test (TSST) on item and associative recognition of words and pictures in healthy participants

    Directory of Open Access Journals (Sweden)

    Jonathan Guez

    2016-04-01

    Psychological stress, induced by the Trier Social Stress Test (TSST), has repeatedly been shown to alter memory performance. Although factors influencing memory performance such as stimulus nature (verbal/pictorial) and emotional valence have been extensively studied, results whether stress impairs or improves memory are still inconsistent. This study aimed at exploring the effect of TSST on item versus associative memory for neutral, verbal, and pictorial stimuli. 48 healthy subjects were recruited, 24 participants were randomly assigned to the TSST group and the remaining 24 participants were assigned to the control group. Stress reactivity was measured by psychological (subjective state anxiety ratings) and physiological (Galvanic skin response recording) measurements. Subjects performed an item-association memory task for both stimulus types (words, pictures) simultaneously, before, and after the stress/non-stress manipulation. The results showed that memory recognition for pictorial stimuli was higher than for verbal stimuli. Memory for both words and pictures was impaired following TSST; while the source for this impairment was specific to associative recognition in pictures, a more general deficit was observed for verbal material, as expressed in decreased recognition for both items and associations following TSST. Response latency analysis indicated that the TSST manipulation decreased response time but at the cost of memory accuracy. We conclude that stress does not uniformly affect memory; rather it interacts with the task’s cognitive load and stimulus type. Applying the current study results to patients diagnosed with disorders associated with traumatic stress, our findings in healthy subjects under acute stress provide further support for our assertion that patients’ impaired memory originates in poor recollection processing following depletion of attentional resources.

  1. Passive ultra high frequency radio frequency identification systems for single-item identification in food supply chains

    Directory of Open Access Journals (Sweden)

    Paolo Barge

    2017-02-01

    In the food industry, composition, size, and shape of items are much less regular than in other commodities sectors. In addition, a wide variety of packaging, composed of different materials, is employed. As the material, size and shape of the items to which the tag should be attached strongly influence the minimum power required for tag functioning, performance improvements can be achieved only by selecting suitable radio frequency (RF) identifiers for the specific combination of food product and packaging. When dealing with logistics units, the dynamic reading of a vast number of tags could give rise to simultaneous broadcasting of signals (tag-to-tag collisions) that could affect reading rates and the overall reliability of the identification procedure. This paper reports the results of an analysis of the reading performance of ultra high frequency radio frequency identification systems for multiple static and dynamic electronic identification of food packed products in controlled conditions. Products were considered when arranged on a logistics pallet. The effects on reading rate of different factors, among which the product type, the gate configuration, the field polarisation, the power output of the RF reader, the interrogation protocol configuration as well as the transit speed, the number of tags and their interactions, were statistically analysed and compared.

  2. A single hole tracer test to determine longitudinal dispersion

    International Nuclear Information System (INIS)

    Noy, D.J.; Holmes, D.C.

    1986-03-01

    The paper concerns a single hole tracer test to determine longitudinal dispersion, which is an important parameter in assessing the suitability of a site for radioactive waste disposal. The theory, equipment and procedure for measuring longitudinal dispersion in a single borehole is described. Results are presented for field trials conducted in an aquifer, where the technique produced good results. The measured value of longitudinal dispersion, from a single hole test, relates only to a limited volume of rock immediately adjacent to the borehole. (U.K.)

  3. 'Do you think you suffer from depression?' Reevaluating the use of a single item question for the screening of depression in older primary care patients

    DEFF Research Database (Denmark)

    Ayalon, Liat; Goldfracht, Margalit; Bech, Per

    2010-01-01

    to existing depression screening tools. METHODS: A cross sectional sample of 153 older primary care patients. Participants completed several depression-screening measures (e.g. a single depression screen, Patient Health Questionnaire-9, Major Depression Inventory, Visual Analogue Scale). Measures were...... evaluated against a depression diagnosis made by the Structured Clinical Interview for DSM-IV. RESULTS: Overall, 3.9% of the sample was diagnosed with depression. The most notable finding was that the single-item question, 'do you think you suffer from depression?' had as good or better sensitivity (83......: An easy way to detect depression in older primary care patients would be asking the single question, 'do you think you suffer from depression?'...

  4. Gender differential item functioning on a national field-specific test: The case of PhD entrance exam of TEFL in Iran

    Directory of Open Access Journals (Sweden)

    Alireza Ahmadi

    2016-01-01

    Differential Item Functioning (DIF) exists when examinees of equal ability from different groups have different probabilities of successful performance in a certain item. This study examined gender differential item functioning across the PhD Entrance Exam of TEFL (PEET) in Iran, using both logistic regression (LR) and one-parameter item response theory (1-p IRT) models. The PEET is a national test consisting of a centralized written examination designed to provide information on the eligibility of PhD applicants of TEFL to enter PhD programs. The 2013 administration of this test provided score data for a sample of 999 Iranian PhD applicants consisting of 397 males and 602 females. First, the data were subjected to DIF analysis through logistic regression (LR) model. Then, to triangulate the findings, a 1-p IRT procedure was applied. The results indicated (1) more items flagged for DIF by LR than by 1-p IRT, (2) DIF cancellation (the number of DIF items were equal for both males and females), as revealed through LR, (3) equal number of uniform and non-uniform DIF, as tracked via LR, and (4) female superiority in the test performance, as revealed via IRT analysis. Overall, the findings of the study indicated that PEET suffers from DIF. As such, test developers and policymakers (like NOET & MSRT) are recommended to take these findings into serious consideration and exercise care in fair test practice by dedicating effort to more unbiased test development and decision making.
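
    The definition of DIF given in this abstract (equal ability, different probability of success) can be made concrete with a small sketch. It assumes a one-parameter (Rasch) model in which the same item is allowed a different difficulty in each group; the function names and the numeric values are hypothetical and do not come from the PEET data, and the sketch does not reproduce the study's LR or IRT estimation procedures.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def uniform_dif_gap(ability, difficulty_reference, difficulty_focal):
    """Difference in success probability between two groups at equal ability.

    A non-zero gap at the same ability is what DIF means; with a constant
    difficulty shift, the sign of the gap is the same at every ability level
    (uniform DIF).
    """
    return (rasch_probability(ability, difficulty_reference)
            - rasch_probability(ability, difficulty_focal))


# Hypothetical item: 0.5 logits harder for the focal group than the reference group.
gap_with_dif = uniform_dif_gap(0.0, 0.0, 0.5)
# Hypothetical item with identical difficulty in both groups: no DIF.
gap_without_dif = uniform_dif_gap(1.0, 0.2, 0.2)
```

    In practice the group-specific difficulties are not known in advance; they would be estimated from response data and the difference tested for significance.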

  5. An Item Analysis of the French Version of the Test for Reception of Grammar Among Children and Adolescents With Down Syndrome or Intellectual Disability of Undifferentiated Etiology.

    Science.gov (United States)

    Facon, Bruno; Magis, David

    2016-10-01

    An item analysis of Bishop's (1983) Test for Reception of Grammar (TROG) in its French version (F-TROG; Lecocq, 1996) was conducted to determine whether the difficulty of items is similar for participants with or without intellectual disability (ID). In Study 1, responses to the 92 F-TROG items by 55 participants with Down syndrome (DS), 55 with ID of undifferentiated etiology (UND), and 55 typical children (TYP) matched on their F-TROG total score were compared using the transformed item difficulties method, a statistical approach designed to detect differential item functioning (DIF) between groups. In Study 2, an additional comparison involving 526 TYP participants and 526 participants with UND was conducted to increase the statistical power of the analysis. The difficulty of items was highly similar whatever the sample size or clinical status of participants. Fewer than 3.5% of the items were flagged as showing DIF. Tests such as the TROG can be used with confidence in clinical practice as well as in research studies comparing participants with or without ID. Methods designed for investigating potential internal test bias-such as done here-should be more regularly employed in the developmental disability field to affirm the absence of DIF.

  6. Value in Single-level Lumbar Discectomy: Surgical Disposable Item Cost and Relationship to Patient-reported Outcomes.

    Science.gov (United States)

    Rosenbaum, Benjamin P; Modic, Michael T; Krishnaney, Ajit A

    2017-11-01

    This is a retrospective study. Compare improvements in health status measures (HSMs) and surgical costs to determine whether use of more costly items has any relationship to clinical outcome and value in lumbar disc surgery. Association between cost, outcomes, and value in spine surgery, including lumbar discectomy is poorly understood. Outcomes were calculated as difference in mean HSM scores between preoperative and postoperative timeframes. Prospective validated patient-reported HSMs studied were EuroQol quality of life index score (EQ-5D), Pain Disability Questionnaire (PDQ), and Patient Health Questionnaire (PHQ-9). Surgical costs consisted of disposable items and implants used in operating room. We retrospectively identified all adult patients at Cleveland Clinic main campus between October 2009 and August 2013 who underwent lumbar discectomy (652) using administrative billing data, Current Procedural Terminology (CPT) code 63030. HSMs were obtained from Cleveland Clinic Knowledge Program Data Registry. In total, 67% of operations performed in the outpatient or ambulatory setting, 33% in the inpatient setting. Among 9 surgeons who performed >10 lumbar discectomies, there were 72.4 operations per surgeon, on average. Mean surgical costs of each surgeon differed (Pcosts (Pcosts (P=0.76, 0.07, 0.76, respectively). In multivariable regression, only surgical cost was significantly correlated to mean difference in PDQ (P=0.030). More costly surgeries resulted in worse PDQ outcomes. Mean surgical costs varied statistically among 9 surgeons; costs were not shown to be positively correlated with patient outcomes. Performing an operation using more costly disposable supplies/implants does not seem to improve patient outcomes and should be considered when constructing preference cards and during an operation.

  7. Three Statistical Testing Procedures in Logistic Regression: Their Performance in Differential Item Functioning (DIF) Investigation. Research Report. ETS RR-09-35

    Science.gov (United States)

    Paek, Insu

    2009-01-01

    Three statistical testing procedures well-known in the maximum likelihood approach are the Wald, likelihood ratio (LR), and score tests. Although well-known, the application of these three testing procedures in the logistic regression method to investigate differential item function (DIF) has not been rigorously made yet. Employing a variety of…

  8. easyCBM CCSS Math Item Scaling and Test Form Revision (2012-2013): Grades 6-8. Technical Report #1313

    Science.gov (United States)

    Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

    2012-01-01

    The purpose of this technical report is to document the piloting and scaling of new easyCBM mathematics test items aligned with the Common Core State Standards (CCSS) and to describe the process used to revise and supplement the 2012 research version easyCBM CCSS math tests in Grades 6-8. For all operational 2012 research version test forms (10…

  9. Changes in Word Usage Frequency May Hamper Intergenerational Comparisons of Vocabulary Skills: An Ngram Analysis of Wordsum, WAIS, and WISC Test Items

    Science.gov (United States)

    Roivainen, Eka

    2014-01-01

    Research on secular trends in mean intelligence test scores shows smaller gains in vocabulary skills than in nonverbal reasoning. One possible explanation is that vocabulary test items become outdated faster compared to nonverbal tasks. The history of the usage frequency of the words on five popular vocabulary tests, the GSS Wordsum, Wechsler…

  10. 49 CFR 232.305 - Single car air brake tests.

    Science.gov (United States)

    2010-10-01

    ... from a train or when placed on a shop or repair track, as defined in § 232.303(a); (2) A car is on a shop or repair track, as defined in § 232.303(a), for any reason and has not received a single car air... 49 Transportation 4 2010-10-01 2010-10-01 false Single car air brake tests. 232.305 Section 232...

  11. My mind is as clear as it used to be: A pilot study illustrating the difficulties of employing a single-item subjective screen to detect cognitive impairment in outpatients with cancer.

    Science.gov (United States)

    Kibiger, Gail; Kirsh, Kenneth L; Wall, Jacqueline R; Passik, Steven D

    2003-08-01

    Oncology patients often complain that their "mind does not seem to be clear." This subjective perception, sometimes referred to as "chemo brain," may be due to situational stressors, psychological disorders, organic factors, or effects of neurotoxic medications. Cognitive decline can not only diminish quality of life but also interfere with a patient's ability to make decisions regarding complex treatment issues. The current study investigated the utility of using item 11 of the Zung Self-Rating Depression Screen (ZSDS) as a cognitive screen. A sample of 61 ambulatory cancer patients completed this study. Participants were recruited from four sites of Community Cancer Care, Inc., in Indiana. A battery of cognitive instruments and psychosocial inventories was administered in a standardized order. The sample had a mean age of 58.6 years and comprised 57.4% (n=35) women and 42.6% (n=26) men. Item 11 of the ZSDS was not significantly correlated to the cognitive measures. Correlates of the perception of cognitive impairment were the Dementia Rating Scale (DRS) Attention Scale (r=-0.26, PStroop test (F=19.8, Pspecificity indicated that the single-item screen used in this study is not an accurate means for identifying oncology patients with actual cognitive impairment. We conclude that while the perception of cognitive impairment is common in cancer patients, there may be problems in interpreting the nature of these complaints, particularly in separating them from depressive preoccupation.

  12. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…

  13. Hybrid Testing of Composite Structures with Single-Axis Control

    DEFF Research Database (Denmark)

    Waldbjørn, Jacob Paamand; Høgh, Jacob Herold; Stang, Henrik

    2013-01-01

    Hybrid testing is a substructuring technique where a structure is emulated by modelling a part of it in a numerical model while testing the remainder experimentally. Previous research in hybrid testing has been performed on multi-component structures e.g. damping fixtures, however in this paper...... a hybrid testing platform is introduced for single-component hybrid testing. In this case, the boundary between the numerical model and experimental setup is defined by multiple Degrees-Of-Freedoms (DOFs) which highly complicate the transferring of response between the two substructures. Digital Image...... Correlation (DIC) is therefore implemented for displacement control of the experimental setup. The hybrid testing setup was verified on a multicomponent structure consisting of a beam loaded in three point bending and a numerical structure of a frame. Furthermore, the stability of the hybrid testing loop...

  14. Development and Application of Methods for Estimating Operating Characteristics of Discrete Test Item Responses without Assuming any Mathematical Form.

    Science.gov (United States)

    Samejima, Fumiko

    In latent trait theory the latent space, or space of the hypothetical construct, is usually represented by some unidimensional or multi-dimensional continuum of real numbers. Like the latent space, the item response can either be treated as a discrete variable or as a continuous variable. Latent trait theory relates the item response to the latent…

  15. European accelerator facilities for single event effects testing

    Energy Technology Data Exchange (ETDEWEB)

    Adams, L.; Nickson, R.; Harboe-Sorensen, R. [ESA-ESTEC, Noordwijk (Netherlands); Hajdas, W.; Berger, G.

    1997-03-01

    Single event effects are an important hazard to spacecraft and payloads. The advances in component technology, with shrinking dimensions and increasing complexity will give even more importance to single event effects in the future. The ground test facilities are complex and expensive and the complexities of installing a facility are compounded by the requirement that maximum control is to be exercised by users largely unfamiliar with accelerator technology. The PIF and the HIF are the result of experience gained in the field of single event effects testing and represent a unique collaboration between space technology and accelerator experts. Both facilities form an essential part of the European infrastructure supporting space projects. (J.P.N.)

  16. Model tests on single piles in soft clay

    Energy Technology Data Exchange (ETDEWEB)

    Pan, J.L. [Durham Univ., Durham, (United Kingdom). School of Engineering; Goh, A.T.C.; Wong, K.S.; Teh, C.I. [Nanyang Technological Univ., (Singapore). Geotechnical Research Centre

    2000-08-04

    The behaviour of single stainless steel piles subjected to lateral soft clay soil movement was investigated in laboratory model tests in an effort to determine the ultimate soil pressure acting along the pile shaft. A custom designed apparatus was manufactured and calibrated for the test which measured the limiting soil pressures acting along the model pile shaft. The ultimate soil pressure was determined based on the maximum value of this measurement. The results show that the ultimate soil pressure for single passive piles was about 10 times the undrained shear strength, and the magnitude of the soil translation needed to fully mobilize the ultimate soil pressure on the single passive piles was about half the pile width. Further experimental study is needed to examine the effects of the pile end fixity, flexibility and shape and to confirm the effects of sample size and the disturbance due to soil sample preparation. 17 refs., 10 figs.

  17. Limited-information goodness-of-fit testing of item response theory models for sparse 2^P tables.

    Science.gov (United States)

    Cai, Li; Maydeu-Olivares, Albert; Coffman, Donna L; Thissen, David

    2006-05-01

    Bartholomew and Leung proposed a limited-information goodness-of-fit test statistic (Y) for models fitted to sparse 2^P contingency tables. The null distribution of Y was approximated using a chi-squared distribution by matching moments. The moments were derived under the assumption that the model parameters were known in advance and it was conjectured that the approximation would also be appropriate when the parameters were to be estimated. Using maximum likelihood estimation of the two-parameter logistic item response theory model, we show that the effect of parameter estimation on the distribution of Y is too large to be ignored. Consequently, we derive the asymptotic moments of Y for maximum likelihood estimation. We show using a simulation study that when the null distribution of Y is approximated using moments that take into account the effect of estimation, Y becomes a very useful statistic to assess the overall goodness of fit of models fitted to sparse 2^P tables.
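
    Why such tables are sparse follows from simple counting: P binary items produce 2^P possible response patterns, so the number of cells quickly exceeds any realistic sample size. The sketch below only illustrates that arithmetic; the function names, the sample size, and the number of items are hypothetical, and it does not compute the Y statistic.

```python
def pattern_count(n_items):
    """Number of cells in the contingency table of n_items binary items."""
    return 2 ** n_items


def mean_count_per_cell(n_respondents, n_items):
    """Average number of respondents per cell if responses were spread evenly."""
    return n_respondents / pattern_count(n_items)


# Hypothetical example: 1000 respondents answering 20 binary items.
cells = pattern_count(20)
average = mean_count_per_cell(1000, 20)
```

    With far fewer than one respondent per cell on average, full-table chi-squared statistics are unreliable, which is the motivation for limited-information statistics built from low-order margins.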

  18. The accuracy of the Life Orientation Test-Revised (LOT-R) in measuring dispositional optimism: evidence from item response theory analyses.

    Science.gov (United States)

    Chiesi, Francesca; Galli, Silvia; Primi, Caterina; Innocenti Borgi, Paolo; Bonacchi, Andrea

    2013-01-01

    The accuracy of the Life Orientation Test-Revised (LOT-R) in measuring dispositional optimism was investigated applying item response theory (IRT). The study was conducted on a sample of 484 university students (62% males, M age = 22.79 years, SD = 5.63). After testing the 1-factor structure of the scale, IRT was applied to evaluate the functioning of the LOT-R along the pessimism-optimism continuum. Item parameter estimates and the test information function showed that each item and the global scale satisfactorily measured the latent trait. Referring to the IRT estimated trait levels, the validity of the LOT-R was studied examining the relationships between dispositional optimism and psychological well-being, sense of mastery, and sense of coherence. Overall findings based on IRT analyses provide evidence of the accuracy of the LOT-R and suggest possible modifications of the scale to improve the assessment of dispositional optimism.

  19. Developmental Validation of the ParaDNA® Screening System - A presumptive test for the detection of DNA on forensic evidence items.

    Science.gov (United States)

    Dawnay, Nick; Stafford-Allen, Beccy; Moore, Dave; Blackman, Stephen; Rendell, Paul; Hanson, Erin K; Ballantyne, Jack; Kallifatidis, Beatrice; Mendel, Julian; Mills, DeEtta K; Nagy, Randy; Wells, Simon

    2014-07-01

    Current assessment of whether a forensic evidence item should be submitted for STR profiling is largely based on the personal experience of the Crime Scene Investigator (CSI) and the submissions policy of the law enforcement authority involved. While there are chemical tests that can infer the presence of DNA through the detection of biological stains, the process remains mostly subjective and leads to many samples being submitted that give no profile or not being submitted although DNA is present. The ParaDNA® Screening System was developed to address this issue. It consists of a sampling device, pre-loaded reaction plates and detection instrument. The test uses direct PCR with fluorescent HyBeacon™ detection of PCR amplicons to identify the presence and relative amount of DNA on an evidence item and also provides a gender identification result in approximately 75 minutes. This simple-to-use design allows objective data to be acquired by both DNA analyst and non-specialist personnel, to enable a more informed submission decision to be made. The developmental validation study described here tested the sensitivity, reproducibility, accuracy, inhibitor tolerance, and performance of the ParaDNA Screening System on a range of mock evidence items. The data collected demonstrates that the ParaDNA Screening System identifies the presence of DNA on a variety of evidence items including blood, saliva and touch DNA items. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  20. Analysis of Item Difficulty Parameters on Item Characteristic Curves ...

    African Journals Online (AJOL)

    Analysis of Item Difficulty Parameters on Item Characteristic Curves as A Function of Changes in WAEC and NECO Examination Instruments and Students Ability Parameters in Mathematics Objective Test in Cross River State, Nigeria.

  1. Bipolar Items

    Directory of Open Access Journals (Sweden)

    Nishiguchi Sumiyo

    2016-12-01

    This article asserts that the Japanese wide-scope mo ‘even’ in simple sentences are bipolar items (BPIs), antilicensed or forbidden by negation and licensed in a non-monotonic (NM) environment. BPIs share the features of negative polarity items (NPIs) as well as positive polarity items (PPIs). The Dutch ooit ‘ever’, the Serbo-Croatian i-series ‘and/even’, and the Hungarian is-series ‘and/even’ are antilicensed by clausemate negation and licensed by extraclausal negation (van der Wouden, 1997; Progovac, 1994; Szabolcsi, 2002) or non-monotonic negative (and positive, for Serbo-Croatian) emotive predicates. Adding an NPI rescues BPIs in uncomfortable clausemate negation.

  2. Earthquake acceleration amplification based on single microtremor test

    Science.gov (United States)

    Jaya Syahbana, Arifan; Kurniawan, Rahmat; Soebowo, Eko

    2018-02-01

    Understanding soil dynamics is needed to understand soil behaviour, including the parameters of earthquake acceleration amplification. Many researchers now conduct single microtremor tests to obtain the velocity amplification and natural periods of soil at test sites. However, these amplification parameters are rarely used, so a method is needed to convert the velocity amplification to acceleration amplification. This paper discusses a proposed process for that conversion. The proposed method first integrates the time histories of the synthetic earthquake acceleration of the soil surface under the deaggregation at that location, so that the earthquake velocity time histories are obtained. Next, a “fitting curve” is constructed between the amplification from a single microtremor test and the amplification of the synthetic earthquake velocity time histories. The fitted velocity time histories are then differentiated to obtain fitted acceleration time histories. The final step is to compare the acceleration of the “fitting curve” against the acceleration time histories of the synthetic earthquake at bedrock to obtain the single microtremor acceleration amplification factor.
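
    The integrate-then-differentiate steps described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which numerical scheme was used, so trapezoidal integration and finite differences are swapped in here, and the function names and the constant-acceleration demo record are hypothetical.

```python
def integrate_trapezoid(acc, dt, v0=0.0):
    """Integrate an acceleration time history to velocity (trapezoidal rule)."""
    vel = [v0]
    for a_prev, a_next in zip(acc, acc[1:]):
        vel.append(vel[-1] + 0.5 * (a_prev + a_next) * dt)
    return vel


def differentiate(vel, dt):
    """Differentiate a velocity time history (central differences,
    one-sided at the two end points). Needs at least two samples."""
    n = len(vel)
    acc = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        acc.append((vel[hi] - vel[lo]) / ((hi - lo) * dt))
    return acc


# Hypothetical record: constant acceleration sampled every 0.1 s.
demo_acc = [2.0] * 5
demo_vel = integrate_trapezoid(demo_acc, dt=0.1)
demo_acc_back = differentiate(demo_vel, dt=0.1)
```

    For a real record the round trip is only approximate, and baseline correction of the acceleration trace before integration would normally be needed; that step is omitted here.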

  3. The Italian version of the 16-item prodromal questionnaire (iPQ-16): Field-test and psychometric features.

    Science.gov (United States)

    Lorenzo, Pelizza; Silvia, Azzali; Federica, Paterlini; Sara, Garlassi; Ilaria, Scazza; Pupo, Simona; Andrea, Raballo

    2018-03-20

    Among current early screeners for psychosis-risk states, the Prodromal Questionnaire-16 items (PQ-16) is often used. We aimed to assess validity and reliability of the Italian version of the PQ-16 in a young adult help-seeking population. We included 154 individuals aged 18-35 years seeking help at the Reggio Emilia outpatient mental health services in a large semirural catchment area (550,000 inhabitants). Participants completed the Italian version of the PQ-16 (iPQ-16) and were subsequently evaluated with the Comprehensive Assessment of At-Risk Mental States (CAARMS). We examined diagnostic accuracy (i.e. specificity, sensitivity, negative and positive likelihood ratios, and negative and positive predictive values) and content, convergent, and concurrent validity between PQ-16 and CAARMS using Cronbach's alpha, Spearman's rho, and Cohen's kappa, respectively. We also tested the validity of the adopted PQ-16 cut-offs through Receiver Operating Characteristic (ROC) curves plotted against CAARMS diagnoses and the 1-year predictive validity of the PQ-16. The iPQ-16 showed high internal consistency and acceptable diagnostic accuracy and concurrent validity. ROC analyses pointed to a cut-off score of ≥5 as best cut-off. After 12 months of follow-up, 8.7% of participants with a PQ-16 symptom total score of ≥5 who were below the CAARMS psychosis threshold at the baseline developed a psychotic disorder. Psychometric properties of the iPQ-16 were satisfactory. Copyright © 2018. Published by Elsevier B.V.
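
    The diagnostic accuracy measures listed above all follow from a 2x2 table of screener result against reference diagnosis. A minimal sketch using the standard textbook definitions is given below; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Screening accuracy metrics from a 2x2 table of counts.

    tp/fn: reference-positive cases flagged / missed by the screener.
    fp/tn: reference-negative cases flagged / cleared by the screener.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_accuracy(tp=40, fp=20, fn=10, tn=80)
```

    Sweeping the cut-off score and recomputing sensitivity and specificity at each value gives the points of a ROC curve.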

  4. CUSUM Statistics for Large Item Banks: Computation of Standard Errors. Law School Admission Council Computerized Testing Report. LSAC Research Report Series.

    Science.gov (United States)

    Glas, C. A. W.

    A previous study by C. Glas (1998) examined how to evaluate whether adaptive testing data used for online calibration sufficiently fit the item response model used. Three approaches were suggested, based on a Lagrange multiplier (LM) statistic, a Wald statistic, and a cumulative sum (CUSUM) statistic, respectively. For all these methods,…

  5. Measuring Student Involvement: A Comparison of Classical Test Theory and Item Response Theory in the Construction of Scales from Student Surveys

    Science.gov (United States)

    Sharkness, Jessica; DeAngelo, Linda

    2011-01-01

    This study compares the psychometric utility of Classical Test Theory (CTT) and Item Response Theory (IRT) for scale construction with data from higher education student surveys. Using 2008 Your First College Year (YFCY) survey data from the Cooperative Institutional Research Program at the Higher Education Research Institute at UCLA, two scales…

  6. [Psychometric quality of the "Eating Attitudes Test" (German version EAT-26D) for measuring disordered eating in pre-adolescents and proposal for a 13-item short version].

    Science.gov (United States)

    Berger, Uwe; Hentrich, Isabel; Wick, Katharina; Bormann, Bianca; Brix, Christina; Sowa, Melanie; Schwartze, Dominique; Strauß, Bernhard

    2012-06-01

    To detect risky eating behavior, questionnaires should be economical while still fulfilling the psychometric quality criteria. Available instruments are too long for the target group (e.g. EDE-Q, 28 items), restricted to primary symptoms (short version of EDI, 23 items), or of minor reliability (e.g. SCOFF and WC-Scale, 5 items each). Using the German version of the Eating Attitudes Test (EAT-26D, which comprises 26 items) in a community sample of 1 331 girls and 906 boys aged 11-13 years from Thuringia, Germany, we measured an internal consistency of Cronbach's alpha = 0.85 for girls and 0.78 for boys. In a principal factor analysis, we could replicate the 6-factorial structure of previous studies. A confirmatory factor analysis verified the suitability of the EAT-26D for both girls and boys. Reducing the EAT-26D to the 3 core factors leads to an economical 13-item short version with an internal consistency of 0.87 for girls and 0.80 for boys. © Georg Thieme Verlag KG Stuttgart · New York.
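
    For readers unfamiliar with the internal consistency figures quoted above, Cronbach's alpha can be computed from item-level scores as sketched below. This uses the standard formula with sample variances; the function name and the small score matrix are hypothetical and unrelated to the EAT-26D data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items.
demo_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
demo_alpha = cronbach_alpha(demo_scores)
```

    Alpha approaches 1 when the items rise and fall together across respondents, which is why dropping weakly related items can raise it even as the scale gets shorter.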

  7. An Item Response Theory-Based, Computerized Adaptive Testing Version of the MacArthur-Bates Communicative Development Inventory

    DEFF Research Database (Denmark)

    Makransky, Guido; Dale, Philip; Havmose, Philip S.

    2016-01-01

    precision. Method: Parent-reported vocabulary for the American CDI:WS norming sample, consisting of 1461 children between the ages of 16 and 30 months, was used to investigate the fit of the items to the 2-parameter logistic (2-PL) IRT model, and to simulate CDI-CAT versions with 400, 200, 100, 50, 25, 10 and 5...
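
    As a reminder of what the 2-PL model and an adaptive test involve, the sketch below shows the 2-PL response function, its item information, and one common item-selection rule (pick the unused item with maximum information at the current ability estimate). The selection rule is an assumption for illustration; the abstract does not state which rule the CDI-CAT simulations used, and the item parameters are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """2-PL probability of endorsing an item with discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2-PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def pick_next_item(theta, items, used):
    """Index of the unused item with maximum information at theta."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank of (a, b) pairs.
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 2.0)]
first_pick = pick_next_item(0.0, bank, used=set())
second_pick = pick_next_item(0.0, bank, used={first_pick})
```

    In a full adaptive test the ability estimate would be updated after each response before the next item is picked; that estimation step is left out here.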

  8. Single specimen fracture toughness determination procedure using instrumented impact test

    International Nuclear Information System (INIS)

    Rintamaa, R.

    1993-04-01

    In the study a new single specimen test method and testing facility for evaluating dynamic fracture toughness has been developed. The method is based on the application of a new pendulum type instrumented impact tester equipped with an optical crack mouth opening displacement (COD) extensometer. The fracture toughness measurement technique uses the Double Displacement Ratio (DDR) method, which is based on the assumption that the specimen is deformed as two rigid arms that rotate around an apparent centre of rotation. This apparent centre moves as the crack grows, and the ratio of COD versus specimen displacement changes. As a consequence the onset of ductile crack initiation can be detected on the load-displacement curve. Thus, an energy-based fracture toughness can be calculated. In addition the testing apparatus can use specimens with double the ligament size of the standard Charpy specimen, which makes the impact testing more appropriate from the fracture mechanics point of view. The novel features of the testing facility and the feasibility of the new DDR method have been verified by performing an extensive experimental and analytical study. (99 refs., 91 figs., 27 tabs.)

  9. Evaluation of a single leg stance balance test in children.

    Science.gov (United States)

    Zumbrunn, Thomas; MacWilliams, Bruce A; Johnson, Barbara A

    2011-06-01

    Balance is a major determinant of gait. In high functioning individuals without significant vestibular or vision impairments, a ceiling effect may be present when using a double limb support protocol to assess balance function. For these populations, a single leg stance protocol may be more suitable. 47 typically developing (TD) subjects and 10 patients with CEV performed a single leg stance test on a force plate. The center of pressure (COP) was determined and several COP derived variables were calculated. Included measurements were: standard deviation, maximum excursion, area, average radial displacement, path velocity and frequency of the COP. Directional components of suitable variables were used to analyze anterior/posterior and medial/lateral contributions. Correlations with age of TD subjects indicated that all balance variables except frequency were significantly correlated. Most parameters were highly inter-correlated. Age adjusted COP balance variables also correlated to the Bruininks-Oseretsky balance subtest. Highest correlations were determined by the maximum excursion and velocity of the COP in the anterior/posterior direction. Statistical comparisons between the CEV group and a 4-6 TD group indicated significant differences between groups for most COP balance parameters. These results indicated that a single limb balance assessment may be a useful assessment for determining balance impairments in higher functioning children with orthopedic impairments. Copyright © 2011 Elsevier B.V. All rights reserved.
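
    Several of the COP-derived variables named above can be computed from a sampled trace as sketched below. The definitions are common ones and are an assumption here (excursion and radial displacement are measured from the mean COP position, and area and frequency are omitted); the abstract does not give the exact formulas used, and the function name and demo trace are hypothetical.

```python
import math
from statistics import mean, stdev


def cop_metrics(ap, ml, dt):
    """Summarise a centre-of-pressure trace sampled every dt seconds.

    ap, ml: anterior/posterior and medial/lateral COP coordinates.
    """
    ap_c = [v - mean(ap) for v in ap]
    ml_c = [v - mean(ml) for v in ml]
    # Distance of each sample from the mean COP position.
    radial = [math.hypot(a, m) for a, m in zip(ap_c, ml_c)]
    # Total length of the path traced by the COP.
    path = sum(math.hypot(a2 - a1, m2 - m1)
               for a1, a2, m1, m2 in zip(ap, ap[1:], ml, ml[1:]))
    duration = dt * (len(ap) - 1)
    return {
        "sd_ap": stdev(ap),
        "sd_ml": stdev(ml),
        "max_excursion": max(radial),
        "mean_radial": mean(radial),
        "path_velocity": path / duration,
        "ap_velocity": sum(abs(a2 - a1) for a1, a2 in zip(ap, ap[1:])) / duration,
    }


# Hypothetical trace: three sides of a unit square, sampled every 0.5 s.
demo = cop_metrics([0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0], dt=0.5)
```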

  10. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  11. Simple test system for single molecule recognition force microscopy

    International Nuclear Information System (INIS)

    Riener, Christian K.; Stroh, Cordula M.; Ebner, Andreas; Klampfl, Christian; Gall, Alex A.; Romanin, Christoph; Lyubchenko, Yuri L.; Hinterdorfer, Peter; Gruber, Hermann J.

    2003-01-01

    We have established an easy-to-use test system for detecting receptor-ligand interactions on the single molecule level using atomic force microscopy (AFM). For this, avidin-biotin, probably the best characterized receptor-ligand pair, was chosen. AFM sensors were prepared containing tethered biotin molecules at sufficiently low surface concentrations appropriate for single molecule studies. A biotin tether, consisting of a 6 nm poly(ethylene glycol) (PEG) chain and a functional succinimide group at the other end, was newly synthesized and covalently coupled to amine-functionalized AFM tips. In particular, PEG 800 diamine was glutarylated, the mono-adduct NH2-PEG-COOH was isolated by ion exchange chromatography and reacted with biotin succinimidylester to give biotin-PEG-COOH which was then activated as N-hydroxysuccinimide (NHS) ester to give the biotin-PEG-NHS conjugate which was coupled to the aminofunctionalized AFM tip. The motional freedom provided by PEG allows for free rotation of the biotin molecule on the AFM sensor and for specific binding to avidin which had been adsorbed to mica surfaces via electrostatic interactions. Specific avidin-biotin recognition events were discriminated from nonspecific tip-mica adhesion by their typical unbinding force (∼40 pN at 1.4 nN/s loading rate), unbinding length (<13 nm), the characteristic nonlinear force-distance relation of the PEG linker, and by specific blocking with an excess of free d-biotin. The convenience of the test system made it possible to evaluate and compare different methods and conditions of tip aminofunctionalization with respect to specific binding and nonspecific adhesion. It is concluded that this system is well suited as a calibration or start-up kit for single molecule recognition force microscopy.

  12. Test-retest reliability at the item level and total score level of the Norwegian version of the Spinal Cord Injury Falls Concern Scale (SCI-FCS).

    Science.gov (United States)

    Roaldsen, Kirsti Skavberg; Måøy, Åsa Blad; Jørgensen, Vivien; Stanghelle, Johan Kvalvik

    2016-05-01

    Translation of the Spinal Cord Injury Falls Concern Scale (SCI-FCS), and investigation of test-retest reliability on item-level and total-score-level. Translation, adaptation and test-retest study. A specialized rehabilitation setting in Norway. Fifty-four wheelchair users with a spinal cord injury. The median age of the cohort was 49 years, and the median number of years after injury was 13. Interventions/measurements: The SCI-FCS was translated and back-translated according to guidelines. Individuals answered the SCI-FCS twice over the course of one week. We investigated item-level test-retest reliability using Svensson's rank-based statistical method for disagreement analysis of paired ordinal data. For relative reliability, we analyzed the total-score-level test-retest reliability with intraclass correlation coefficients (ICC2.1), the standard error of measurement (SEM), and the smallest detectable change (SDC) for absolute reliability/measurement-error assessment and Cronbach's alpha for internal consistency. All items showed satisfactory percentage agreement (≥69%) between test and retest. There were small but non-negligible systematic disagreements among three items; we recovered an 11-13% higher chance for a lower second score. There was no disagreement due to random variance. The test-retest agreement (ICC2.1) was excellent (0.83). The SEM was 2.6 (12%), and the SDC was 7.1 (32%). The Cronbach's alpha was high (0.88). The Norwegian SCI-FCS is highly reliable for wheelchair users with chronic spinal cord injuries.
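
    The relationship between the reliability figures above can be illustrated with the formulas commonly used for them: SEM = SD × sqrt(1 − ICC) and SDC = 1.96 × sqrt(2) × SEM. The abstract does not state which formulas the authors applied, so this is an assumption, and the function names and the example standard deviation are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change at the 95% level for a test-retest difference."""
    return z * math.sqrt(2.0) * sem


# Hypothetical SD of 10 points with ICC = 0.84.
demo_sem = sem_from_icc(10.0, 0.84)
# Plugging in the SEM reported in the abstract.
sdc_from_reported_sem = smallest_detectable_change(2.6)
```

    With the rounded SEM of 2.6 this gives about 7.2, close to the reported SDC of 7.1; the small gap would be expected if the published SEM was rounded before reporting.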

  13. Effects of memantine on cognition in patients with moderate to severe Alzheimer's disease: post-hoc analyses of ADAS-cog and SIB total and single-item scores from six randomized, double-blind, placebo-controlled studies.

    Science.gov (United States)

    Mecocci, Patrizia; Bladström, Anna; Stender, Karina

    2009-05-01

    The post-hoc analyses reported here evaluate the specific effects of memantine treatment on ADAS-cog single-items or SIB subscales for patients with moderate to severe AD. Data from six multicentre, randomised, placebo-controlled, parallel-group, double-blind, 6-month studies were used as the basis for these post-hoc analyses. All patients with a Mini-Mental State Examination (MMSE) score of less than 20 were included. Analyses of patients with moderate AD (MMSE: 10-19), evaluated with the Alzheimer's disease Assessment Scale (ADAS-cog) and analyses of patients with moderate to severe AD (MMSE: 3-14), evaluated using the Severe Impairment Battery (SIB), were performed separately. The mean change from baseline showed a significant benefit of memantine treatment on both the ADAS-cog (p ADAS-cog single-item analyses showed significant benefits of memantine treatment, compared to placebo, for mean change from baseline for commands (p < 0.001), ideational praxis (p < 0.05), orientation (p < 0.01), comprehension (p < 0.05), and remembering test instructions (p < 0.05) for observed cases (OC). The SIB subscale analyses showed significant benefits of memantine, compared to placebo, for mean change from baseline for language (p < 0.05), memory (p < 0.05), orientation (p < 0.01), praxis (p < 0.001), and visuospatial ability (p < 0.01) for OC. Memantine shows significant benefits on overall cognitive abilities as well as on specific key cognitive domains for patients with moderate to severe AD. (c) 2009 John Wiley & Sons, Ltd.

  14. Sources of interference in item and associative recognition memory.

    Science.gov (United States)

    Osth, Adam F; Dennis, Simon

    2015-04-01

    A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved.

  15. A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods.

    Science.gov (United States)

    Diviani, Nicola; Dima, Alexandra Lelia; Schulz, Peter Johannes

    2017-04-11

    The eHealth Literacy Scale (eHEALS) is a tool to assess consumers' comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers' eHealth literacy. ©Nicola Diviani, Alexandra Lelia Dima, Peter Johannes Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 11.04.2017.

  16. Is a single item stress measure independently associated with subsequent severe injury: a prospective cohort study of 16,385 forest industry employees.

    Science.gov (United States)

    Salminen, Simo; Kouvonen, Anne; Koskinen, Aki; Joensuu, Matti; Väänänen, Ari

    2014-06-02

    A previous review showed that high stress increases the risk of occupational injury by three- to five-fold. However, most of the prior studies have relied on short follow-ups. In this prospective cohort study we examined the effect of stress on recorded hospitalised injuries in an 8-year follow-up. A total of 16,385 employees of a Finnish forest company responded to the questionnaire. Perceived stress was measured with a validated single-item measure, and analysed in relation to recorded hospitalised injuries from 1986 to 2008. We used Cox proportional hazard regression models to examine the prospective associations between work stress, injuries and confounding factors. Highly stressed participants were approximately 40% more likely to be hospitalised due to injury over the follow-up period than participants with low stress. This association remained significant after adjustment for age, gender, marital status, occupational status, educational level, and physical work environment. High stress is associated with an increased risk of severe injury.

  17. Concurrent Validity and Sensitivity to Change of Direct Behavior Rating Single-Item Scales (DBR-SIS) Within an Elementary Sample.

    Science.gov (United States)

    Smith, Rhonda L; Eklund, Katie; Kilgus, Stephen P

    2017-06-12

    The purpose of this study was to evaluate the concurrent validity, sensitivity to change, and teacher acceptability of Direct Behavior Rating single-item scales (DBR-SIS), a brief progress monitoring measure designed to assess student behavioral change in response to intervention. Twenty-four elementary teacher-student dyads implemented a daily report card intervention to promote positive student behavior during prespecified classroom activities. During both baseline and intervention, teachers completed DBR-SIS ratings of 2 target behaviors (i.e., Academic Engagement, Disruptive Behavior) whereas research assistants collected systematic direct observation (SDO) data in relation to the same behaviors. Five change metrics (i.e., absolute change, percent of change from baseline, improvement rate difference, Tau-U, and standardized mean difference; Gresham, 2005) were calculated for both DBR-SIS and SDO data, yielding estimates of the change in student behavior in response to intervention. Mean DBR-SIS scores were predominantly moderately to highly correlated with SDO data within both baseline and intervention, demonstrating evidence of the former's concurrent validity. DBR-SIS change metrics were also significantly correlated with SDO change metrics for both Disruptive Behavior and Academic Engagement, yielding evidence of the former's sensitivity to change. In addition, teacher Usage Rating Profile-Assessment (URP-A) ratings indicated they found DBR-SIS to be acceptable and usable. Implications for practice, study limitations, and areas of future research are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Teoria de Resposta ao Item na análise de uma prova de estatística em universitários Item Response Theory to analyze a statistics test in university students

    Directory of Open Access Journals (Sweden)

    Claudette Maria Medeiros Vendramini

    2005-12-01

    This study aimed to apply Item Response Theory to the analysis of the 15 multiple-choice questions of a statistics test presented in the form of statistical graphs or tables. The participants were 413 university students, selected by convenience, from two private higher-education institutions, predominantly from the Psychology course (91.5%). The students were 80% female and mostly enrolled in daytime classes (69.8%), aged 16 to 53 years, with a mean of 24.4 and a standard deviation of 7.4. The test is predominantly unidimensional, and the items fit the three-parameter logistic model best. The discrimination, difficulty, and biserial correlation indices show acceptable values. The results show the difficulties students have with mathematical and statistical concepts, difficulties also observed in other studies from elementary education onward. It is suggested that these concepts be treated in greater depth in higher education.

  19. Single Stage Contactor Testing Of The Next Generation Solvent Blend

    Energy Technology Data Exchange (ETDEWEB)

    Herman, D. T.; Peters, T. B.; Duignan, M. R.; Williams, M. R.; Poirier, M. R.; Brass, E. A.; Garrison, A. G.; Ketusky, E. T.

    2014-01-06

    The Modular Caustic Side Solvent Extraction (CSSX) Unit (MCU) facility at the Savannah River Site (SRS) is actively pursuing the transition from the current BOBCalixC6 based solvent to the Next Generation Solvent (NGS)-MCU solvent to increase the cesium decontamination factor. To support this integration of NGS into the MCU facility, the Savannah River National Laboratory (SRNL) performed testing of a blend of the NGS (MaxCalix based solvent) with the current solvent (BOBCalixC6 based solvent) for the removal of cesium (Cs) from the liquid salt waste stream. This testing utilized a blend of BOBCalixC6 based solvent and the NGS with the new extractant, MaxCalix, as well as a new suppressor, tris(3,7-dimethyloctyl) guanidine. Single stage tests were conducted using the full size V-05 and V-10 liquid-to-liquid centrifugal contactors installed at SRNL. These tests were designed to determine the mass transfer and hydraulic characteristics with the NGS solvent blended with the projected heel of the BOBCalixC6 based solvent that will exist in MCU at time of transition. The test program evaluated the amount of organic carryover and the droplet size of the organic carryover phases using several analytical methods. The results indicate that the NGS solvent performed hydraulically similarly to the current solvent, as expected. For the organic carryover, 93% of the solvent is predicted to be recovered from the stripping operation and 96% from the extraction operation. As for the mass transfer, the NGS solvent improved the cesium DF by at least an order of magnitude when extrapolating the one-stage results to an actual seven-stage extraction operation with a stage efficiency of 95%.

  20. Laboratory testing on infiltration in single synthetic fractures

    Science.gov (United States)

    Cherubini, Claudia; Pastore, Nicola; Li, Jiawei; Giasi, Concetta I.; Li, Ling

    2017-04-01

    An understanding of infiltration phenomena in unsaturated rock fractures is extremely important in many branches of engineering. Sectors such as the oil, gas and water industries regularly deal with water seepage through rock fractures, yet the understanding of the mechanics and behaviour associated with this sort of flow is still incomplete. An apparatus has been set up to test infiltration in single synthetic fractures in both dry and wet conditions. To simulate the two fracture planes, concrete fractures have been moulded from 3D printed fractures with varying geometrical configurations, in order to analyse the influence of aperture and roughness on infiltration. Water flows through the single fractures by means of a hydraulic system composed of an upstream and a downstream reservoir, the latter being subdivided into five equal sections in order to measure the flow rate in each part to detect zones of preferential flow. The fractures have been set at various angles of inclination to investigate the effect of this parameter on infiltration dynamics. The results obtained identified that altering certain fracture parameters and conditions produces relevant effects on the infiltration process through the fractures. The main variables influencing the formation of preferential flow are: the inclination angle of the fracture, the saturation level of the fracture and the mismatch wavelength of the fracture.

  1. Testing the single degenerate channel for supernova Ia

    Science.gov (United States)

    Parsons, Steven

    2014-10-01

    The progenitors of supernova Ia are close binaries containing white dwarfs. Of crucial importance to the evolution of these systems is how much material the white dwarf can stably accrete and hence grow in mass. This occurs during a short-lived intense phase of mass transfer known as the super soft source (SSS) phase. The short duration of this phase and large extinction to soft X-rays means that only a handful are known in our Galaxy. Far more can be learned from the underlying SSS progenitor population of close white dwarf plus FGK type binaries. Unfortunately, these systems are hard to find since the main-sequence stars completely outshine the white dwarfs at optical wavelengths. Because of this, there are currently no known close white dwarf binaries with F, G or early K type companions, making it impossible to determine the contribution of the single degenerate channel towards supernova Ia. Using the GALEX and RAVE surveys we have now identified the first large sample of FGK stars with UV excesses, a fraction of which are these elusive, close systems. Following an intense ground based spectroscopic investigation of these systems, we have identified 5 definite close binaries, with periods of less than a few days. Here we apply for COS spectroscopic observations to measure the mass and temperature of the white dwarfs in order to determine the future evolution of these systems. This will provide a crucial test for the single degenerate channel towards supernova Ia.

  2. Crystal plasticity study of single crystal tungsten by indentation tests

    International Nuclear Information System (INIS)

    Yao, Weizhi

    2012-01-01

    Owing to its favorable material properties, tungsten (W) has been studied as a plasma-facing material in fusion reactors. Experiments on W heating in plasma sources and electron beam facilities have shown an intense micro-crack formation at the heated surface and sub-surface. The cracks go deep inside the irradiated sample, and often large distorted areas caused by local plastic deformation are present around the cracks. To interpret the crack-induced microscopic damage evolution process in W, one needs firstly to understand its plasticity on a single grain level, which is referred to as crystal plasticity. In this thesis, the crystal plasticity of single crystal tungsten (SCW) has been studied by spherical and Berkovich indentation tests and the finite element method with a crystal plasticity model. Appropriate values of the material parameters included in the crystal plasticity model are determined by fitting measured load-displacement curves and pile-up profiles with simulated counterparts for spherical indentation. The numerical simulations reveal excellent agreement with experiment. While the load-displacement curves and the deduced indentation hardness exhibit little sensitivity to the indented plane at small indentation depths, the orientation of slip directions within the crystals governs the development of deformation hillocks at the surface. It is found that several factors like friction, indentation depth, active slip systems, misoriented crystal orientation, misoriented sample surface and azimuthal orientation of the indenter can affect the indentation behavior of SCW. The Berkovich indentation test was also used to study the crystal plasticity of SCW after deuterium irradiation. The critical load (pop-in load) for triggering plastic deformation under the indenter is found to depend on the crystallographic orientation. The pop-in loads decrease dramatically after deuterium plasma irradiation for all three investigated crystallographic planes.

  3. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Directory of Open Access Journals (Sweden)

    James A Wiley

    We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution varies over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.
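
    As a rough illustration of the baseline this record builds on, the sketch below computes the score distribution when each respondent's success probability follows a two-parameter Beta distribution and the number-correct score is binomial given that probability (the classical beta-binomial compound). It is not the extended model proposed in the paper; the function names, the 10-item test length and the Beta(2, 3) parameters are assumptions chosen only for the example.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(X = k) when X | p ~ Binomial(n, p) and p ~ Beta(alpha, beta)."""
    return comb(n, k) * exp(log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


# Hypothetical test: 10 dichotomous items, latent probability ~ Beta(2, 3).
n_items = 10
pmf = [beta_binomial_pmf(k, n_items, 2.0, 3.0) for k in range(n_items + 1)]
total = sum(pmf)                                    # should be 1 up to rounding
mean_score = sum(k * p for k, p in enumerate(pmf))  # equals n * alpha / (alpha + beta)
```

    With a flat Beta(1, 1) prior the same function gives every score from 0 to n the same probability, 1/(n + 1), which is a convenient sanity check.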

  4. Testing parent dyad interchangeability in the parent proxy-report of PedsQL™ 4.0: a differential item functioning analysis.

    Science.gov (United States)

    Doostfatemeh, Marziyeh; Ayatollahi, Seyyed Mohammad Taghi; Jafari, Peyman

    2015-08-01

    In child-parent agreement studies in the field of paediatric health-related quality of life (HRQoL), little attention has been paid to the effect of gender in parental proxy rating of children's HRQoL. This study aims to test the potential interchangeability of parent dyads in reporting children's HRQoL on both item and scale levels of the PedsQL™ 4.0 instrument, using the approach of differential item functioning (DIF). The PedsQL™ 4.0 Generic Core Scales were completed by 576 father-and-mother dyads. A polytomous item response theory model, graded response model, was used to detect DIF across fathers and mothers. Assessment at item level showed that fathers and mothers perceived the meaning of items of the PedsQL™ 4.0 consistently. Regarding the scale level, a moderate to high level of agreement was observed between mothers' and fathers' reports on all similar subscales. Although the significant mean score differences in total, physical and emotional functioning indicated that fathers gave higher scores to their children, the small effect size implied that this difference may not be practically meaningful. Our findings revealed that discrepancy in parent dyads in rating children's HRQoL is a "real" difference and not an artefact due to measurement non-invariance. Fathers were seen to have slightly different insights into their children, especially for emotional functioning, but overall the results were not all that different. This suggests that paternal proxy-reports can be included in studies along with maternal proxy-reports, and the two may be combined when looking at parent-child agreement. Parent-child agreement studies in Iran are not affected by parents' gender, and therefore, researchers may rely on the assumption of the interchangeability of fathers and mothers in these studies.

  5. Radiation tests for a single-GEM-loaded gaseous detector

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Kyong Sei; Hong, Byung Sik; Park, Sung Keun [Korea University, Seoul (Korea, Republic of)]; Kim, Sang Yeol [NoticeKorea, Anyang (Korea, Republic of)]

    2014-11-15

    We report on a systematic study of a single-gas-electron-multiplier (GEM)-loaded gaseous detector developed for precision measurements of high-energy particle beams and for dose verification in particle therapy. In the present study, a 256-channel prototype detector having an active area of 16 × 16 cm² and operating using a continuous current-integration-mode signal-processing method was manufactured and tested with X-rays emitted from a 70-kV X-ray generator and 43-MeV protons provided by the MC50 proton cyclotron at the Korea Institute of Radiological and Medical Science (KIRAMS). The amplified detector response was measured for X-rays with an intensity of about 5 × 10⁶ Hz cm⁻². The linearity of the detector response to the particle flux was examined and validated by using 43-MeV proton beams. The non-uniform development of the amplification for the gas electrons in space was corrected by applying a proper calibration to the channel responses of the measured beam-profile data. We conclude from the radiation tests that the detector developed in the present study will allow us to perform quality measurements of various high-energy particle beams and to apply the technology to dose-verification measurements in particle therapy.

  6. Software Note: Using BILOG for Fixed-Anchor Item Calibration

    Science.gov (United States)

    DeMars, Christine E.; Jurich, Daniel P.

    2012-01-01

    The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…

  7. Rasch analysis of the Pediatric Evaluation of Disability Inventory-computer adaptive test (PEDI-CAT) item bank for children and young adults with spinal muscular atrophy.

    Science.gov (United States)

    Pasternak, Amy; Sideridis, Georgios; Fragala-Pinkham, Maria; Glanzman, Allan M; Montes, Jacqueline; Dunaway, Sally; Salazar, Rachel; Quigley, Janet; Pandya, Shree; O'Riley, Susan; Greenwood, Jonathan; Chiriboga, Claudia; Finkel, Richard; Tennekoon, Gihan; Martens, William B; McDermott, Michael P; Fournier, Heather Szelag; Madabusi, Lavanya; Harrington, Timothy; Cruz, Rosangel E; LaMarca, Nicole M; Videon, Nancy M; Vivo, Darryl C De; Darras, Basil T

    2016-12-01

    In this study we evaluated the suitability of a caregiver-reported functional measure, the Pediatric Evaluation of Disability Inventory-Computer Adaptive Test (PEDI-CAT), for children and young adults with spinal muscular atrophy (SMA). PEDI-CAT Mobility and Daily Activities domain item banks were administered to 58 caregivers of children and young adults with SMA. Rasch analysis was used to evaluate test properties across SMA types. Unidimensional content for each domain was confirmed. The PEDI-CAT was most informative for type III SMA, with ability levels distributed close to 0.0 logits in both domains. It was less informative for types I and II SMA, especially for mobility skills. Item and person abilities were not distributed evenly across all types. The PEDI-CAT may be used to measure functional performance in SMA, but additional items are needed to identify small changes in function and best represent the abilities of all types of SMA. Muscle Nerve 54: 1097-1107, 2016. © 2016 Wiley Periodicals, Inc.

  8. The construct equivalence and item bias of the pib/SpEEx conceptualisation-ability test for members of five language groups in South Africa

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2008-11-01

    This study’s objective was to determine whether the Potential Index Batteries/Situation Specific Evaluation Expert (PIB/SpEEx) conceptualisation (100) ability test displays construct equivalence and item bias for members of five selected language groups in South Africa. The sample consisted of a non-probability convenience sample (N = 6 261) of members of five language groups (speakers of Afrikaans, English, North Sotho, Setswana and isiZulu) working in the medical and beverage industries or studying at higher-educational institutions. Exploratory factor analysis with target rotations confirmed the PIB/SpEEx 100’s construct equivalence for the respondents from these five language groups. No evidence of either uniform or non-uniform item bias of practical significance was found for the sample.

  9. Cross-cultural validation of the Turkish Four-Dimensional Symptom Questionnaire (4DSQ) using differential item and test functioning (DIF and DTF) analysis.

    Science.gov (United States)

    Terluin, Berend; Unalan, Pemra C; Turfaner Sipahioğlu, Nurver; Arslan Özkul, Seda; van Marwijk, Harm W J

    2016-05-11

    The Four-Dimensional Symptom Questionnaire (4DSQ) is originally a Dutch 50-item questionnaire developed in primary care to assess distress, depression, anxiety and somatization. We aimed to develop and validate a Turkish translation of the 4DSQ. The questionnaire was translated using forward and backward translation, and pilot testing. Turkish 4DSQ data were collected in 352 consecutive adult primary care patients. For comparison, gender- and age-matched Dutch reference data were drawn from a larger existing dataset. We used differential item and test functioning (DIF and DTF) analysis to validate the Turkish translation against the original Dutch questionnaire. Through additional inquiry we tried to obtain more insight into the background of DIF in some items. Twenty-one items displayed DIF but this impacted only the distress and depression scores. Inquiry among Turkish people revealed that the reason for DTF in the distress scale was probably related to unfavourable socio-economic circumstances. On the other hand, the likely explanation for DTF in the depression scale appeared to be grounded in culturally and religiously determined optimistic beliefs. Raising the distress cut-offs by 2 points and lowering the depression cut-offs by 1 point ensures that individual Turkish 4DSQ scores are correctly interpreted. The Turkish translation of the 4DSQ (named: "Dört-Boyutlu Yakınma Listesi", 4BYL) measures the same constructs as the original Dutch questionnaire. Turkish anxiety and somatization scores can be interpreted in the same way as Dutch scores. However, when interpreting Turkish distress and depression scores, DTF should be taken into account.

  10. Analysis of experiment testing technology for single event effects in China

    International Nuclear Information System (INIS)

    He Chaohui

    2001-01-01

    The merits and drawbacks of simulation sources for Single Event Effects (SEE) experimental testing in China are analyzed. Laboratory experimental systems for SEE are briefly introduced, and the requirements for SEE test systems are analyzed in detail. Test systems are presented for Single Event Upset, Single Event Latch-up, Single Event Burnout and Single Event Gate-Rupture. Points that require attention in SEE experiments are discussed.

  11. The impact of structure on word meaning and fill-in-the-blank tests procedures on short-term and long-term retention of vocabulary items

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Fazeli

    2009-10-01

    The purpose of the research described in the current study is to investigate the impact of structure knowledge on two types of test, i.e., the word-meaning test and the fill-in-the-blank test, their correlation, and their procedures on both short-term and long-term retention of vocabulary items. The importance of the present study lies in testing the condition in which learners are not allowed to guess or to answer the tests randomly and must justify their answers semantically; otherwise an answer, even if correct, is not scored. The population for subject recruitment was all undergraduate students in the second semester at a large university in Iran (both male and female) who study English as a compulsory paper. In Iran, English is taught as a foreign language.

  12. Columbia University flow instability experimental program: Volume 6. Single annulus tests, transient test program

    Energy Technology Data Exchange (ETDEWEB)

    Dougherty, T.; Maciuca, C.; McAssey, E.V. Jr.; Reddy, D.G.; Yang, B.W.

    1992-09-01

    The coolant in the Savannah River Site (SRS) production nuclear reactor assemblies is circulated as a subcooled liquid under normal operating conditions. This coolant is evenly distributed throughout multiple annular flow channels with a uniform pressure profile across each coolant flow channel. During the postulated Loss of Coolant Accident (LOCA), which is initiated by a hypothetical guillotine pipe break, the coolant flow through the reactor assemblies is significantly reduced. The flow reduction and accompanying power reduction (after shutdown is initiated) occur in the first 1 to 2 seconds of the LOCA. This portion of the LOCA is referred to as the Flow Instability phase. This report presents the experimental results for the transient portion of the single annulus test program. The test program was designed to investigate the onset of flow instability in an annular geometry similar to the MARK 22 reactor. The test program involved testing of both a ribless heater and a ribbed heater under steady state as well as transient conditions. The ribbed heater testing is currently underway and will be reported separately. The steady state portion of this test program with ribless heater was completed and reported in report No. CU-HTRF-T3A. The present report presents transient test results obtained from a ribless, uniform annulus test section. A total of thirty-five transients were conducted with six cases in which flow excursion occurred. No unstable conditions resulted for tests in which the steady state Q_ratio OFI limit was not exceeded.

  13. Linking Item Response Model Parameters.

    Science.gov (United States)

    van der Linden, Wim J; Barrett, Michelle D

    2016-09-01

    With a few exceptions, the problem of linking item response model parameters from different item calibrations has been conceptualized as an instance of the problem of test equating scores on different test forms. This paper argues, however, that the use of item response models does not require any test score equating. Instead, it involves the necessity of parameter linking due to a fundamental problem inherent in the formal nature of these models: their general lack of identifiability. More specifically, item response model parameters need to be linked to adjust for the different effects of the identifiability restrictions used in separate item calibrations. Our main theorems characterize the formal nature of these linking functions for monotone, continuous response models, derive their specific shapes for different parameterizations of the 3PL model, and show how to identify them from the parameter values of the common items or persons in different linking designs.
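
    The lack of identifiability mentioned above can be made concrete with a small numerical sketch. Assuming the common 3PL parameterization c + (1 - c) / (1 + exp(-a(theta - b))), which is only one of the parameterizations the paper considers, a linear change of the ability scale leaves every response probability unchanged as long as the item parameters are transformed to match. The function names and the values of u and v are hypothetical and used only for illustration.

```python
from math import exp


def p_3pl(theta, a, b, c):
    """3PL response probability with discrimination a, difficulty b, guessing c."""
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))


def relink(theta, a, b, c, u, v):
    """Rescale the ability metric: theta* = u*theta + v, b* = u*b + v, a* = a/u.

    The guessing parameter c is unaffected. Requires u > 0.
    """
    return u * theta + v, a / u, u * b + v, c


# Same person and item expressed on two different (hypothetical) scales.
original = p_3pl(0.5, 1.2, -0.3, 0.2)
theta2, a2, b2, c2 = relink(0.5, 1.2, -0.3, 0.2, u=1.7, v=-0.4)
rescaled = p_3pl(theta2, a2, b2, c2)  # matches `original` up to rounding
```

    Because both parameter sets predict identical data, separate calibrations can settle on different scales, and a linking function is what maps one set of parameter values onto the other.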

  14. 49 CFR 232.307 - Modification of the single car air brake test procedures.

    Science.gov (United States)

    2010-10-01

    ... Requirements § 232.307 Modification of the single car air brake test procedures. (a) Request. The AAR or other authorized representative of the railroad industry may seek modification of the single car air brake test... 49 Transportation 4 2010-10-01 2010-10-01 false Modification of the single car air brake test...

  15. Gender Differential Item Functioning on a National Field-Specific Test: The Case of PhD Entrance Exam of TEFL in Iran

    Science.gov (United States)

    Ahmadi, Alireza; Bazvand, Ali Darabi

    2016-01-01

    Differential Item Functioning (DIF) exists when examinees of equal ability from different groups have different probabilities of successful performance in a certain item. This study examined gender differential item functioning across the PhD Entrance Exam of TEFL (PEET) in Iran, using both logistic regression (LR) and one-parameter item response…
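
    The definition of DIF in this record can be illustrated with a minimal sketch under a one-parameter (Rasch-type) response function. It is not the logistic-regression or IRT procedure used in the study; the group-specific difficulty values and function names are assumptions made for the example.

```python
from math import exp


def rasch_prob(theta, b):
    """Probability of a correct response for ability theta and item difficulty b."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def dif_gap(theta, b_reference, b_focal):
    """Difference in success probability between two groups at the SAME ability.

    A non-zero gap for examinees of equal ability is what DIF means; here it
    arises because the item is assumed to have a different difficulty per group.
    """
    return rasch_prob(theta, b_reference) - rasch_prob(theta, b_focal)


# Hypothetical item that is 0.5 logits harder for the focal group.
gap_at_average_ability = dif_gap(0.0, b_reference=0.0, b_focal=0.5)
no_dif = dif_gap(0.0, b_reference=0.0, b_focal=0.0)
```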

  16. Developing Pairwise Preference-Based Personality Test and Experimental Investigation of Its Resistance to Faking Effect by Item Response Model

    Science.gov (United States)

    Usami, Satoshi; Sakamoto, Asami; Naito, Jun; Abe, Yu

    2016-01-01

    Recent years have shown increased awareness of the importance of personality tests in educational, clinical, and occupational settings, and developing faking-resistant personality tests is a very pragmatic issue for achieving more precise measurement. Inspired by Stark (2002) and Stark, Chernyshenko, and Drasgow (2005), we develop a pairwise…

  17. MCQ testing in higher education: Yes, there are bad items and invalid scores—A case study identifying solutions

    OpenAIRE

    Brown, Gavin

    2017-01-01

    This is a lecture given at Umeå University, Sweden in September 2017. It is based on the published study: Brown, G. T. L., & Abdulnabi, H. (2017). Evaluating the quality of higher education instructor-constructed multiple-choice tests: Impact on student grades. Frontiers in Education: Assessment, Testing, & Applied Measurement, 2(24). doi:10.3389/feduc.2017.00024

  18. Reliability testing of indirect composites as single implant restorations.

    Science.gov (United States)

    Suzuki, Marcelo; Bonfante, Estevam; Silva, Nelson Rfa; Coelho, Paulo G

    2011-10-01

    To investigate the reliability and failure modes of indirect composites as single-unit implant crowns. Thirty-eight custom-milled titanium alloy locking-taper abutments were divided into two groups (n = 19 each), and crown build-up of a mandibular molar was accomplished using two indirect composite systems (Ceramage, Shofu, Kyoto, Japan; Diamond Crown, DRM, Branford, CT). Three crowns of each material were loaded until failure for determination of the step-stress profiles. Reliability testing started at a load 30% of the mean load to failure and used three profiles with increasing fatigue loading (step stress). Weibull curves with 300 N stress and 90% confidence intervals were calculated and plotted using a power-law relationship. Weibull modulus "Beta" and characteristic strength "Eta" were identified, and a contour plot was used (Beta vs. Eta) for examining differences between groups. Specimens were inspected in polarized light and scanning electron microscope for fracture analysis. Use level Weibull probability showed fatigue being a damage factor only for the Ceramage group (β= 3.39) but not for the Diamond Crown group (β= 0.40). Overlap in the confidence bounds resulted in no statistical difference. Irrespective of composite system, fracture initiated in the region immediately below the contact between the indenter and the cusp, with the crack propagating toward the margins of cohesive failure. No significant differences were observed in life and Weibull probability calculations for Ceramage and Diamond Crown veneered onto Ti alloy abutments. Failure modes comprised composite veneer chippings. © 2011 by The American College of Prosthodontists.
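
    For readers unfamiliar with the Weibull parameters quoted above, the sketch below shows how the modulus (Beta) shapes the failure rate in a two-parameter Weibull model: a value above 1 gives a failure rate that rises with accumulated loading, consistent with fatigue damage, while a value below 1 gives a falling rate. The two Beta values are those reported in the abstract; the characteristic value Eta is not given there, so eta = 1.0 is a placeholder assumption, as are the function names.

```python
from math import exp


def weibull_reliability(t, beta, eta):
    """Probability of surviving beyond t for shape beta and scale eta."""
    return exp(-((t / eta) ** beta))


def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate at t for shape beta and scale eta."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


ETA = 1.0  # placeholder scale; the abstract does not report Eta

# Beta = 3.39 (Ceramage): hazard increases with t.
rising = (weibull_hazard(1.0, 3.39, ETA), weibull_hazard(2.0, 3.39, ETA))
# Beta = 0.40 (Diamond Crown): hazard decreases with t.
falling = (weibull_hazard(1.0, 0.40, ETA), weibull_hazard(2.0, 0.40, ETA))
# At t = eta the reliability is exp(-1) regardless of beta.
at_eta = weibull_reliability(ETA, 3.39, ETA)
```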

  19. The Impact Analysis of Psychological Reliability of Population Pilot Study For Selection of Particular Reliable Multi-Choice Item Test in Foreign Language Research Work

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Fazeli

    2010-10-01

    The purpose of the research described in the current study is to examine psychological reliability, its importance and its application, and further to investigate the impact of the psychological reliability of a population pilot study on the selection of a particular reliable multiple-choice item test in foreign language research work. The population for subject recruitment was all undergraduate students in the second semester at a large university in Iran (both male and female) who study English as a compulsory paper. In Iran, English is taught as a foreign language.

  20. Comparison of CAT Item Selection Criteria for Polytomous Items

    Science.gov (United States)

    Choi, Seung W.; Swartz, Richard J.

    2009-01-01

    Item selection is a core component in computerized adaptive testing (CAT). Several studies have evaluated new and classical selection methods; however, the few that have applied such methods to the use of polytomous items have reported conflicting results. To clarify these discrepancies and further investigate selection method properties, six…

  1. 49 CFR 232.309 - Equipment and devices used to perform single car air brake tests.

    Science.gov (United States)

    2010-10-01

    ... (Continued) FEDERAL RAILROAD ADMINISTRATION, DEPARTMENT OF TRANSPORTATION BRAKE SYSTEM SAFETY STANDARDS FOR... Testing Requirements § 232.309 Equipment and devices used to perform single car air brake tests. (a) Equipment and devices used to perform single car air brake tests shall be tested for correct operation at...

  2. Cross-cultural development of an item list for computer-adaptive testing of fatigue in oncological patients

    DEFF Research Database (Denmark)

    Giesinger, Johannes M.; Petersen, Morten Aa.; Grønvold, Mogens

    2011-01-01

    Within an ongoing project of the EORTC Quality of Life Group, we are developing computerized adaptive test (CAT) measures for the QLQ-C30 scales. These new CAT measures are conceptualised to reflect the same constructs as the QLQ-C30 scales. Accordingly, the Fatigue-CAT is intended to capture phy...

  3. Using Standards and Empirical Evidence to Develop Academic English Proficiency Test Items in Reading. CSE Technical Report 664

    Science.gov (United States)

    Bailey, Alison L.; Stevens, Robin; Butler, Frances A.; Huang, Becky; Miyoshi, Judy N.

    2005-01-01

    The work we report focuses on utilizing linguistic profiles of mathematics, science and social studies textbook selections for the creation of reading test specifications. Once we determined that a text and associated tasks fit within the parameters established in Butler et al. (2004), they underwent both internal and external review by language…

  4. Test-retest reliability of Antonovsky's 13-item sense of coherence scale in patients with hand-related disorders

    DEFF Research Database (Denmark)

    Hansen, Alice Ørts; Kristensen, Hanne Kaae; Cederlund, Ragnhild

    2017-01-01

    to be a powerful tool to measure the ICF component personal factors, which could have an impact on patients' rehabilitation outcomes. Implications for rehabilitation Antonovsky's SOC-13 scale showed test-retest reliability for patients with hand-related disorders. The SOC-13 scale could be a suitable tool to help...... measure personal factors....

  5. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    Science.gov (United States)

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  6. Evaluation of the Relative Validity and Test-Retest Reliability of a 15-Item Beverage Intake Questionnaire in Children and Adolescents.

    Science.gov (United States)

    Hill, Catelyn E; MacDougall, Carly R; Riebl, Shaun K; Savla, Jyoti; Hedrick, Valisa E; Davy, Brenda M

    2017-11-01

    Added sugar intake, in the form of sugar-sweetened beverages (SSBs), may contribute to weight gain and obesity development in children and adolescents. A valid and reliable brief beverage intake assessment tool for children and adolescents could facilitate research in this area. The purpose of this investigation was to evaluate the relative validity and test-retest reliability of a 15-item beverage intake questionnaire (BEVQ) for assessing usual beverage intake in children and adolescents. This cross-sectional investigation included four study visits within a 2- to 3-week time period. Participants (333 enrolled; 98% completion rate) were children aged 6 to 11 years and adolescents aged 12 to 18 years recruited from the New River Valley, VA, region from January 2014 to September 2015. Study visits included assessment of height/weight, health history, and four 24-hour dietary recalls (24HRs). The BEVQ was completed at two visits (BEVQ 1, BEVQ 2). To evaluate relative validity, BEVQ 1 was compared with habitual beverage intake determined by the averaged 24HR. To evaluate test-retest reliability, BEVQ 1 was compared with BEVQ 2. Analyses included descriptive statistics, independent sample t tests, χ² tests, one-way analysis of variance, paired sample t tests, and correlational analyses. In the full sample, self-reported water and total SSB intake were not different between BEVQ 1 and 24HR (mean differences 0±1 fl oz and 0±1 fl oz, respectively; both P values >0.05). Reported intake across all beverage categories was significantly correlated between BEVQ 1 and BEVQ 2 (Pintake of milk and energy (in kilocalories) for total beverages was not different (all P values >0.05) between BEVQ 1 and 24HR (mean differences: whole milk=3±4 kcal, reduced-fat milk=9±5 kcal, and fat-free milk=7±6 kcal, which is 7±15 total beverage kilocalories). In adolescents (n=200), water and SSB kilocalories were not different (both P values >0.05) between BEVQ 1 and 24HR (mean

  7. Comparison of the Air Force Officer Qualifying Test Form T and Form S: Initial Item- and Subtest-Level Analyses

    Science.gov (United States)

    2017-03-15

    Statistics Tables 4 through 7 summarize the means and standard deviations of subtest raw scores, and skewness and kurtosis of the raw subtest score...test values ≥+/-1.96 are statistically significant at p < .05 T2 N = 5,199 Table 7. AFOQT Form S2 Subtests Means, Standard Deviations , Skewness, and...Form S or Form T (skewness values on WK and VA ranging from -.09 to -.39). Table 4. AFOQT Form TI Subtests Means, Standard Deviations , Skewness

  8. Detecting Differential Item Discrimination (DID) and the Consequences of Ignoring DID in Multilevel Item Response Models

    Science.gov (United States)

    Lee, Woo-yeol; Cho, Sun-Joo

    2017-01-01

    Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…

  9. Pulsed laser simulation of VLSI single-event effect testing study

    International Nuclear Information System (INIS)

    Xue Yuxiong; Cao Zhou; Yang Shiyu; Tian Kai; Liu Shufen; Chu Nan; Cao Haining; Shang Zhi

    2008-01-01

    This paper describes a study investigating pulsed-laser simulation of Single-Event Effect (SEE) testing for the VLSI Intel386EX CPU, using our laboratory laser simulation system (LSS). We detail the SEE testing principle, the testing method, the composition of the test system and the test results. The results validate that pulsed-laser simulation can be used for SEE testing of VLSI devices, and that the Intel386EX has high resistance to single-event latch-up. (authors)

  10. Análise de itens de uma prova de raciocínio estatístico Analysis of items of a statistical reasoning test

    Directory of Open Access Journals (Sweden)

    Claudette Maria Medeiros Vendramini

    2004-12-01

    This study aimed to analyze the 18 multiple-choice questions of a test on basic concepts of statistics using classical and modern test theories. The test was taken by 325 undergraduate students, randomly selected from the areas of human, exact and health sciences. The analysis indicated that the test is predominantly unidimensional and that the items are best fitted by the three-parameter model. The indices of difficulty, discrimination and biserial correlation show acceptable values. It is suggested that new items be added to the test, aiming at reliability and validity for the educational context and revealing the statistical reasoning of undergraduate students when reading representations of statistical data.

  11. Pre test parametric studies on single compartment vented enclosure

    International Nuclear Information System (INIS)

    Sharma, Pavan K.; Gera, B.; Singh, R.K.; Vaze, K.K.

    2011-01-01

    Establishing a proper design fire scenario is a challenging task and an essential component of fire safety design for buildings. A design fire scenario is a qualitative description of a fire over time, identifying key events that characterize the fire (ignition, growth, flashover, fully-developed, and decay stages of fire). Proper fire safety design requires the appropriate selection of design fires against which the performance of the building is evaluated. The selection of the design fires directly impacts all aspects of fire safety performance, including the structural fire resistance, compartmentation against fire spread, egress systems, manual or automatic detection systems, suppression systems, and smoke control. The parameters affecting design fires include the type, amount and arrangement of combustible materials, the ventilation conditions (air supply conditions, door/window open), and the size of the compartment of fire origin. A design fire is a quantitative description of the characteristics of a fire, such as heat release rate (HRR), size of fire and its rate of spread, yield of products of combustion, and hot gas temperatures. Design fires are based on fire scenarios that replicate real fires. Six Computational Fluid Dynamics (CFD) numerical simulations were conducted in order to investigate the effect of fire load on fire dynamics in (a) an ISO corner fire configuration and (b) the IIT Delhi single compartment, 5.0 m long, 5.0 m wide and 5.0 m high, with a doorway opening of 1 m x 3 m and a centre fire of size 0.5 m x 0.5 m. These simulations are carried out as pre-test calculations to decide on the instrumentation scheme, safety aspects, and optimization of proposed experiments for the National Fire Test Facility. The simulation results are summarized in terms of various identified parameters, which are useful for understanding the complex fire dynamics, validating the numerical tools against experiments and using them (in form of values

  12. Single event effect testing of the Intel 80386 family and the 80486 microprocessor

    International Nuclear Information System (INIS)

    Moran, A.; LaBel, K.; Gates, M.; Seidleck, C.; McGraw, R.; Broida, M.; Firer, J.; Sprehn, S.

    1996-01-01

    The authors present single event effect test results for the Intel 80386 microprocessor, the 80387 coprocessor, the 82380 peripheral device, and the 80486 microprocessor. Both single event upset and latchup conditions were monitored.

  13. Use of Jackknifing to Evaluate Effects of Anchor Item Selection on Equating with the Nonequivalent Groups with Anchor Test (NEAT) Design. Research Report. ETS RR-15-10

    Science.gov (United States)

    Lu, Ru; Haberman, Shelby; Guo, Hongwen; Liu, Jinghua

    2015-01-01

    In this study, we apply jackknifing to anchor items to evaluate the impact of anchor selection on equating stability. In an ideal world, the choice of anchor items should have little impact on equating results. When this ideal does not correspond to reality, selection of anchor items can strongly influence equating results. This influence does not…

  14. Numerical test for single concrete armour layer on breakwaters

    OpenAIRE

    Anastasaki, E; Latham, J-P; Xiang, J

    2016-01-01

    The ability of concrete armour units for breakwaters to interlock and form an integral single layer is important for withstanding severe wave conditions. In reality, displacements take place under wave loading, whether they are small and insignificant or large and representing serious structural damage. In this work, a code that combines finite- and discrete-element methods which can simulate motion and interaction among units was used to conduct a numerical investigation. Various concrete ar...

  15. Fabrication and test of Superconducting Single Photon Detectors

    International Nuclear Information System (INIS)

    Leoni, R.; Mattioli, F.; Castellano, M.G.; Cibella, S.; Carelli, P.; Pagano, S.; Perez de Lara, D.; Ejrnaes, M.; Lisitskyi, M.P.; Esposito, E.; Cristiano, R.; Nappi, C.

    2006-01-01

    We report here on the state of our fabrication process for Superconducting Single Photon Detectors (SSPDs). We have fabricated submicrometer SSPD structures by electron beam lithography using very thin (10 nm) NbN films deposited by DC-magnetron sputtering on different substrates and at room substrate temperature. The structures show a fast optical response (risetime <500 ps limited by readout electronics) and interesting self-resetting features

  16. Reading comprehension: differential item functioning analysis of a Cloze Test (Compreensão da leitura: análise do funcionamento diferencial dos itens de um Teste de Cloze)

    Directory of Open Access Journals (Sweden)

    Katya Luciane Oliveira

    2012-01-01

    The objectives of the present study were to investigate the fit of a Cloze test to the Rasch model and to evaluate differential item functioning (DIF) in relation to gender. The sample comprised 573 students from the 5th to 8th grades of state public schools in the states of São Paulo and Minas Gerais. The Cloze test was administered collectively. The analysis of the instrument revealed a good fit to the Rasch model, and the items were answered according to the expected pattern, also showing good fit. Regarding DIF, only three items differentiated by gender. Based on the data, the results indicated a balance in the answers given by boys and girls.
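
    For readers unfamiliar with the Rasch model named in this abstract, a minimal sketch follows. It shows the standard dichotomous Rasch response probability and a simple DIF contrast (the difference between group-specific item difficulties). The function names and numeric values are hypothetical and are not taken from the study.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def dif_contrast(difficulty_group_a, difficulty_group_b):
    """Difference in item difficulty (logits) estimated separately per group."""
    return difficulty_group_a - difficulty_group_b

# A person whose ability equals the item difficulty succeeds half the time
p_equal = rasch_probability(0.0, 0.0)

# Hypothetical difficulties of one item estimated separately for two groups
contrast = dif_contrast(0.8, 0.3)
```

    A contrast near zero suggests that the item behaves the same way for both groups; how large a contrast counts as DIF depends on the criterion the analyst adopts.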

  17. Testing single point incremental forming molds for thermoforming operations

    Science.gov (United States)

    Afonso, Daniel; de Sousa, Ricardo Alves; Torcato, Ricardo

    2016-10-01

    Low-pressure polymer processes such as thermoforming or rotational molding use much simpler molds than high-pressure processes like injection molding. However, despite the low forces involved, mold manufacturing for these operations is still very material-, energy-, and time-consuming. The goal of the research is to develop and validate a method for manufacturing plastically formed sheet metal molds by single point incremental forming (SPIF) for thermoforming operations. Stewart platform based SPIF machines allow the forming of thick metal sheets, granting the required structural stiffness for the mold surface while keeping a short manufacturing lead time and low thermal inertia.

  18. Psychometric properties of three single-item pain scales in patients with rheumatoid arthritis seen during routine clinical care: a comparative perspective on construct validity, reproducibility and internal responsiveness.

    Science.gov (United States)

    Sendlbeck, Melanie; Araujo, Elizabeth G; Schett, Georg; Englbrecht, Matthias

    2015-01-01

    To investigate the construct validity, reproducibility (ie, retest reliability) and internal responsiveness to treatment change of common single-item scales measuring overall pain in patients with rheumatoid arthritis (RA) and to investigate the corresponding effect of common pain-related comorbidities and medical consultation on these outcomes. 236 patients with RA completed a set of questionnaires including a visual analogue scale (VAS), a numerical rating scale (NRS) and a verbal rating scale (VRS) measuring overall pain before and immediately after routine medical consultation as well as 1 week after the patient's visit. Construct validity and retest reliability were evaluated using the Bravais-Pearson correlation while standardised response means (SRM) were calculated for evaluating internal responsiveness. Differences in the perception of pain were calculated using dependent samples t-tests. In the total sample, construct validity was good across all three time points (convergent validity of pain scales: rT1-T3=0.82-0.92, pscales with age: rage=0.01-0.16, p>0.05). In patients maintaining antirheumatic treatment, retest reliability of pain scales was confirmed for all scales and across time points (rVAS=0.82-0.95, rNRS=0.89-0.98, rVRS=0.80-0.90, pscales to a change in treatment was low across all scales (SRM=0.08-0.21). The VAS especially suggested a change in pain perception after medical consultation in patients maintaining therapy. The VAS, NRS and VRS are valid and retest reliable in an outpatient clinical practice setting. The low pain scales' internal responsiveness to treatment change is likely to be due to the short follow-up period. Patients with RA maintaining antirheumatic therapy seem to experience less pain after medical consultation.
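
    The standardised response mean (SRM) used above for internal responsiveness is the mean of the paired change scores divided by their standard deviation. The sketch below illustrates that definition only; the pain ratings are invented and the sign convention (follow-up minus baseline) is an assumption.

```python
import statistics

def standardized_response_mean(before, after):
    """SRM: mean of paired changes divided by the standard deviation of the changes."""
    changes = [a - b for b, a in zip(before, after)]
    return statistics.mean(changes) / statistics.stdev(changes)

# Hypothetical pain ratings for five patients at two time points
baseline = [6, 5, 7, 4, 6]
follow_up = [5, 5, 6, 4, 4]
srm = standardized_response_mean(baseline, follow_up)
```

    With follow-up minus baseline as the change score, a negative SRM indicates lower pain at follow-up; values close to zero, as reported in the abstract, indicate low responsiveness.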

  19. Testing for one Generalized Linear Single Order Parameter

    DEFF Research Database (Denmark)

    Ellegaard, Niels Langager; Christensen, Tage Emil; Dyre, Jeppe

    We examine a linear single order parameter model for thermoviscoelastic relaxation in viscous liquids, allowing for a distribution of relaxation times. In this model the relaxation of volume and enthalpy is completely described by the relaxation of one internal order parameter. In contrast to prior work the order parameter may be chosen to have a non-exponential relaxation. The model predictions contradict the general consensus of the properties of viscous liquids in two ways: (i) The model predicts that following a linear isobaric temperature step, the normalized volume and enthalpy relaxation functions are identical. This assumption conflicts with some (but not all) reports, utilizing the Tool-Narayanaswamy formalism to extrapolate from non-linear measurements to the linear regime. (ii) The model predicts that the theoretical "linear Prigogine-Defay" ratio is one. This ratio has never been...

  20. Detectability of single and plural flaws by ultrasonic testing

    International Nuclear Information System (INIS)

    Iida, K.

    1985-01-01

    The first part of the paper describes an outline and up-to-date test results of an eight-year project of proving tests on the effectiveness of in-service inspection. The effects on detectability of testing parameters such as refraction angle, thickness of stainless steel cladding, inspectors, standard flaws in reference specimens, and the stress state to which defects are subjected are discussed. This is followed by a discussion of detection reproducibility, resolution, and accuracy of the inspected size of defects. The latter part of the paper deals with up-to-date results of tests on resolution and shape determination of propagating adjacent and co-linear fatigue cracks by ultrasonic examination. It was found that the real lengths of a fatigue crack and an EDM surface notch can be roughly estimated by the 12 dB and 8 dB down methods, respectively. It is also concluded that the 10 dB down method can be used to estimate the inside distance between two co-linear surface cracks
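
    A dB down (dB drop) sizing method takes the flaw extent as the scan distance over which the echo amplitude stays within a fixed number of decibels of its peak. The sketch below illustrates that arithmetic under simplifying assumptions (a single contiguous indication, no interpolation between scan positions); the scan data and function names are hypothetical and do not come from the paper.

```python
def db_drop_ratio(drop_db):
    """Amplitude ratio corresponding to a drop of drop_db decibels."""
    return 10 ** (-drop_db / 20)

def db_drop_length(positions_mm, amplitudes, drop_db):
    """Extent of the scan over which the echo stays within drop_db of its peak."""
    threshold = max(amplitudes) * db_drop_ratio(drop_db)
    above = [x for x, a in zip(positions_mm, amplitudes) if a >= threshold]
    return max(above) - min(above)

# Hypothetical echo amplitudes (% of full screen height) at 1 mm scan steps
positions = list(range(11))
echo = [2, 10, 30, 45, 80, 100, 85, 50, 28, 6, 2]

length_12db = db_drop_length(positions, echo, 12)  # threshold about 25% of peak
length_8db = db_drop_length(positions, echo, 8)    # threshold about 40% of peak
```

    A larger drop lowers the threshold and therefore gives a longer estimated length, which is why the appropriate drop value differs between flaw types.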

  1. Assessing the Item Response Theory with Covariate (IRT-C) Procedure for Ascertaining Differential Item Functioning

    Science.gov (United States)

    Tay, Louis; Vermunt, Jeroen K.; Wang, Chun

    2013-01-01

    We evaluate the item response theory with covariates (IRT-C) procedure for assessing differential item functioning (DIF) without preknowledge of anchor items (Tay, Newman, & Vermunt, 2011). This procedure begins with a fully constrained baseline model, and candidate items are tested for uniform and/or nonuniform DIF using the Wald statistic.…

  2. Clinical and 22-item Sino-Nasal Outcome Test symptom patterns in primary headache disorder patients presenting to otolaryngologists with "sinus" headaches, pain or pressure.

    Science.gov (United States)

    Lal, Devyani; Rounds, Alexis B; Rank, Matthew A; Divekar, Rohit

    2015-05-01

    The objective of this work was to study patient and 22-item Sino-Nasal Outcome Test (SNOT-22) characteristics in primary headache disorders (PHDs). Retrospective chart review of "sinus" headache/pressure/pain patients was conducted. Patients in whom rhinosinusitis had been excluded (negative endoscopy/computed tomography [CT]) and who had neurologist-confirmed PHD were studied. Patterns in symptom and SNOT-22 items were analyzed by network visualization and cluster analysis. Forty-six patients met study criteria. Forty-three (93.5%) reported "need to blow nose" and 40 (86.9%) reported postnasal drainage. Sneezing was reported by 37 (80.4%) patients, "blockage/congestion of nose" by 33 (71.8%), and "runny nose" by 32 (69.6%) patients. The median SNOT-22 score was 54 (interquartile range [IQR], 40 to 63). Past history included neurological diagnoses (60%), rhinologic disease (39%; chronic rhinosinusitis [CRS], rhinitis, recurrent acute sinusitis), asthma (28%), and allergen-sensitivity (26%). Previous sinonasal surgery had been performed in 41%. Network layout and cluster analysis identified 2 patient clusters and 2 symptom clusters. Two-thirds (31) of patients formed a tight cluster (cluster 1) linking to a symptom cluster of psychosocial items wrapped tightly with "facial pain/pressure." The remaining one-third of patients (cluster 2) linked to rhinologic symptoms loosely grouped away from "facial pressure/pain." In contrast to patients in cluster 2, patients in cluster 1 were predominantly female (p < 0.04), had significantly higher (p < 0.0001) median SNOT-22 scores (60 vs 34; IQR, 53 to 67 vs 17 to 42), were more likely to have migraine history (p = 0.058), and reported being "sad" (p < 0.0001) or "embarrassed" (p < 0.006). Prominent rhinologic symptoms can be present in PHD patients in the absence of rhinosinusitis. In particular, high symptom-burden/SNOT-22 scores and high psychosocial symptoms should raise suspicion of PHD when endoscopy and/or CT results do not correlate

  3. Dynamic tensile test of single PET textile cables

    Directory of Open Access Journals (Sweden)

    Pasco F.

    2012-08-01

    For certain applications, tyre design involves the use of textile cables as reinforcement. In use, the tyre undergoes temperature variations and dynamic loading rates. Accounting for these conditions in numerical simulations requires knowledge of the sensitivity of the mechanical behaviour to loading rate and temperature. In this paper, we developed an experimental methodology for testing textile cables up to high strain rates. The main difficulty in testing cables is optimizing how the cable is fixed to the machine. For that purpose, we adapted the progressive binding fixture already used in quasi-static tests, while taking into account the constraints inherent to high strain rate tests. First, the mass of the grips was decreased to make the force signal less sensitive to grip inertia. The method was developed on a high-speed hydraulic machine equipped with a thermal enclosure. The investigated temperatures and strain rates range from room temperature to 373 K (100 °C) and from 0.01 to 100/s, respectively. In addition, the hydraulic machine was equipped with a high-speed video camera. The obtained images were analysed by a tracking technique to measure the average strain in the cable (from 50 to 20000 f/s).

  4. Auditory Musical Reasoning Test (Teste de Raciocínio Auditivo Musical, RAu): an initial study with Item Response Theory

    Directory of Open Access Journals (Sweden)

    Fernando Pessotto

    2012-12-01

    This study sought validity evidence, based on internal structure and on criterion, for an instrument that assesses the auditory processing of musical abilities (Teste de Processamento Auditivo com Estímulos Musicais, RAu). A total of 162 people of both sexes were assessed, 56.8% of them men, aged between 15 and 59 years (M=27.5; SD=9.01). Participants were divided into musicians (N=24), amateurs (N=62), and laypersons (N=76) according to their level of musical knowledge. Full Information Factor Analysis was used to verify the dimensionality of the instrument, and Item Response Theory (IRT) was used to examine the properties of the items. The study also sought to identify the instrument's capacity to discriminate between groups of musicians and non-musicians. The data provide evidence that the items measure one main dimension (alpha=0.92) with a high capacity to differentiate the groups of professional musicians, amateurs, and laypersons, yielding a criterion validity coefficient of r=0.68. The results indicate positive evidence of reliability and validity for the RAu.

  5. Injection molded nanofluidic chips: Fabrication method and functional tests using single-molecule DNA experiments

    DEFF Research Database (Denmark)

    Utko, Pawel; Persson, Karl Fredrik; Kristensen, Anders

    2011-01-01

    We demonstrate that fabrication of nanofluidic systems can be greatly simplified by injection molding of polymers. We functionally test our devices by single-molecule DNA experiments in nanochannels.

  6. 76 FR 34801 - Petition for Modification of Single Car Air Brake Test Procedures

    Science.gov (United States)

    2011-06-14

    ... reference in 49 CFR 232.305) is intended for freight cars with automatic brake systems that are...] Petition for Modification of Single Car Air Brake Test Procedures In accordance with Part 232 of Title 49... Railroad Administration (FRA) grant a modification of the single car air brake test procedures as...

  7. A single-item self-rated health measure correlates with objective health status in the elderly: a survey in suburban Beijing

    Directory of Open Access Journals (Sweden)

    Qinqin Meng

    2014-04-01

    Introduction: Measuring the health status of the elderly remains an important topic. Self-rated health status (SRH) is considered a simple indicator of the health status of the older population, but some researchers remain skeptical about its reliability. This study aims to investigate the association between the self-rated health indicator and the health status of the elderly and to discuss its public health implications. Methods: A total of 1096 people aged 60 years or older from 1784 households in a suburban area of Beijing were interviewed using multistage stratified cluster sampling. SRH was measured by a single question: "Please choose one point on this 0-100 scale which can best represent your health today." Disease status and physical functional status were also obtained. A multiple linear regression was conducted to test the association between SRH and the individual's disease/functional status. Results: The average SRH score of the elderly was 72.49±15.64 (on a 1 to 100 scale). The SRH scores declined not only with the severity of self-reported mental/disease status but also with decreasing physical functional status. Multiple linear regression showed that, after adjustment for other variables, two-week sickness, chronic diseases, hospitalization, and ability of self-care (washing and dressing) were able to explain 35% of the variation in SRH among the elderly. Among them, disease status and self-care ability were the most powerful predictors of SRH. After adjusting for other variables, physical functional status could explain only 5% of the variation in SRH. Conclusion: SRH reflects the disease/functional health status of the elderly. It is an easy-to-implement variable that can reduce both recall bias and investigator bias, and it is therefore widely used in health surveys. It is a cost-effective means of measuring health status. However, the comparability of SRH in different populations should be studied

  8. Validation of the Ten-Item Internet Gaming Disorder Test (IGDT-10) and evaluation of the nine DSM-5 Internet Gaming Disorder criteria.

    Science.gov (United States)

    Király, Orsolya; Sleczka, Pawel; Pontes, Halley M; Urbán, Róbert; Griffiths, Mark D; Demetrovics, Zsolt

    2017-01-01

    The inclusion of Internet Gaming Disorder (IGD) in the DSM-5 (Section 3) has given rise to much scholarly debate regarding the proposed criteria and their operationalization. The present study's aim was threefold: to (i) develop and validate a brief psychometric instrument (Ten-Item Internet Gaming Disorder Test; IGDT-10) to assess IGD using definitions suggested in DSM-5, (ii) contribute to ongoing debate regarding the usefulness and validity of each of the nine IGD criteria (using Item Response Theory [IRT]), and (iii) investigate the cut-off threshold suggested in the DSM-5. An online gamer sample of 4887 gamers (age range 14-64 years, mean age 22.2 years [SD=6.4], 92.5% male) was collected through Facebook and a gaming-related website with the cooperation of a popular Hungarian gaming magazine. A shopping voucher of approx. 300 Euros was drawn among participants to boost participation (i.e., lottery incentive). Confirmatory factor analysis and a structural regression model were used to test the psychometric properties of the IGDT-10, and IRT analysis was conducted to test the measurement performance of the nine IGD criteria. Finally, Latent Class Analysis along with sensitivity and specificity analysis were used to investigate the cut-off threshold proposed in the DSM-5. Analysis supported IGDT-10's validity, reliability, and suitability to be used in future research. Findings of the IRT analysis suggest IGD is manifested through a different set of symptoms depending on the level of severity of the disorder. More specifically, "continuation", "preoccupation", "negative consequences" and "escape" were associated with lower severity of IGD, while "tolerance", "loss of control", "giving up other activities" and "deception" criteria were associated with more severe levels. "Preoccupation" and "escape" provided very little information to the estimation of IGD severity. Finally, the DSM-5 suggested threshold appeared to be supported by our statistical analyses.
IGDT-10 is

  9. LHC BLM Single Channel Connectivity Test using the Standard Installation

    CERN Document Server

    Emery, J; Effinger, E; Ferioli, G; Zamantzas, C; Ikeda, H; Verhagen, E

    2009-01-01

    For the LHC Beam Loss Measurement system (BLM), the high voltage supply of the ionisation chambers and the secondary emission detectors is used to test their connectivity. A harmonic modulation of 0.03 Hz results in a current signal of about 100pA measured by the beam loss acquisition electronics. The signal is analyzed and the measured amplitude and phase are compared with individual channel limits for the 4000 channels. It is foreseen to execute an automatic procedure for all channels every 12 hours which takes about 20 minutes. The paper will present the design of the system, the circuit simulations, measurements of systematic dependencies of different channels and the reproducibility of the amplitude and phase measurements.
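
    Extracting the amplitude and phase of a response at a known modulation frequency, as described above, can be sketched with a simple lock-in style correlation. This is an illustration only, not the BLM implementation: the sample rate, record length, DC offset, phase, and channel limits below are assumed values, and the function names are hypothetical.

```python
import math

def lock_in(samples, sample_rate_hz, freq_hz):
    """Estimate amplitude and phase of a sinusoid at a known frequency.

    Assumes the record spans a whole number of modulation periods, so any
    DC offset averages out of the correlation sums.
    """
    n = len(samples)
    w = 2 * math.pi * freq_hz / sample_rate_hz
    i_comp = 2 / n * sum(s * math.cos(w * k) for k, s in enumerate(samples))
    q_comp = 2 / n * sum(s * math.sin(w * k) for k, s in enumerate(samples))
    # For s = A*cos(w*k + phi): i_comp = A*cos(phi), q_comp = -A*sin(phi)
    return math.hypot(i_comp, q_comp), math.atan2(-q_comp, i_comp)

FREQ_HZ = 0.03         # modulation frequency from the abstract
SAMPLE_RATE_HZ = 10.0  # assumed sample rate
N_SAMPLES = 1000       # 100 s, exactly three modulation periods

# Synthetic 100 pA response with an assumed 0.5 rad phase and 5 pA offset
signal = [
    5e-12 + 100e-12 * math.cos(2 * math.pi * FREQ_HZ * k / SAMPLE_RATE_HZ + 0.5)
    for k in range(N_SAMPLES)
]
amplitude, phase = lock_in(signal, SAMPLE_RATE_HZ, FREQ_HZ)

# Hypothetical per-channel limits for the connectivity decision
channel_ok = 80e-12 <= amplitude <= 120e-12 and 0.3 <= phase <= 0.7
```

    Correlating over whole periods keeps the estimate insensitive to a constant offset; a real system would also have to handle noise and drift, which this sketch ignores.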

  10. Construct Validity of the Item-Specific Deficit Approach to the California Verbal Learning Test (2nd ed) in HIV Infection

    Science.gov (United States)

    Cattie, Jordan E.; Woods, Steven Paul; Arce, Miguel; Weber, Erica; Delis, Dean C.; Grant, Igor

    2012-01-01

    Impairment in list learning and recall is prevalent in HIV-infected individuals and is strongly predictive of everyday functioning outcomes. Consistent with its predominant frontostriatal pathology, the memory profile associated with HIV infection is best characterized as a mixed encoding/retrieval profile. The Item-Specific Deficit Approach (ISDA) was developed by Wright et al. (2009) to elicit indices of Encoding, Consolidation, and Retrieval from the well-validated California Verbal Learning Test (CVLT; Delis et al., 1987; 2000). The current study evaluated construct validity of the ISDA for the CVLT-II in 40 persons with HIV-associated neurocognitive disorders (HIV+/HAND+), 103 HIV-infected persons without HAND (HIV+/HAND−), and 43 seronegative comparison subjects (HIV−). Results provided mixed support for the construct validity of ISDA indices. HIV+/HAND+ individuals performed significantly more poorly than persons in the HIV+/HAND− and HIV− groups on ISDA Encoding, Consolidation, and Retrieval deficit indices, which demonstrated adequate classification accuracy for diagnosing HIV+/HAND+ participants and evidence of both convergent (e.g., episodic memory) and divergent (e.g., motor skills) correlations in the HIV+/HAND+ participants. However, highly intercorrelated ISDA indices and traditional CVLT-II measures showed comparable between-groups effect sizes, classification accuracy, and correlations to other memory tests, thereby raising uncertainties about the incremental value of the ISDA approach in clinical neuroAIDS research. PMID:22394206

  11. Test procedures and instructions for single shell tank saltcake cesium removal with crystalline silicotitanate

    Energy Technology Data Exchange (ETDEWEB)

    Duncan, J.B.

    1997-01-07

    This document provides specific test procedures and instructions to implement the test plan for the preparation and conduct of a cesium removal test, using Hanford Single Shell Tank Saltcake from tanks 241-BY-110, 241-U-108, 241-U-109, 241-A-101, and 241-S-102, in a bench-scale column. The cesium sorbent to be tested is crystalline silicotitanate. The test plan for which this provides instructions is WHC-SD-RE-TP-024, Hanford Single Shell Tank Saltcake Cesium Removal Test Plan.

  12. How Do They (Participants) Understand Our (Researchers) Intentions? A Qualitative Test of the Curvilinear Assumptions of the Adaptability Items of the FACES III.

    Science.gov (United States)

    Ben-David, Amith; Sprenkle, Douglas H.

    1993-01-01

    Eight individuals from larger study of lay persons who interpreted Family Adaptability and Cohesion Evaluation Scales (FACES-III) adaptability items were interviewed in-depth concerning their understanding of adaptability items of regular FACES III and two alternate versions. Qualitative results suggest that participants may not be understanding…

  13. NEPP Update of Independent Single Event Upset Field Programmable Gate Array Testing

    Science.gov (United States)

    Berg, Melanie; Label, Kenneth; Campola, Michael; Pellish, Jonathan

    2017-01-01

    This presentation provides a NASA Electronic Parts and Packaging (NEPP) Program update of independent Single Event Upset (SEU) Field Programmable Gate Array (FPGA) testing including FPGA test guidelines, Microsemi RTG4 heavy-ion results, Xilinx Kintex-UltraScale heavy-ion results, Xilinx UltraScale+ single event effect (SEE) test plans, development of a new methodology for characterizing SEU system response, and NEPP involvement with FPGA security and trust.

  14. A Balance Sheet for Educational Item Banking.

    Science.gov (United States)

    Hiscox, Michael D.

    Educational item banking presents observers with a considerable paradox. The development of test items from scratch is viewed as wasteful, a luxury in times of declining resources. On the other hand, item banking has failed to become a mature technology despite large amounts of money and the efforts of talented professionals. The question of which…

  15. The effects of isolated single umbilical artery on first and second trimester aneuploidy screening test parameters.

    Science.gov (United States)

    Tulek, Firat; Kahraman, Alper; Taskin, Salih; Ozkavukcu, Esra; Soylemez, Feride

    2015-04-01

    Reliability of first and second trimester screening tests largely depends on accurate estimation of maternal serum marker values. Reduced reliability could lead to redundant invasive tests or misdiagnosis. Adjustments of serum marker values for confounding factors like insulin-dependent diabetes, maternal weight or maternal rhesus status are essential. We aimed to investigate whether an isolated single umbilical artery alters first and second trimester test parameters. Routine detailed obstetric ultrasonographies were retrospectively screened for this study. Among spontaneously conceived singleton pregnancies, women who were found to have a single umbilical artery without any additional structural anomalies or aneuploidies were selected. First and second trimester screening test results were accessible for 98 and 102 of the cases with isolated single umbilical artery, respectively. Among first trimester screening test parameters, PAPP-A (pregnancy-associated plasma protein A) MoMs were found to be significantly higher in the isolated single umbilical artery group. AFP MoMs were found to be significantly elevated in the isolated single umbilical artery group in second trimester quadruple tests. The existence of a single umbilical artery could alter the estimation of MoM values of maternal serum markers. Reliability of prenatal screening tests could be improved by adjusting these parameters in accordance with isolated single umbilical artery.
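
    The MoM (multiple of the median) values discussed above are a measured marker concentration divided by the reference median for the same gestational age, optionally divided again by a correction factor for a known confounder. The sketch below shows only that arithmetic; the concentrations and the correction factor are invented placeholders, not values from the study.

```python
def multiple_of_median(measured, reference_median):
    """MoM: measured marker value divided by the gestational-age-specific median."""
    if reference_median <= 0:
        raise ValueError("reference median must be positive")
    return measured / reference_median

def adjust_mom(mom, correction_factor):
    """Divide a MoM by a confounder-specific correction factor."""
    return mom / correction_factor

# Placeholder values for illustration only
raw_mom = multiple_of_median(45.0, 30.0)
adjusted_mom = adjust_mom(raw_mom, 1.2)  # 1.2 is a hypothetical factor
```

    If a group systematically shows elevated MoMs, dividing by the group's typical elevation brings its distribution back toward 1.0 before risk is calculated.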

  16. Definition of Capabilities Needed for a Single Event Effects Test Facility

    Energy Technology Data Exchange (ETDEWEB)

    Riemer, Bernie [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Spallation Neutron Source (SNS); Gallmeier, Franz X. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Spallation Neutron Source (SNS)

    2014-12-01

    The Federal Aviation Administration (FAA) is contemplating new regulations mandating testing of the vulnerability of flight-critical avionics to single event effects (SEE). A limited number of high-energy neutron test facilities currently serve the SEE industrial and institutional research community. The FAA recognizes that existing facilities have insufficient test capacity to meet new demand from such mandates; it desires more flexible irradiation capabilities to test complete, large systems and would like capabilities to address greater concerns for thermal neutrons. For this reason, the FAA funded this study by Spallation Neutron Source (SNS) staff with the ultimate aim of developing options for SEE test facilities using high-energy neutrons at the SNS complex. After an investigation of current SEE test practices and assessment of future testing requirements, three concepts were identified covering a range of test functionality, neutron flux levels, and fidelity to the atmospheric neutron spectrum. The costs and times required to complete each facility were also estimated. SEE testing is generally performed by accelerating the event rate to a point where the effects are still dominated by single events and double event causes of failures are negligible. In practice, acceleration factors as high as 10^6 are applicable for component testing, whereas for systems testing acceleration factors of 10^4 seem to be the upper limit. It is strongly desirable that the irradiation facility be tunable over a large range of high-energy neutron fluxes, from 10^2 to 10^4 n/cm²/s for systems testing and from 10^4 to 10^7 n/cm²/s for components testing. The most capable, most flexible, and highest-test-capacity option is a new stand-alone target station named the High-Energy neutron Test Station (HETS). It is also the most expensive option, with a cost to complete of approximately $100 million. Dual test enclosures would
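
    The acceleration factor mentioned in this abstract can be read as the ratio of the test-beam flux to the flux of the reference environment, so that one hour of beam time stands in for that many hours of field exposure. The sketch below shows the arithmetic only; the reference flux is a placeholder assumption, not a value from the report.

```python
def acceleration_factor(test_flux, reference_flux):
    """Ratio of test-beam flux to reference-environment flux (same units)."""
    return test_flux / reference_flux

def equivalent_field_hours(test_hours, accel):
    """Field exposure time represented by test_hours of accelerated testing."""
    return test_hours * accel

REFERENCE_FLUX = 1.0  # n/cm^2/s, placeholder assumption for illustration

accel = acceleration_factor(1e4, REFERENCE_FLUX)
field_hours = equivalent_field_hours(2.0, accel)
```

    The limit on usable acceleration comes from the requirement quoted above that failures remain dominated by single events, which this simple ratio does not model.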

  17. Evaluation procedures for single axis sinusoidal test to design spectrum requirements

    International Nuclear Information System (INIS)

    Sun, P.C.; Javid, A.

    1983-01-01

    Two simple procedures are provided in this paper for the purpose of evaluating the adequacy of a single frequency single axis test. For the purpose of evaluating the adequacy of single frequency test to meet broad-band response spectrum requirements, the proposed procedure is based on the equivalence of maximum response of a dynamic system when it is subjected to either type of design input. The required information used for the evaluation is usually recorded and available in the test report. This procedure is applicable to systems with or without closely-spaced modes. When evaluating against broad-band design spectra and multi-axes requirements, an empirical procedure is proposed and it has been found conservative. These two proposed procedures provide a quick assessment on the adequacy of a single frequency test performed earlier. The use of these procedures may eliminate the need of expensive and time consuming equipment re-testing. (orig./HP)

  18. A strategy for optimizing item-pool management

    NARCIS (Netherlands)

    Ariel, A.; van der Linden, Willem J.; Veldkamp, Bernard P.

    2006-01-01

    Item-pool management requires a balancing act between the input of new items into the pool and the output of tests assembled from it. A strategy for optimizing item-pool management is presented that is based on the idea of a periodic update of an optimal blueprint for the item pool to tune item

  19. Results of single borehole hydraulic testing in the Mizunami Underground Research Laboratory project. Phase 2

    International Nuclear Information System (INIS)

    Daimaru, Shuji; Takeuchi, Ryuji; Onoe, Hironori; Saegusa, Hiromitsu

    2012-09-01

    This report summarizes the results of the single borehole hydraulic tests of 79 sections conducted as part of the Construction phase (Phase 2) in the Mizunami Underground Research Laboratory (MIU) Project. The details of each test (test interval depth, geology, etc.) as well as the interpreted hydraulic parameters and analytical method used are presented in this report. (author)

  20. NASA Electronic Parts and Packaging Field Programmable Gate Array Single Event Effects Test Guideline Update

    Science.gov (United States)

    Berg, Melanie D.; LaBel, Kenneth A.

    2018-01-01

    The following are updated or new subjects added to the FPGA SEE Test Guidelines manual: academic versus mission specific device evaluation, single event latch-up (SEL) test and analysis, SEE response visibility enhancement during radiation testing, mitigation evaluation (embedded and user-implemented), unreliable design and its effects on SEE data, testing flushable architectures versus non-flushable architectures, intellectual property core (IP Core) test and evaluation (addresses embedded and user-inserted), heavy-ion energy and linear energy transfer (LET) selection, proton versus heavy-ion testing, fault injection, mean fluence to failure analysis, and mission specific system-level single event upset (SEU) response prediction. Most sections within the guidelines manual provide information regarding best practices for test structure and test system development. The scope of this manual addresses academic versus mission specific device evaluation and visibility enhancement in IP Core testing.

  1. Testing the Discriminant and Convergent Validity of the World Health Organization Six-Item Adult ADHD Self-Report Scale Screener Using the Stockholm Public Health Cohort.

    Science.gov (United States)

    Lundin, Andreas; Kosidou, Kyriaki; Dalman, Christina

    2017-10-01

    The World Health Organization (WHO) Adult ADHD Self-Report Scale (ASRS) is intended to measure population prevalence of ADHD. The short version ASRS-6 has not yet been validated in a population setting. Our aim was to examine the validity of the ASRS-6 in a general population. We used the Stockholm Public Health Cohort 2014. The convergent validity was assessed using item response theory (IRT). The discriminant validity was assessed by examining the correlation between the ASRS and known correlates. The ASRS-6 was unidimensional, albeit with hyperactivity and impulsivity items fitting less well. IRT analysis showed that the item difficulty ranged from easy to hard and that the four items on inattention had good or very good discriminatory ability. Correlates were all in the expected direction. The ASRS-6 has adequate validity in the general population but reflects the duality of ADHD having both inattention and hyperactivity/impulsivity as sufficient and non-necessary criteria.
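
The terms "item difficulty" and "discriminatory ability" in the abstract above can be illustrated with a two-parameter logistic (2PL) item response function. This is only a minimal sketch: the abstract does not say which IRT model was fitted, and the function name and parameter values below are hypothetical.

```python
import math


def two_pl_probability(theta, discrimination, difficulty):
    """Probability of endorsing an item under a 2PL model (illustrative only).

    theta          -- latent trait level of the respondent
    discrimination -- slope of the item response curve at its midpoint
    difficulty     -- trait level at which endorsement probability is 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical items: same difficulty, different discrimination.
for a in (0.5, 2.0):
    low = two_pl_probability(-1.0, a, 0.0)
    high = two_pl_probability(1.0, a, 0.0)
    print(f"discrimination={a}: P(theta=-1)={low:.3f}, P(theta=+1)={high:.3f}")
```

A more discriminating item separates respondents just below and just above its difficulty more sharply, which is the sense in which an item has "good discriminatory ability".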

  2. Forecast of thermal-hydrological conditions and air injection test results of the single heater test at Yucca Mountain

    International Nuclear Information System (INIS)

    Birkholzer, J.T.; Tsang, Y.W.

    1996-12-01

    The heater in the Single Heater Test (SHT) in alcove 5 of the Exploratory Studies Facility (ESF) was turned on August 26, 1996. A large number of sensors are installed in the various instrumented boreholes to monitor the coupled thermal-hydrological-mechanical-chemical responses of the rock mass to the heat generated in the single heater. In this report the authors present the results of the modeling of both the heating and cooling phases of the Single Heater Test (SHT), with focus on the thermal-hydrological aspect of the coupled processes. Also in this report, the authors present simulations of air injection tests that will be performed at different stages of the heating and cooling phase of the SHT.

  3. Intersection tests for single marker QTL analysis can be more powerful than two marker QTL analysis

    Directory of Open Access Journals (Sweden)

    Doerge RW

    2003-06-01

    Background: It has been reported in the quantitative trait locus (QTL) literature that when testing for QTL location and effect, the statistical power supporting methodologies based on two markers and their estimated genetic map is higher than for the genetic map independent methodologies known as single marker analyses. Close examination of these reports reveals that the two marker approaches are more powerful than single marker analyses only in certain cases. Simulation studies are a commonly used tool to determine the behavior of test statistics under known conditions. We conducted a simulation study to assess the general behavior of an intersection test and a two marker test under a variety of conditions. The study was designed to reveal whether two marker tests are always more powerful than intersection tests, or whether there are cases when an intersection test may outperform the two marker approach. We present a reanalysis of a data set from a QTL study of ovariole number in Drosophila melanogaster. Results: Our simulation study results show that there are situations where the single marker intersection test equals or outperforms the two marker test. The intersection test and the two marker test identify overlapping regions in the reanalysis of the Drosophila melanogaster data. The region identified is consistent with a regression based interval mapping analysis. Conclusion: We find that the intersection test is appropriate for analysis of QTL data. This approach has the advantage of simplicity and for certain situations supplies equivalent or more powerful results than a comparable two marker test.

  4. Understanding the effect of single-fracture heterogeneity from single-well injection-withdrawal (SWIW) tests

    Science.gov (United States)

    Larsson, Martin; Doughty, Christine; Tsang, Chin-Fu; Niemi, Auli

    2013-12-01

    The single-well injection-withdrawal (SWIW) tracer test is a method used to estimate the tracer retardation properties of a fracture or fracture zone. The effects of single-fracture aperture heterogeneity on SWIW-test tracer breakthrough curves are examined by numerical modelling. The effects of the matrix diffusion and sorption are accounted for by using a particle tracking method through the addition of a time delay added to the advective transport time. For a given diffusion and sorption property (Pm) value and for a heterogeneous fracture, the peak concentration is larger compared to a homogeneous fracture. The cumulative breakthrough curve for a heterogeneous fracture is similar to that for a homogeneous fracture and a less sorptive/diffusive tracer. It is demonstrated that the fracture area that meets the flowing water, the specific flow-wetted surface (sFWS) of the fracture, can be determined by matching the observed breakthrough curve for a heterogeneous fracture to that for a homogeneous fracture with an equivalent property parameter. SWIW tests are also simulated with a regional pressure gradient present. The results point to the possibility of distinguishing the effect of the regional pressure gradient from that of diffusion through the use of multiple tracers with different Pm values.

  5. The Influence of Age on Interaction between Breath-Holding Test and Single-Breath Carbon Dioxide Test

    Directory of Open Access Journals (Sweden)

    Nikita Trembach

    2017-01-01

    Introduction. The aim of the study was to compare the breath-holding test and single-breath carbon dioxide test in evaluation of the peripheral chemoreflex sensitivity to carbon dioxide in healthy subjects of different age. Methods. The study involved 47 healthy volunteers between ages of 25 and 85 years. All participants were divided into 4 groups according to age: 25 to 44 years (n=14), 45 to 60 years (n=13), 60 to 75 years (n=12), and older than 75 years (n=8). Breath-holding test was performed in the morning before breakfast. The single-breath carbon dioxide (SB-CO2) test was performed the following day. Results. No correlation was found between age and duration of breath-holding (r=0.13) and between age and peripheral chemoreflex sensitivity to CO2 (r=0.07). In all age groups there were no significant differences in the mean values from the breath-holding test and peripheral chemoreflex sensitivity tests. In all groups there was a strong significant inverse correlation between breath-holding test and SB-CO2 test. Conclusion. A breath-holding test reflects the sensitivity of the peripheral chemoreflex to carbon dioxide in healthy elderly humans. Increasing age alone does not alter the peripheral ventilatory response to hypercapnia.

  6. The Influence of Age on Interaction between Breath-Holding Test and Single-Breath Carbon Dioxide Test.

    Science.gov (United States)

    Trembach, Nikita; Zabolotskikh, Igor

    2017-01-01

    Introduction. The aim of the study was to compare the breath-holding test and single-breath carbon dioxide test in evaluation of the peripheral chemoreflex sensitivity to carbon dioxide in healthy subjects of different age. Methods. The study involved 47 healthy volunteers between ages of 25 and 85 years. All participants were divided into 4 groups according to age: 25 to 44 years (n = 14), 45 to 60 years (n = 13), 60 to 75 years (n = 12), and older than 75 years (n = 8). Breath-holding test was performed in the morning before breakfast. The single-breath carbon dioxide (SB-CO2) test was performed the following day. Results. No correlation was found between age and duration of breath-holding (r = 0.13) and between age and peripheral chemoreflex sensitivity to CO2 (r = 0.07). In all age groups there were no significant differences in the mean values from the breath-holding test and peripheral chemoreflex sensitivity tests. In all groups there was a strong significant inverse correlation between breath-holding test and SB-CO2 test. Conclusion. A breath-holding test reflects the sensitivity of the peripheral chemoreflex to carbon dioxide in healthy elderly humans. Increasing age alone does not alter the peripheral ventilatory response to hypercapnia.

  7. Assessing the equivalence of web-based and paper-and-pencil questionnaires using differential item and test functioning (DIF and DTF) analysis : A case of the Four-Dimensional Symptom Questionnaire (4DSQ)

    NARCIS (Netherlands)

    Terluin, B.; Brouwers, E.P.M.; Marchand, M.A.G.; De Vet, H.C.

    2018-01-01

    Purpose: Many paper-and-pencil (P&P) questionnaires have been migrated to electronic platforms. Differential item and test functioning (DIF and DTF) analysis constitutes a superior research design to assess measurement equivalence across modes of administration. The purpose of this study was to

  8. Evidence of the Generalization and Construct Representation Inferences for the "GRE"® Revised General Test Sentence Equivalence Item Type. ETS GRE® Board Research Report. ETS GRE®-17-02. ETS Research Report. RR-17-05

    Science.gov (United States)

    Bejar, Isaac I.; Deane, Paul D.; Flor, Michael; Chen, Jing

    2017-01-01

    The report is the first systematic evaluation of the sentence equivalence item type introduced by the "GRE"® revised General Test. We adopt a validity framework to guide our investigation based on Kane's approach to validation whereby a hierarchy of inferences that should be documented to support score meaning and interpretation is…

  9. Analytical solutions for efficient interpretation of single-well push-pull tracer tests

    Science.gov (United States)

    Single-well push-pull tracer tests have been used to characterize the extent, fate, and transport of subsurface contamination. Analytical solutions provide one alternative for interpreting test results. In this work, an exact analytical solution to two-dimensional equations descr...

  10. Fibre reinforced concrete in flexure and single fibre pull-out test: a correlation

    Science.gov (United States)

    Manca, M.; Ciancio, D.; Dight, P.

    2017-09-01

    The aim of the present work is to assess whether a single fibre pull-out test can be related to the behaviour of multiple fibres in fibre reinforced concrete under bending condition. A simple model based on the stress block theory is described and compared with experimental results on three point bending tests with aligned fibres.

  11. Content Coverage of Single-Word Tests Used to Assess Common Phonological Error Patterns

    Science.gov (United States)

    Kirk, Cecilia; Vigeland, Laura

    2015-01-01

    Purpose: This review evaluated whether 9 single-word tests of phonological error patterns provide adequate content coverage to accurately identify error patterns that are active in a child's speech. Method: Tests in the current study were considered to display sufficient opportunities to assess common phonological error patterns if they…

  12. A test of the factor structure equivalence of the 50-item IPIP Five-factor model measure across gender and ethnic groups.

    Science.gov (United States)

    Ehrhart, Karen Holcombe; Roesch, Scott C; Ehrhart, Mark G; Kilian, Britta

    2008-09-01

    Personality is frequently assessed in research and applied settings, in part due to evidence that scores on measures of the Five-factor model (FFM) of personality show predictive validity for a variety of outcomes. Although researchers are increasingly using the International Personality Item Pool (IPIP; Goldberg, 1999; International Personality Item Pool, 2007b) FFM measures, investigations of the psychometric properties of these measures are unfortunately sparse. The purpose of this study was to examine the factor structure equivalence of the 50-item IPIP FFM measure across gender and ethnic groups (i.e., Whites, Latinos, Asian Americans) using multigroup confirmatory factor analysis. Results from a sample of 1,727 college students generally support the invariance of the factor structure across groups, although there was some evidence of differences across gender and ethnic groups for model parameters. We discuss these findings and their implications.

  13. A single standard for in-place testing of DOE HEPA filters - not

    Energy Technology Data Exchange (ETDEWEB)

    Mokler, B.V. [Los Alamos National Laboratory, NM (United States)

    1995-02-01

    This article is a review of arguments against the use of a single standard for in-place testing of DOE HEPA filters. The author feels that the term 'standard' entails mandatory compliance. Additionally, the author feels that the variety of DOE HEPA systems requiring in-place testing is such that the guidance for testing must be written in a permissive fashion, allowing options and alternatives. With this in mind, it is not possible to write a single document entailing mandatory compliance for all DOE facilities.

  14. A single standard for in-place testing of DOE HEPA filters - not

    International Nuclear Information System (INIS)

    Mokler, B.V.

    1995-01-01

    This article is a review of arguments against the use of a single standard for in-place testing of DOE HEPA filters. The author feels that the term 'standard' entails mandatory compliance. Additionally, the author feels that the variety of DOE HEPA systems requiring in-place testing is such that the guidance for testing must be written in a permissive fashion, allowing options and alternatives. With this in mind, it is not possible to write a single document entailing mandatory compliance for all DOE facilities

  15. Linking Existing Instruments to Develop an Activity of Daily Living Item Bank.

    Science.gov (United States)

    Li, Chih-Ying; Romero, Sergio; Bonilha, Heather S; Simpson, Kit N; Simpson, Annie N; Hong, Ickpyo; Velozo, Craig A

    2018-03-01

    This study examined dimensionality and item-level psychometric properties of an item bank measuring activities of daily living (ADL) across inpatient rehabilitation facilities and community living centers. Common person equating method was used in the retrospective veterans data set. This study examined dimensionality, model fit, local independence, and monotonicity using factor analyses and fit statistics, principal component analysis (PCA), and differential item functioning (DIF) using Rasch analysis. Following the elimination of invalid data, 371 veterans who completed both the Functional Independence Measure (FIM) and minimum data set (MDS) within 6 days were retained. The FIM-MDS item bank demonstrated good internal consistency (Cronbach's α = .98) and met three rating scale diagnostic criteria and three of the four model fit statistics (comparative fit index/Tucker-Lewis index = 0.98, root mean square error of approximation = 0.14, and standardized root mean residual = 0.07). PCA of Rasch residuals showed the item bank explained 94.2% variance. The item bank covered the range of θ from -1.50 to 1.26 (item), -3.57 to 4.21 (person) with person strata of 6.3. The findings indicated the ADL physical function item bank constructed from FIM and MDS measured a single latent trait with overall acceptable item-level psychometric properties, suggesting that it is an appropriate source for developing efficient test forms such as short forms and computerized adaptive tests.

  16. Reliability Assessment of a Single-Shot System by Use of Screen Test Results

    Science.gov (United States)

    2018-02-01

    unlimited. NUWC Keyport #17-002. Reliability Assessment of a Single-Shot System by Use of Screen Test Results Abstract: Field reliability prediction...approach described here assumes that the defect density during testing takes the form of an exponential decay, although other mathematical functions can...be substituted for the exponential. In order to apply the decay rate function to a discrete pass/fail test scheme, the approach provides for

  17. The Influence of Age on Interaction between Breath-Holding Test and Single-Breath Carbon Dioxide Test

    OpenAIRE

    Trembach, Nikita; Zabolotskikh, Igor

    2017-01-01

    Introduction. The aim of the study was to compare the breath-holding test and single-breath carbon dioxide test in evaluation of the peripheral chemoreflex sensitivity to carbon dioxide in healthy subjects of different age. Methods. The study involved 47 healthy volunteers between ages of 25 and 85 years. All participants were divided into 4 groups according to age: 25 to 44 years (n = 14), 45 to 60 years (n = 13), 60 to 75 years (n = 12), and older than 75 years (n = 8). Breath-holding test ...

  18. Differential Item Functioning Assessment in Cognitive Diagnostic Modeling: Applying the Wald Test to Investigate DIF in the Generalized DINA Model Framework

    Science.gov (United States)

    Hou, Likun

    2013-01-01

    Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing richer diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this dissertation, the model-based DIF detection method, Wald-CDM procedure is…

  19. Homogeneity tests for variances and mean test under heterogeneity conditions in a single way ANOVA method

    International Nuclear Information System (INIS)

    Morales P, J.R.; Avila P, P.

    1996-01-01

    If we consider the maximum permissible levels given for oysters, collecting oysters should be forbidden at the four stations of the El Chijol Channel (Veracruz, Mexico), as well as along the channel itself, because the metal concentrations studied exceed these limits. In this case the application of Welch tests was not necessary. For the water hyacinth the treatment means were unequal for Fe, Cu, Ni, and Zn. This case is more illustrative, because the conclusion was reached by applying the Welch tests to treatments with heterogeneous variances. (Author)
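
The abstract above refers to Welch tests for comparing treatment means when variances are heterogeneous. The following is a minimal sketch of the standard Welch one-way ANOVA statistic, not a reconstruction of the authors' calculation; the function name and the sample data are hypothetical, and each group is assumed to have at least two observations and a nonzero variance.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's one-way ANOVA for groups with unequal variances.

    Returns (F statistic, numerator df, denominator df).
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances (n - 1 divisor)

    # Each group is weighted by the precision of its mean.
    weights = [n / v for n, v in zip(sizes, variances)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(w * (m - weighted_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_weight) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))

    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical concentrations for three treatments with very different spreads.
treatments = [[10.1, 10.3, 9.9, 10.2], [12.0, 15.5, 9.0, 14.1], [20.2, 20.0, 20.5, 19.9]]
f_stat, df1, df2 = welch_anova(treatments)
print(f"F = {f_stat:.2f} on ({df1}, {df2:.2f}) degrees of freedom")
```

The statistic is compared against an F distribution with the returned degrees of freedom; the p-value lookup is left out here to keep the sketch free of third-party dependencies.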

  20. Single Event Effects Test Facility Options at the Oak Ridge National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Riemer, Bernie [ORNL; Gallmeier, Franz X [ORNL; Dominik, Laura J [ORNL

    2015-01-01

    Increasing use of microelectronics of ever diminishing feature size in avionics systems has led to a growing Single Event Effects (SEE) susceptibility arising from the highly ionizing interactions of cosmic rays and solar particles. Single event effects caused by atmospheric radiation have been recognized in recent years as a design issue for avionics equipment and systems. To ensure a system meets all its safety and reliability requirements, SEE induced upsets and potential system failures need to be considered, including testing of the components and systems in a neutron beam. Testing of integrated circuits (ICs) and systems for use in radiation environments requires the utilization of highly advanced laboratory facilities that can run evaluations on microcircuits for the effects of radiation. This paper provides a background of the atmospheric radiation phenomenon and the resulting single event effects, including single event upset (SEU) and latch up conditions. A study investigating requirements for future single event effect irradiation test facilities and developing options at the Spallation Neutron Source (SNS) is summarized. The relatively new SNS with its 1.0 GeV proton beam, typical operation of 5000 h per year, expertise in spallation neutron sources, user program infrastructure, and decades of useful life ahead is well suited for hosting a world-class SEE test facility in North America. Emphasis was put on testing of large avionics systems while still providing tunable high flux irradiation conditions for component tests. Makers of ground-based systems would also be served well by these facilities. Three options are described; the most capable, flexible, and highest-test-capacity option is a new stand-alone target station using about one kW of proton beam power on a gas-cooled tungsten target, with dual test enclosures. Less expensive options are also described.

  1. Ambiguity in measuring matrix diffusion with single-well injection/recovery tracer tests

    Science.gov (United States)

    Lessoff, S.C.; Konikow, Leonard F.

    1997-01-01

    Single-well injection/recovery tracer tests are considered for use in characterizing and quantifying matrix diffusion in dual-porosity aquifers. Numerical modeling indicates that neither regional drift in homogeneous aquifers, nor heterogeneity in aquifers having no regional drift, nor hydrodynamic dispersion significantly affects these tests. However, when drift is coupled simultaneously with heterogeneity, they can have significant confounding effects on tracer return. This synergistic effect of drift and heterogeneity may help explain irreversible flow and inconsistent results sometimes encountered in previous single-well injection/recovery tracer tests. Numerical results indicate that in a hypothetical single-well injection/recovery tracer test designed to demonstrate and measure dual-porosity characteristics in a fractured dolomite, the simultaneous effects of drift and heterogeneity sometimes yield responses similar to those anticipated in a homogeneous dual-porosity formation. In these cases, tracer recovery could provide a false indication of the occurrence of matrix diffusion. Shortening the shut-in period between injection and recovery periods may make the test less sensitive to drift. Using multiple tracers having different diffusion characteristics, multiple tests having different pumping schedules, and testing the formation at more than one location would decrease the ambiguity in the interpretation of test data.

  2. Comparative Performance of Four Single Extreme Outlier Discordancy Tests from Monte Carlo Simulations

    Directory of Open Access Journals (Sweden)

    Surendra P. Verma

    2014-01-01

    Using highly precise and accurate Monte Carlo simulations of 20,000,000 replications and 102 independent simulation experiments with extremely low simulation errors and total uncertainties, we evaluated the performance of four single outlier discordancy tests (Grubbs test N2, Dixon test N8, skewness test N14, and kurtosis test N15) for normal samples of sizes 5 to 20. Statistical contaminations of a single observation resulting from parameters called δ from ±0.1 up to ±20 for modeling the slippage of central tendency or ε from ±1.1 up to ±200 for slippage of dispersion, as well as no contamination (δ=0 and ε=±1), were simulated. Because of the use of precise and accurate random and normally distributed simulated data, very large replications, and a large number of independent experiments, this paper presents a novel approach for precise and accurate estimations of power functions of four popular discordancy tests and, therefore, should not be considered as a simple simulation exercise unrelated to probability and statistics. From both criteria of the Power of Test proposed by Hayes and Kinsella and the Test Performance Criterion of Barnett and Lewis, Dixon test N8 performs less well than the other three tests. The overall performance of these four tests could be summarized as N2≅N15>N14>N8.
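
As a rough illustration of what a single-outlier discordancy statistic of the Grubbs type measures, the sketch below computes the largest absolute deviation from the sample mean in units of the sample standard deviation. It assumes the two-sided (either-tail) form; the abstract does not spell out the definition of N2, the function name and data are hypothetical, and the critical values needed for an actual test are not included.

```python
from statistics import mean, stdev


def extreme_deviation_statistic(sample):
    """Largest absolute deviation from the mean, scaled by the sample std. dev."""
    center = mean(sample)
    spread = stdev(sample)  # sample standard deviation (n - 1 divisor)
    return max(abs(x - center) for x in sample) / spread


# Hypothetical sample in which the last value looks suspicious.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(f"statistic = {extreme_deviation_statistic(sample):.4f}")
```

In a simulation study of the kind described, a statistic like this would be evaluated on many contaminated samples and compared against tabulated critical values to estimate the power of the test.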

  3. Testing of the large bore single aperture 1-meter superconducting dipoles made with phenolic inserts

    CERN Document Server

    Boschmann, H; Dubbeldam, R L; Kirby, G A; Lucas, J; Ostojic, R; Russenschuck, Stephan; Siemko, A; Taylor, T M; Vanenkov, I; Weterings, W

    1998-01-01

    Two identical single aperture 1-metre superconducting dipoles have been built in collaboration with HMA Power Systems and tested at CERN. The 87.8 mm aperture magnets feature a single layer coil wound using LHC main dipole outer layer cable, phenolic spacer type collars, and a keyed two part structural iron yoke. The magnets are designed as models of the D1 separation dipole in the LHC experimental insertions, whose nominal field is 4.5 T at 4.5 K. In this report we present the test results of the two magnets at 4.3 K and 1.9 K.

  4. Crash tests of three identical low-wing single-engine airplane

    Science.gov (United States)

    Castle, C. B.; Alfaro-Bou, E.

    1983-01-01

    Three identical four place, low wing single engine airplane specimens with nominal masses of 1043 kg were crash tested under controlled free flight conditions. The tests were conducted at the same nominal velocity of 25 m/sec along the flight path. Two airplanes were crashed on a concrete surface (at 10 and 30 deg pitch angles), and one was crashed on soil (at a -30 deg pitch angle). The three tests revealed that the specimen in the -30 deg test on soil sustained massive structural damage in the engine compartment and fire wall. Also, the highest longitudinal cabin floor accelerations occurred in this test. Severe damage, but of lesser magnitude, occurred in the -30 deg test on concrete. The highest normal cabin floor accelerations occurred in this test. The least structural damage and lowest accelerations occurred in the 10 deg test on concrete.

  5. TR-PIV Performance Test for a Flow Field Measurement in a Single Rod Test Section

    Energy Technology Data Exchange (ETDEWEB)

    Park, Ju Yong; Shin, Chang Hwan; Lee, Chi Young; Oh, Dong Seok; In, Wang Kee [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

    2011-10-15

    To substantially enhance the performance of pressurized water reactors (PWRs), a dual-cooled fuel is being developed at the Korea Atomic Energy Research Institute (KAERI). This fuel is ring-shaped, unlike conventional cylindrical nuclear fuel, and cooling water flows through both the inner and outer channels. This geometry increases the surface area, but the outer diameter of the fuel rods is larger, so the gap between fuel rods narrows. As a result, the flow in the outer channel is unstable, and measuring the turbulent flow and the perturbations that influence heat transfer enhancement is important. To understand the heat transfer characteristics due to turbulence, the flow perturbation components must be measured. Hot-wire anemometers are widely used to measure these turbulence characteristics, but they have many disadvantages, such as low probe durability and large probe size. For these reasons, a TR-PIV (Time-Resolved Particle Image Velocimetry) system is employed at our research institute for better flow measurement. The TR-PIV system consists of a laser system and a high-speed camera, both operating at high frequency, and was therefore judged capable of measuring complicated turbulent flow and perturbations. This paper introduces the TR-PIV system, presents results acquired with it in a single-rod flow, and outlines plans for its future practical use.

  6. TR-PIV Performance Test for a Flow Field Measurement in a Single Rod Test Section

    International Nuclear Information System (INIS)

    Park, Ju Yong; Shin, Chang Hwan; Lee, Chi Young; Oh, Dong Seok; In, Wang Kee

    2011-01-01

    To substantially enhance the performance of pressurized water reactors (PWRs), a dual-cooled fuel is being developed at the Korea Atomic Energy Research Institute (KAERI). This fuel is ring-shaped, unlike conventional cylindrical nuclear fuel, and cooling water flows through both the inner and outer channels. This geometry increases the surface area, but the outer diameter of the fuel rods is larger, so the gap between fuel rods narrows. As a result, the flow in the outer channel is unstable, and measuring the turbulent flow and the perturbations that influence heat transfer enhancement is important. To understand the heat transfer characteristics due to turbulence, the flow perturbation components must be measured. Hot-wire anemometers are widely used to measure these turbulence characteristics, but they have many disadvantages, such as low probe durability and large probe size. For these reasons, a TR-PIV (Time-Resolved Particle Image Velocimetry) system is employed at our research institute for better flow measurement. The TR-PIV system consists of a laser system and a high-speed camera, both operating at high frequency, and was therefore judged capable of measuring complicated turbulent flow and perturbations. This paper introduces the TR-PIV system, presents results acquired with it in a single-rod flow, and outlines plans for its future practical use.

  7. Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot

    Science.gov (United States)

    Magis, David; Facon, Bruno

    2013-01-01

    Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…

  8. A field test for companded single sideband modulation Implications for capacity enhancement and transmission planning

    Science.gov (United States)

    Wallace, E.; Adams, C.; Arnstein, D.

    A series of field tests of the companded single sideband modulation (CSSB) technique for use in the Intelsat system is described. A 12-channel circuit group was tested between switches in Pittsburgh, and the Deutsche Bundespost (DBP) in Frankfurt via the Etam and Raisting satellite earth stations. A transponder bulk that included existing FDM-FM carriers was chosen to match the typical Intelsat operating conditions, thus permitting the compatibility of FDM/FM and CSSB to be examined simultaneously. Results of objective performance tests are discussed, and a description of several subjective testing techniques is also given.

  9. Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults.

    Science.gov (United States)

    Notario-Pacheco, Blanca; Solera-Martínez, Montserrat; Serrano-Parra, María D; Bartolomé-Gutiérrez, Raquel; García-Campayo, Javier; Martínez-Vizcaíno, Vicente

    2011-08-05

    The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.
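
The internal consistency figure reported above (Cronbach α = 0.85) comes from a standard formula that can be sketched as follows. This is an illustrative calculation on made-up responses, not the study's data or code; the function name and the response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents, 4 items scored 0 to 4.
responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [0, 1, 1, 0],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

Alpha approaches 1 when items rise and fall together across respondents, so that the variance of the total score is much larger than the sum of the individual item variances.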

  10. Reliability and validity of the Spanish version of the 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) in young adults

    Directory of Open Access Journals (Sweden)

    García-Campayo Javier

    2011-08-01

    Background: The 10-item Connor-Davidson Resilience Scale (10-item CD-RISC) is an instrument for measuring resilience that has shown good psychometric properties in its original version in English. The aim of this study was to evaluate the validity and reliability of the Spanish version of the 10-item CD-RISC in young adults and to verify whether it is structured in a single dimension as in the original English version. Findings: Cross-sectional observational study including 681 university students ranging in age from 18 to 30 years. The number of latent factors in the 10 items of the scale was analyzed by exploratory factor analysis. Confirmatory factor analysis was used to verify whether a single factor underlies the 10 items of the scale as in the original version in English. The convergent validity was analyzed by testing whether the mean of the scores of the mental component of SF-12 (MCS) and the quality of sleep as measured with the Pittsburgh Sleep Index (PSQI) were higher in subjects with better levels of resilience. The internal consistency of the 10-item CD-RISC was estimated using the Cronbach α test and test-retest reliability was estimated with the intraclass correlation coefficient. The Cronbach α coefficient was 0.85 and the test-retest intraclass correlation coefficient was 0.71. The mean MCS score and the level of quality of sleep in both men and women were significantly worse in subjects with lower resilience scores. Conclusions: The Spanish version of the 10-item CD-RISC showed good psychometric properties in young adults and thus can be used as a reliable and valid instrument for measuring resilience. Our study confirmed that a single factor underlies the resilience construct, as was the case of the original scale in English.

  11. In-flight and ground testing of single event upset sensitivity in static RAMs

    International Nuclear Information System (INIS)

    Johansson, K.; Dyreklev, P.; Granbom, B.; Calvet, C.; Fourtine, S.; Feuillatre, O.

    1998-01-01

    This paper presents the results from in-flight measurements of single event upsets (SEU) in static random access memories (SRAM) caused by the atmospheric radiation environment at aircraft altitudes. The memory devices were carried on commercial airlines at high altitude and mainly high latitudes. The SEUs were monitored by a Component Upset Test Equipment (CUTE), designed for this experiment. The in flight results are compared to ground based testing with neutrons from three different sources

  12. Hydrogeological study of single water conducting fracture using a crosshole hydraulic test apparatus

    International Nuclear Information System (INIS)

    Yamamoto, Hajime; Shimo, Michito; Yamamoto, Takuya

    1998-03-01

    The Crosshole Injection Test Apparatus has been constructed to evaluate the hydraulic properties and conditions, such as hydraulic conductivity and its anisotropy, storage coefficient, pore pressure etc., within a rock near a drift. The construction started in FY93 and was completed in August of FY96 as a set of equipment for crosshole hydraulic testing, composed of one injection borehole instrument, one observation borehole instrument and a set of on-ground instruments. In FY96, an in-situ feasibility test was conducted at a 550 m level drift in the Kamaishi In Situ Test Site, which has been operated by PNC, and the performance of the equipment and its applicability to various types of injection method were confirmed. This year, a hydrogeological investigation of a single water conducting fracture was conducted at a 250 m level drift in the Kamaishi In Situ Test Site, using two boreholes, KCH-3 and KCH-4, both of which are 30 m deep and inclined by 45 degrees from the surface. Pressure responses at the KCH-3 borehole during the drilling of the KCH-4 borehole, together with the results of borehole TV logging and core observation, indicated that a major conductive single fracture was successfully isolated by the packers. As a result of a series of single-hole and crosshole tests (sinusoidal and constant flowrate tests), the hydraulic parameters of the single fracture (such as hydraulic conductivity and storage coefficient) were determined. This report presents all the test results and analysed data, and also describes the hydrogeological structure near the drift. (author)

  13. Defining Deficient Items by IRT Analysis of Calibration Data.

    Science.gov (United States)

    Krass, Iosif A.; Thomasson, Gary L.

    New items are being calibrated for the next generation of the computerized adaptive (CAT) version of the Armed Services Vocational Aptitude Battery (ASVAB) (Forms 5 and 6). The requirements that the items be "good" three-parameter logistic (3-PL) model items and typically "like" items in the previous CAT-ASVAB tests have…

  14. Item Response Theory Using Hierarchical Generalized Linear Models

    Science.gov (United States)

    Ravand, Hamdollah

    2015-01-01

    Multilevel models (MLMs) are flexible in that they can be employed to obtain item and person parameters, test for differential item functioning (DIF) and capture both local item and person dependence. Papers on the MLM analysis of item response data have focused mostly on theoretical issues where applications have been add-ons to simulation…

  15. Single-Cylinder Diesel Engine Tests with Unstabilized Water-in-Fuel Emulsions

    Science.gov (United States)

    1978-08-01

    A single-cylinder, four-stroke cycle diesel engine was operated on unstabilized water-in-fuel emulsions. Two prototype devices were used to produce the emulsions on-line with the engine. More than 350 test points were run with baseline diesel fuel an...

  16. Microwave testing of high-Tc based direct current to a single flux quantum converter

    DEFF Research Database (Denmark)

    Kaplunenko, V. K.; Fischer, Gerd Michael; Ivanov, Z. G.

    1994-01-01

    Design, simulation, and experimental investigations of a direct current to a single flux quantum converter loaded with a Josephson transmission line and driven by an external 70 GHz microwave oscillator are reported. The test circuit includes nine YBaCuO Josephson junctions aligned on the grain...

  17. Should the diagnosis of COPD be based on a single spirometry test?

    NARCIS (Netherlands)

    Schermer, T.R.; Robberts, B.; Crockett, A.J.; Thoonen, B.P.; Lucas, A.; Grootens, J.; Smeele, I.J.; Thamrin, C.; Reddel, H.K.

    2016-01-01

    Clinical guidelines indicate that a chronic obstructive pulmonary disease (COPD) diagnosis is made from a single spirometry test. However, long-term stability of diagnosis based on forced expiratory volume in 1 s over forced vital capacity (FEV1/FVC) ratio has not been reported. In primary care

  18. Should the diagnosis of COPD be based on a single spirometry test?

    NARCIS (Netherlands)

    Schermer, T.R.J.; Robberts, B.; Crockett, A.J.; Thoonen, B.P.A.; Lucas, A.; Grootens, J.; Smeele, I.J.; Thamrin, C.; Reddel, H.K.

    2016-01-01

    Clinical guidelines indicate that a chronic obstructive pulmonary disease (COPD) diagnosis is made from a single spirometry test. However, long-term stability of diagnosis based on forced expiratory volume in 1 s over forced vital capacity (FEV1/FVC) ratio has not been reported. In primary care

  19. A single well pumping and recovery test to measure in situ acrotelm transmissivity in raised bogs

    NARCIS (Netherlands)

    Schaaf, van der S.

    2004-01-01

    A quasi-steady-state single pit pumping and recovery test to measure in situ the transmissivity of the highly permeable upper layer of raised bogs, the acrotelm, is described and discussed. The basic concept is the expanding depression cone during both pumping and recovery. It is shown that applying

  20. Generalized Single-Case Randomization Tests: Flexible Analyses for a Variety of Situations.

    Science.gov (United States)

    Levin, Joel R.; Wampold, Bruce E.

    1999-01-01

    Presents a general class of single-case statistical procedures derived from previously developed nonparametric randomization tests. Designs are illustrated that focus on the general and comparative effectiveness of alternative interventions, multiple units with differentiable characteristics, and multiple outcome measures. Provides operational…

  1. Inter-rater reliability, standard error of measurement and minimal detectable change of the 12-item WHODAS 2.0 and four performance tests in institutionalized ambulatory older adults.

    Science.gov (United States)

    Silva, Anabela G; Cerqueira, Margarida; Raquel Santos, Ana; Ferreira, Catarina; Alvarelhão, Joaquim; Queirós, Alexandra

    2017-10-24

    Self-reported and performance-based instruments are both necessary for a comprehensive view of the functioning of institutionalized older adults. Our aim was to assess the reliability and measurement error of the 12-item World Health Organization Disability Assessment Schedule and compare these indexes against performance-based tests. One hundred participants from Nursing Homes and Day Care Centers were assessed twice (two days to one week apart) by two independent assessors. Reliability and measurement error indexes were calculated. Reliability of the World Health Organization Disability Assessment Schedule total score, and of three performance tests, was appropriate for individual comparisons (ICC ≥ 0.92). Reliability for the five times sit to stand test was appropriate for group comparisons only (ICC = 0.84). The high measurement error of the timed up and go test (SEM = 4.25; MDC = 11.78) and of the five times sit to stand test (SEM = 3.47; MDC = 9.62) and the number of participants unable to perform them (TUG: n = 11; FTSST: n = 41) suggest that these tests are less suitable to monitor individual changes. The 12-item World Health Organization Disability Assessment Schedule total score, the gait speed and hand grip tests could be used to monitor changes at both the individual and group level in a population with decreased functioning. Implications for Rehabilitation: The 12-item World Health Organization Disability Assessment Schedule could be used to monitor changes in perceived functioning both at the individual and group level in institutionalized ambulatory older adults. The gait speed and hand grip tests could be used to monitor changes in performance both at the individual and group level in institutionalized ambulatory older adults' functioning. The timed up and go and the five times sit to stand tests might be of limited value when aiming to monitor changes in institutionalized older adults' functioning.
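
    The abstract above does not state how the minimal detectable change (MDC) was derived from the standard error of measurement (SEM). A minimal sketch, assuming the conventional 95% formula MDC95 = 1.96 · √2 · SEM, reproduces both reported pairs; the function name is hypothetical and the formula is an assumption about the authors' method rather than a confirmed detail.

```python
import math


def mdc95(sem):
    """Minimal detectable change at 95% confidence from the SEM.

    Assumes the conventional formula MDC95 = 1.96 * sqrt(2) * SEM;
    the abstract does not confirm that this is the formula used.
    """
    return 1.96 * math.sqrt(2) * sem


# SEM values as reported in the abstract.
print(round(mdc95(4.25), 2))  # timed up and go test
print(round(mdc95(3.47), 2))  # five times sit to stand test
```

    Rounded to two decimals, the results are 11.78 and 9.62, matching the MDC values given in the abstract.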

  2. Assessing the factor structure of a role functioning item bank.

    Science.gov (United States)

    Anatchkova, Milena D; Ware, John E; Bjorner, Jakob B

    2011-06-01

    Role functioning (RF) is an important part of health-related quality of life, but is hard to measure due to the wide definition of roles and fluctuations in role participation. This study aims to explore the dimensionality of a newly developed item bank assessing the impact of health on RF. A battery of measures with skip patterns including the new RF bank was completed by 2,500 participants answering only questions on social roles relevant to them. Confirmatory factor analyses were conducted for the participants answering items from all conceptual domains (N = 1193). Conceptually based dimensionality and method effects reflecting positively and negatively worded items were explored in a series of models. A bi-factor model (CFI = .93, RMSEA = .08) with one general and four conceptual factors (social, family, occupation, generic) was retained. Positively worded items were excluded from the final solution due to misfit. While a single factor model with methods factors had a poor fit (CFI = .88, RMSEA = .13), high loadings on the general factor in the bi-factor model suggest that the RF bank is sufficiently unidimensional for IRT analysis. The bank demonstrated sufficient unidimensionality for IRT-based calibration of all the items on a common metric and development of a computerized adaptive test.

  3. Psychometric properties of WISC-III items

    Directory of Open Access Journals (Sweden)

    Vera Lúcia Marques de Figueiredo

    2008-09-01

    A test is improved through the selection, substitution or revision of items, and analyzing an item increases the validity and reliability of the test. This article presents results on the psychometric properties of the items of the WISC-III subtests with regard to difficulty, discrimination and validity. The WISC-III is an instrument widely used in the assessment of intelligence, and knowing the quality of its items is essential for the professional who administers the test. The analyses were based on the scores of 801 test protocols administered during the research adapting the test to a Brazilian context. The analyses showed that the adapted items had adequate psychometric characteristics, allowing the instrument to be used as a reliable means of diagnosis.

  4. Modelling Figural Analogies Test with the Item Response Theory

    Directory of Open Access Journals (Sweden)

    G. Diego Blum

    2011-12-01

    The psychometric properties of a Figural Analogies Test are described within the framework of Item Response Theory. Thirty-six 2x2 matrix figures were constructed by using location, distortion and number rules. The sample included 499 psychology students from the University of Buenos Aires, 79% of whom were women. The 3-Parameter Logistic Model was used, obtaining a highly satisfactory global fit at 5% (p = .47). Only 3 items did not fit the model. It had good overall discriminatory power (a: M = 1.02, SD = .33), a medium level of difficulty (b: M = -.03, SD = .63) and the c level was slightly lower than expected with six possible answers (c: M = .14, SD = .05). The conditions for modelling the test and possible disadvantages of the present study are discussed.
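
    The roles of the three parameters reported above can be illustrated with a minimal sketch of the 3-Parameter Logistic item response function, P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))). The function name is hypothetical, no scaling constant (such as D = 1.7) is applied, and using the mean parameter values as if they described one item is an assumption made purely for illustration.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    Sketch only; assumes the logistic metric without a scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Mean parameter values reported in the abstract, treated as one "average" item.
a_mean, b_mean, c_mean = 1.02, -0.03, 0.14

# At theta == b the logistic term equals 1/2, so P = c + (1 - c) / 2.
print(p_correct_3pl(b_mean, a_mean, b_mean, c_mean))

# Chance level with six response options, for comparison with c.
print(1 / 6)
```

    At θ = b the probability is c + (1 - c)/2, about .57 for these values. The chance level with six options is 1/6 ≈ .167, which is consistent with the abstract's remark that the estimated c level (.14) is slightly lower than expected.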

  5. Fabrication and testing of full-length single-cell externally fueled converters for thermionic reactors

    International Nuclear Information System (INIS)

    Schock, A.

    1994-01-01

    The preceding paper described designs and analyses of thermionic reactors employing full-core-length single-cell converters with their heated emitters located on the outside of their internally cooled collectors, and it presented results of detailed parametric analyses which illustrate the benefits of this unconventional design. The present paper describes the fabrication and testing of full-length prototypical converters, both unfueled and fueled, and presents parametric results of electrically heated tests. The unfueled converter tests demonstrated the practicality of operating such long converters without shorting across a 0.3-mm interelectrode gap. They produced a measured peak output of 751 watts(e) from a single diode and a peak efficiency of 15.4%. The fueled converter tests measured the parametric performance of prototypic UO2-fueled converters designed for subsequent in-pile testing. They employed revolver-shaped tungsten elements with a central emitter hole surrounded by six fuel chambers. The full-length converters were heated by a water-cooled RF-induction coil inside an ion-pumped vacuum chamber. This required development of high-vacuum coaxial RF feedthroughs. In-pile test rules required multiple containment of the UO2 fuel, which complicated the fabrication of the test article and required successful development of techniques for welding tungsten and other refractory components. The tests measured a peak power output of 530 watts(e) or 7.1 watts/cm^2 at an efficiency of 11.5%.

  6. Tank selection for Light Duty Utility Arm (LDUA) system hot testing in a single shell tank

    Energy Technology Data Exchange (ETDEWEB)

    Bhatia, P.K.

    1995-01-31

    The purpose of this report is to recommend a single shell tank in which to hot test the Light Duty Utility Arm (LDUA) for the Tank Waste Remediation System (TWRS) in Fiscal Year 1996. The LDUA is designed to utilize a 12 inch riser. During hot testing, the LDUA will deploy two end effectors (a High Resolution Stereoscopic Video Camera System and a Still/Stereo Photography System mounted on the end of the arm's tool interface plate). In addition, three other systems (an Overview Video System, an Overview Stereo Video System, and a Topographic Mapping System) will be independently deployed and tested through 4 inch risers.

  7. Comparison of the neural correlates of encoding item-item and item-context associations

    Directory of Open Access Journals (Sweden)

    Jenny X Wong

    2013-08-01

    fMRI was employed to investigate the role of the left inferior frontal gyrus (LIFG) in the encoding of item-item and item-context associations. On each of a series of study trials subjects viewed a picture that was presented either to the left or right of fixation, along with a subsequently presented word that appeared at fixation. Memory was tested in a subsequent memory test that took place outside of the scanner. On each test trial one of two forced choice judgments was required. For the associative test, subjects chose between the word paired with the picture at study and a word studied on a different trial. For the source test, the judgment was whether the picture had been presented on the left or right. Successful encoding of associative information was accompanied by subsequent memory effects in several cortical regions, including much of the LIFG. By contrast, successful source encoding was selectively associated with a subsequent memory effect in right fusiform cortex. The finding that the LIFG was enhanced during successful associative, but not source, encoding is interpreted in light of the proposal that subsequent memory effects are localized to cortical regions engaged by the on-line demands of the study task.

  8. First steps towards fuel cells testing harmonisation: Procedures and parameters for single cell performance evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Lunghi, P. [Department of Industrial Engineering, University of Perugia, Via Duranti 93, Perugia (Italy)]; Ubertini, S. [Department of Mechanical Engineering, University of Rome "Tor Vergata", Via di Torvergata, 110, Rome (Italy)]

    2004-01-01

    The great interest in Fuel Cell Systems, combined with the innovation of the device itself, has led to a huge developmental effort to make the steps necessary for future FC plant commissioning. The clearest and most effective way to evaluate the performance of a fuel cell is to measure it directly and, since few fuel cell test rigs are available at the moment, standard experimental procedures have not been realized so far. Our research group is currently performing single cell testing at the University of Perugia fuel cell laboratory and particular emphasis has been put on the definition of procedures and the testing of parameterisation. The work team strongly believes that this is the key to effective system testing and reliable performance evaluation. In this work, the test parameterisation developed by the team, and the resulting advanced control procedure used for a single MCFC experimental characterization are presented. Efforts have been dedicated to obtain some relevant non-dimensional parameters to allow an easy understanding of the results and quick comparisons with other tests under different operating conditions, or with results obtained on different cells eventually tested in different laboratories. The authors strongly emphasise this topic to avoid the data that developers and research institutions collect being of no practical use due to a lack of shared rules. (Abstract Copyright [2003], Wiley Periodicals, Inc.)

  9. Experimental results of single screw mechanical tests: a follow-up to SAND2005-6036.

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sandwook; Lee, Kenneth L.; Korellis, John S.; McFadden, Sam X.

    2006-08-01

    The work reported here was conducted to address issues raised regarding mechanical testing of attachment screws described in SAND2005-6036, as well as to increase the understanding of screw behavior through additional testing. Efforts were made to evaluate fixture modifications and address issues of interest, including: fabrication of 45° test fixtures, measurement of the frictional load from the angled fixture guide, employment of electromechanical displacement transducers, development of a single-shear test, and study of the effect of thread start orientation on single-shear behavior. A286 and 302HQ No. 10-32 socket-head cap screws were tested at orientations with respect to the primary loading axis of 0°, 45°, 60°, 75° and 90° at stroke speeds of 0.001 and 10 in/sec. The frictional load resulting from the angled screw fixture guide was insignificant. Load-displacement curves of A286 screws did not show a minimum value in displacement to failure (DTF) for 60° shear tests. Tests of 302HQ screws did not produce a consistent trend in DTF with load angle. The effect of displacement rate on DTF became larger as shear angle increased for both A286 and 302HQ screws.

  10. Single-well interference slug tests to assess the vertical hydraulic conductivity of unconsolidated aquifers

    Science.gov (United States)

    Paradis, Daniel; Lefebvre, René

    2013-01-01

    Summary: Meaningful understanding of flow and solute transport in general requires the knowledge of hydraulic conductivity and its anisotropy. Various field methods allow the measurement of the horizontal component (Kh), but vertical hydraulic conductivity (Kv) is rarely measured, for lack of practical field tests. This paper proposes vertical interference slug tests, an adaptation of inter-well interference slug tests to a single well, for the efficient field measurement of Kv. The test is carried out in a single well between a stress and an observation interval that are vertically isolated with a three-packer assembly. An instantaneous pressure pulse is induced in the stress interval and resulting drawdowns are recorded in both the stress and the observation intervals. In a proof-of-concept field study, 12 vertical interference tests were carried out sequentially along a fully-screened well across a moderately heterogeneous and highly anisotropic aquifer made up of littoral silts and sands. A direct-push method was used to install the well, which was completed without sand-pack to allow the natural collapse of sediments in the thin annular space around the screen. Direct-push wells allow the measurement of in situ hydraulic properties of sediments and minimize well construction interferences with hydraulic tests. Drawdowns measured in stress and observation intervals of multiple tests were simultaneously inverted numerically to reconstruct heterogeneous profiles of Kh, hydraulic conductivity anisotropy (Kv/Kh), and specific storage (Ss). Results were validated by comparison of observed versus predicted drawdowns and with field and laboratory measurements of Kh and Kv made along the tested well. Results indicate that the profile of Kv values obtained with vertical interference slug tests follows a similar pattern with depth as the profile of lab measurements made with a permeameter on soil samples collected in the same intervals as the interference tests, which

  11. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  12. Screening of nanosatellite microprocessors using californium single-event latch-up test results

    Science.gov (United States)

    Tomioka, Takahiro; Okumura, Yuta; Masui, Hirokazu; Takamiya, Koichi; Cho, Mengu

    2016-09-01

    A single-event latch-up (SEL) test using a 252Cf radioisotope was carried out. The results were compared with those of a proton test and from observation in orbit. A radioisotope can reproduce phenomena observed in orbit that are caused by protons. Considering the inexpensive nature of the 252Cf test, it is more suitable for nanosatellites that require low cost and fast delivery. A SEL occurrence rate of a commercial-off-the-shelf microprocessor was derived from the ground test results. The 252Cf test provided a SEL rate approximately 1×10^6 times greater than that in orbit. This data can be used to derive the minimum SEL occurrence rate in orbit and help satellite designers to evaluate the risk of SEL and take measures if necessary.

  13. Assessing the equivalence of Web-based and paper-and-pencil questionnaires using differential item and test functioning (DIF and DTF) analysis: a case of the Four-Dimensional Symptom Questionnaire (4DSQ).

    Science.gov (United States)

    Terluin, Berend; Brouwers, Evelien P M; Marchand, Miquelle A G; de Vet, Henrica C W

    2018-05-01

    Many paper-and-pencil (P&P) questionnaires have been migrated to electronic platforms. Differential item and test functioning (DIF and DTF) analysis constitutes a superior research design to assess measurement equivalence across modes of administration. The purpose of this study was to demonstrate an item response theory (IRT)-based DIF and DTF analysis to assess the measurement equivalence of a Web-based version and the original P&P format of the Four-Dimensional Symptom Questionnaire (4DSQ), measuring distress, depression, anxiety, and somatization. The P&P group (n = 2031) and the Web group (n = 958) consisted of primary care psychology clients. Unidimensionality and local independence of the 4DSQ scales were examined using IRT and Yen's Q3. Bifactor modeling was used to assess the scales' essential unidimensionality. Measurement equivalence was assessed using IRT-based DIF analysis using a 3-stage approach: linking on the latent mean and variance, selection of anchor items, and DIF testing using the Wald test. DTF was evaluated by comparing expected scale scores as a function of the latent trait. The 4DSQ scales proved to be essentially unidimensional in both modalities. Five items, belonging to the distress and somatization scales, displayed small amounts of DIF. DTF analysis revealed that the impact of DIF on the scale level was negligible. IRT-based DIF and DTF analysis is demonstrated as a way to assess the equivalence of Web-based and P&P questionnaire modalities. Data obtained with the Web-based 4DSQ are equivalent to data obtained with the P&P version.

  14. Measurement of LNAPL flux using single-well intermittent mixing tracer dilution tests.

    Science.gov (United States)

    Smith, Tim; Sale, Tom; Lyverse, Mark

    2012-01-01

    The stability of subsurface Light Nonaqueous Phase Liquids (LNAPLs) is a key factor driving expectations for remedial measures at LNAPL sites. The conventional approach to resolving LNAPL stability has been to apply Darcy's Equation. This paper explores an alternative approach wherein single-well tracer dilution tests with intermittent mixing are used to resolve LNAPL stability. As a first step, an implicit solution for single-well intermittent mixing tracer dilution tests is derived. This includes key assumptions and limits on the allowable time between intermittent mixing events. Second, single-well tracer dilution tests with intermittent mixing are conducted under conditions of known LNAPL flux. This includes a laboratory sand tank study and two field tests at active LNAPL recovery wells. Results from the sand tank studies indicate that LNAPL fluxes in wells can be transformed into formation fluxes using corrections for (1) LNAPL thicknesses in the well and formation and (2) convergence of flow to the well. Using the apparent convergence factor from the sand tank experiment, the average error between the known and measured LNAPL fluxes is 4%. Results from the field studies show nearly identical known and measured LNAPL fluxes at one well. At the second well the measured fluxes appear to exceed the known value by a factor of two. Agreement between the known and measured LNAPL fluxes, within a factor of two, indicates that single-well tracer dilution tests with intermittent mixing can be a viable means of resolving LNAPL stability. © 2012, The Author(s). Ground Water © 2012, National Ground Water Association.

  15. Single-task and dual-task tandem gait test performance after concussion.

    Science.gov (United States)

    Howell, David R; Osternig, Louis R; Chou, Li-Shan

    2017-07-01

    To compare single-task and dual-task tandem gait test performance between athletes after concussion with controls on observer-timed, spatio-temporal, and center-of-mass (COM) balance control measurements. Ten participants (19.0 ± 5.5 years) were prospectively identified and completed a tandem gait test protocol within 72 h of concussion and again 1 week, 2 weeks, 1 month, and 2 months post-injury. Seven uninjured controls (20.0 ± 4.5 years) completed the same protocol in similar time increments. Tandem gait test trials were performed with (dual-task) and without (single-task) concurrently performing a cognitive test as whole-body motion analysis was performed. Outcome variables included test completion time, average tandem gait velocity, cadence, and whole-body COM frontal plane displacement. Concussion participants took significantly longer to complete the dual-task tandem gait test than controls throughout the first 2 weeks post-injury (mean time = 16.4 [95% CI: 13.4-19.4] vs. 10.1 [95% CI: 6.4-13.7] seconds; p = 0.03). Single-task tandem gait times were significantly lower 72 h post-injury (p = 0.04). Dual-task cadence was significantly lower for concussion participants than controls (89.5 [95% CI: 68.6-110.4] vs. 127.0 [95% CI: 97.4-156.6] steps/minute; p = 0.04). Moderately-high to high correlations between tandem gait test time and whole-body COM medial-lateral displacement were detected at each time point during dual-task gait (r_s = 0.70-0.93; p = 0.03-0.001). Adding a cognitive task during the tandem gait test resulted in longer detectable deficits post-concussion compared to the traditional single-task tandem gait test. As a clinical tool to assess dynamic motor function, tandem gait may assist with return to sport decisions after concussion. Copyright © 2017 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  16. Estimation of single crystal elastic constants using ultrasonic testing - a feasibility study

    International Nuclear Information System (INIS)

    Phani Kumar, K.K.; Rentala, Vamsi Krishna; Gautam, Jaiprakash; Mylavarapu, Phani

    2015-01-01

    Estimation of single crystal elastic constants (SCEC) of metallic materials is of paramount importance in the development of crystal plasticity based models as well as for studying the effect of microstructure on wave propagation. SCEC are usually determined destructively by tensile and shear loading of a single crystal specimen. These constants can also be estimated non-destructively, using X-ray diffraction measurements on a polycrystalline specimen. However, the aforementioned procedures are limited either by the sample size (in the case of X-ray diffraction) or by the availability of a single crystal (in the case of destructive testing). Hence, in this study, an effort has been undertaken to estimate SCEC by subjecting polycrystalline specimens to ultrasonic testing. Ultrasonic longitudinal and shear velocities, longitudinal attenuation coefficient and ultrasonic backscattered grain noise will be measured on a pure Cu specimen. Further, these parameters will also be calculated analytically using existing relationships involving elastic constants, grain size probability level, ultrasonic longitudinal and shear wave velocities, attenuation coefficient and backscattered grain noise. By minimizing the difference between experimentally measured and analytically calculated ultrasonic parameters, an attempt will be made to estimate single crystal elastic constants. (author)

  17. Single well surfactant test to evaluate surfactant floods using multi tracer method

    Science.gov (United States)

    Sheely, Clyde Q.

    1979-01-01

    Data useful for evaluating the effectiveness of or designing an enhanced recovery process said process involving mobilizing and moving hydrocarbons through a hydrocarbon bearing subterranean formation from an injection well to a production well by injecting a mobilizing fluid into the injection well, comprising (a) determining hydrocarbon saturation in a volume in the formation near a well bore penetrating formation, (b) injecting sufficient mobilizing fluid to mobilize and move hydrocarbons from a volume in the formation near the well bore, and (c) determining the hydrocarbon saturation in a volume including at least a part of the volume of (b) by an improved single well surfactant method comprising injecting 2 or more slugs of water containing the primary tracer separated by water slugs containing no primary tracer. Alternatively, the plurality of ester tracers can be injected in a single slug said tracers penetrating varying distances into the formation wherein the esters have different partition coefficients and essentially equal reaction times. The single well tracer method employed is disclosed in U.S. Pat. No. 3,623,842. This method designated the single well surfactant test (SWST) is useful for evaluating the effect of surfactant floods, polymer floods, carbon dioxide floods, micellar floods, caustic floods and the like in subterranean formations in much less time and at much reduced cost compared to conventional multiwell pilot tests.

  18. Measurement of single-top cross section and test of anomalous $Wtb$ coupling

    Energy Technology Data Exchange (ETDEWEB)

    Jung, Ji-Eun [Seoul National Univ. (Korea, Republic of)

    2010-01-01

    The top quark is most often produced in $t\bar{t}$ pairs via the strong interaction; however, electroweak production of a single top quark is also possible. Electroweak single-top production is more difficult to observe than $t\bar{t}$ production. Studying single-top production is important for the following reasons: it provides a direct measurement of the CKM matrix element, and single-top events are a background to several searches for SM or non-SM signals, such as Higgs boson searches. Information on the spin polarization of the top quark can be used to test for anomalous W-t-b coupling. This thesis describes the result of a measurement of the single-top cross-section and a test of anomalous W-t-b coupling using 4.8 fb$^{-1}$ of data collected by the CDF Run II experiment at the Fermilab Tevatron. The measured cross-section is $1.83^{+0.7}_{-0.6}$ pb and the measured limit of |Vtb| is 0.41 at 95% CL. The fraction of V+A coupling is 0 ± 28 (%).

  19. A New Kind of Single-Well Tracer Test for Assessing Subsurface Heterogeneity

    Science.gov (United States)

    Hansen, S. K.; Vesselinov, V. V.; Lu, Z.; Reimus, P. W.; Katzman, D.

    2017-12-01

    Single-well injection-withdrawal (SWIW) tracer tests have historically been interpreted using the idealized assumption of tracer path reversibility (i.e., negligible background flow), with background flow due to natural hydraulic gradient being an un-modeled confounding factor. However, we have recently discovered that it is possible to use background flow to our advantage to extract additional information about the subsurface. To wit: we have developed a new kind of single-well tracer test that exploits flow due to natural gradient to estimate the variance of the log hydraulic conductivity field of a heterogeneous aquifer. The test methodology involves injection under forced gradient and withdrawal under natural gradient, and makes use of a relationship, discovered using a large-scale Monte Carlo study and machine learning techniques, between power law breakthrough curve tail exponent and log-hydraulic conductivity variance. We will discuss how we performed the computational study and derived this relationship and then show an application example in which our new single-well tracer test interpretation scheme was applied to estimation of heterogeneity of a formation at the chromium contamination site at Los Alamos National Laboratory. Detailed core hole records exist at the same site, from which it was possible to estimate the log hydraulic conductivity variance using a Kozeny-Carman relation. The variances estimated using our new tracer test methodology and estimated by direct inspection of core were nearly identical, corroborating the new methodology. Assessment of aquifer heterogeneity is of critical importance to deployment of amendments associated with in-situ remediation strategies, since permeability contrasts potentially reduce the interaction between amendment and contaminant. Our new tracer test provides an easy way to obtain this information.
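
    The abstract above relies on the power-law tail exponent of a breakthrough curve. As a rough illustration only, the exponent of a late-time tail can be estimated by a straight-line fit in log-log space. The function name, the synthetic data and the choice of ordinary least squares are assumptions made for this sketch; the abstract does not describe the authors' fitting procedure, and the relationship between the exponent and the log hydraulic conductivity variance is not given there, so it is not reproduced here.

```python
import math

def tail_exponent(times, concentrations):
    """Estimate k for a tail c(t) ~ t**(-k) by least squares on log-log data.

    Hypothetical helper: fits log(c) against log(t) and returns the
    negated slope, so a steeper decay gives a larger positive exponent.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic late-time tail (assumed values, not data from the study).
late_times = [10.0, 20.0, 40.0, 80.0, 160.0]
synthetic = [5.0 * t ** -2.5 for t in late_times]
exponent = tail_exponent(late_times, synthetic)
```

    On real data only the late-time portion of the curve should be passed in, and the choice of where the tail begins will affect the estimate.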

  20. Item response theory (Teoria da Resposta ao Item; Teoria de la respuesta al item)

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is long-standing, and many studies and proposed methods have been developed to achieve this goal. Among these proposals, Item Response Theory (IRT) stands out; it initially arose to address limitations of Classical Test Theory, which is still widely used today in the measurement of psychological traits. The main point of IRT is that it takes each item into account individually, without relying on total scores; therefore, the findings depend not only on the test or questionnaire but on each item that composes it. This article aims to present this theory, which revolutionized measurement theory.

  1. Critical evaluation of the pulsed laser method for single event effects testing and fundamental studies

    International Nuclear Information System (INIS)

    Melinger, J.S.; Buchner, S.; McMorrow, D.; Stapor, W.J.; Weatherford, T.R.; Campbell, A.B.; Eisen, H.

    1994-01-01

    In this paper the authors present an evaluation of the pulsed laser as a technique for single event effects (SEE) testing. They explore in detail the important optical effects, such as laser beam propagation, surface reflection, and linear and nonlinear absorption, which determine the nature of laser-generated charge tracks in semiconductor materials. While there are differences in the structure of laser- and ion-generated charge tracks, they show that in many cases the pulsed laser remains an invaluable tool for SEE testing. Indeed, for several SEE applications, they show that the pulsed laser method represents a more practical approach than conventional accelerator-based methods.

  2. Testing of plain and fibrous concrete single cavity prestressed concrete reactor vessel models

    International Nuclear Information System (INIS)

    Oland, C.B.

    1985-01-01

    Two single-cavity prestressed concrete reactor vessel (PCRV) models were fabricated and tested to failure to demonstrate the structural response and ultimate pressure capacity of models cast from high-strength concretes. Concretes with design compressive strengths in excess of 70 MPa (10,000 psi) were developed for this investigation. One model was cast from plain concrete and failed in shear at the head region. The second model was cast from fiber reinforced concrete and failed by rupturing the circumferential prestressing at the sidewall of the structure. The tests also demonstrated the capabilities of the liner system to maintain a leak-tight pressure boundary. 3 refs., 4 figs

  3. Effectiveness Analysis of a Non-Destructive Single Event Burnout Test Methodology

    CERN Document Server

    Oser, P; Spiezia, G; Fadakis, E; Foucard, G; Peronnard, P; Masi, A; Gaillard, R

    2014-01-01

    It is essential to characterize power MOSFETs regarding their tolerance to destructive Single Event Burnouts (SEB). Therefore, several non-destructive test methods have been developed to evaluate the SEB cross-section of power devices. A power MOSFET has been evaluated using a test circuit, designed according to standard non-destructive test methods discussed in the literature. Guidelines suggest a prior adaptation of auxiliary components to the device sensitivity before the radiation test. With the first value chosen for the de-coupling capacitor, the external component initiated destructive events and affected the evaluation of the cross-section. As a result, the influence of auxiliary components on the device cross-section was studied. This paper presents the obtained experimental results, supported by SPICE simulations, to evaluate and discuss how the circuit effectiveness depends on the external components.

  4. Single-well "push-pull" partitioning tracer test for NAPL detection in the subsurface.

    Science.gov (United States)

    Istok, Jonathan D; Field, Jennifer A; Schroth, Martin H; Davis, Brian M; Dwarakanath, Varadarajan

    2002-06-15

    Previous environmental applications of partitioning tracer tests to detect and quantify nonaqueous phase liquid (NAPL) contamination in the subsurface have been limited to well-to-well tests. However, theory and numerical modeling suggest that single-well injection-extraction ("push-pull") partitioning tracer tests can also potentially detect and quantify NAPL contamination. In this type of test, retardation factors for injected partitioning tracers are estimated from the increase in apparent dispersion observed in extraction-phase breakthrough curves in the presence of NAPL. A series of laboratory push-pull tests was conducted in physical aquifer models (PAMs) packed with natural aquifer sediment prepared with and without the presence of trichloroethene (TCE) NAPL. Field tests were conducted in an aquifer contaminated with petroleum hydrocarbon NAPL. Injected test solutions contained a suite of partitioning and conservative (nonpartitioning) alcohol tracers. Laboratory push-pull partitioning tracer tests were able to detect and quantify sorption of partitioning tracers to aquifer sediment (in the absence of NAPL) and to detect NAPL when it was present. NAPL saturations computed from estimated retardation factors bracketed those computed from known volumes of emplaced NAPL in the sediment pack. However, numerical modeling with assumed homogeneous NAPL distribution and linear equilibrium partitioning of tracers between aqueous and NAPL phases was unable to reproduce all features of observed breakthrough curves. Excavation of the sediment pack after all tests indicated that a portion of the emplaced NAPL had sunk to the bottom of the PAM, invalidating the modeling assumption of homogeneous NAPL distribution. Moreover, the apparent dispersion in extraction-phase breakthrough curves decreased when the injection-extraction pumping rate was decreased, suggesting that mass transfer limitations existed during laboratory tests. Field push-pull partitioning tracer tests were

  5. Fabrication and Testing of Full-Length Single-Cell Externally Fueled Converters for Thermionic Reactors

    Energy Technology Data Exchange (ETDEWEB)

    Schock, Alfred

    1994-06-01

    The preceding paper described designs and analyses of thermionic reactors employing full-core-length single-cell converters with their heated emitters located on the outside of their internally cooled collectors, and it presented results of detailed parametric analyses which illustrate the benefits of this unconventional design. The present paper describes the fabrication and testing of full-length prototypical converters, both unfueled and fueled, and presents parametric results of electrically heated tests. The unfueled converter tests demonstrated the practicality of operating such long converters without shorting across a 0.3-mm interelectrode gap. They produced a measured peak output of 751 watts(e) from a single diode and a peak efficiency of 15.4%. The fueled converter tests measured the parametric performance of prototypic UO2-fueled converters designed for subsequent in-pile testing. They employed revolver-shaped tungsten elements with a central emitter hole surrounded by six fuel chambers. The full-length converters were heated by a water-cooled RF-induction coil inside an ion-pumped vacuum chamber. This required development of high-vacuum coaxial RF feedthroughs. In-pile test rules required multiple containment of the UO2 fuel, which complicated the fabrication of the test article and required successful development of techniques for welding tungsten and other refractory components. The test measured a peak power output of 530 watts(e) or 7.1 watts/cm² at an efficiency of 11.5%. There are three copies in the file. Cross-Reference a copy FSC-ESD-217-94-529 in the ESD files with a CID #8574.

  6. A scale purification procedure for evaluation of differential item functioning

    NARCIS (Netherlands)

    Khalid, Muhammad Naveed; Glas, Cornelis A.W.

    2014-01-01

    Item bias or differential item functioning (DIF) has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test

  7. An Integer Programming Approach to Item Bank Design.

    Science.gov (United States)

    van der Linden, Wim J.; Veldkamp, Bernard P.; Reese, Lynda M.

    2000-01-01

    Presents an integer programming approach to item bank design that can be used to calculate an optimal blueprint for an item bank in order to support an existing testing program. Demonstrates the approach empirically using an item bank designed for the Law School Admission Test. (SLD)

  8. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (Van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
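
    The logistic models with one, two and three parameters mentioned in this abstract can be sketched with a single function. This is a minimal illustration under the usual textbook parameterization, not code from the paper: theta is the respondent's ability, b the item difficulty, a the discrimination and c the pseudo-guessing parameter. The function name and the example values are assumptions, and the optional scaling constant (often written D = 1.7) that some formulations include is omitted.

```python
import math

def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the 3PL logistic model.

    With c = 0 this reduces to the two-parameter model, and with
    a = 1 and c = 0 to the one-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Illustrative values only (assumed, not taken from the paper).
p_1pl = irt_probability(theta=0.0, b=0.0)                # ability equals difficulty
p_2pl = irt_probability(theta=1.0, b=0.0, a=2.0)         # steeper item curve
p_3pl = irt_probability(theta=0.0, b=0.0, a=1.0, c=0.2)  # guessing raises the floor
```

    When ability equals difficulty, the one-parameter model gives a probability of one half, and the guessing parameter lifts that midpoint to c + (1 - c)/2.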

  9. Predicting muscle forces during the propulsion phase of single leg triple hop test.

    Science.gov (United States)

    Alvim, Felipe Costa; Lucareli, Paulo Roberto Garcia; Menegaldo, Luciano Luporini

    2018-01-01

    Functional biomechanical tests allow the assessment of musculoskeletal system impairments in a simple way. Muscle force synergies associated with movement can provide additional information for diagnosis. However, such forces cannot be directly measured noninvasively. This study aims to estimate muscle activations and forces exerted during the preparation phase of the single leg triple hop test. Two different approaches were tested: static optimization (SO) and computed muscle control (CMC). As an indirect validation, model-estimated muscle activations were compared with surface electromyography (EMG) of selected hip and thigh muscles. Ten physically healthy active women performed a series of jumps, and ground reaction forces, kinematics and EMG data were recorded. An existing OpenSim model with 92 musculotendon actuators was used to estimate muscle forces. Reflective markers data were processed using the OpenSim Inverse Kinematics tool. Residual Reduction Algorithm (RRA) was applied recursively before running the SO and CMC. For both, the same adjusted kinematics were used as inputs. Both approaches presented similar residuals amplitudes. SO showed a closer agreement between the estimated activations and the EMGs of some muscles. Due to inherent EMG methodological limitations, the superiority of SO in relation to CMC can be only hypothesized. It should be confirmed by conducting further studies comparing joint contact forces. The workflow presented in this study can be used to estimate muscle forces during the preparation phase of the single leg triple hop test and allows investigating muscle activation and coordination. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Application of Phased Array Ultrasonic Testing (PAUT) on Single V-Butt Weld Integrity Determination

    International Nuclear Information System (INIS)

    Amry Amin Abas; Mohd Kamal Shah Shamsudin; Norhazleena Azaman

    2015-01-01

    Phased Array Ultrasonic Testing (PAUT) utilizes arrays of piezoelectric elements that are embedded in an epoxy base. The benefit of such an array is that beam forming, such as steering and focusing the beam front, becomes possible. This enables scanning patterns such as linear scan, sectorial scan and depth focusing scan to be performed. Ultrasonic phased array systems can potentially be employed in almost any test where conventional ultrasonic flaw detectors have traditionally been used. Weld inspection and crack detection are the most important applications, and these tests are done across a wide range of industries including aerospace, power generation, petrochemical, metal billet and tubular goods suppliers, pipeline construction and maintenance, structural metals, and general manufacturing. Phased arrays can also be effectively used to profile remaining wall thickness in corrosion survey applications. The benefits of PAUT are simplified inspection of components of complex geometry, inspection of components with limited access, testing of welds with multiple angles from a single probe, and increased probability of detection with improved signal-to-noise ratio. This paper compares the results of inspection of several specimens using PAUT with those from digital radiography. The specimens are carbon steel plates joined by a single V-butt weld. Digital radiography is done using a blue imaging plate with an X-ray source. PAUT is done using an Olympus MX2 with a 5 MHz probe consisting of 64 elements. The location, size and length of defects are compared. (author)
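
    Beam steering of the kind this abstract describes is commonly achieved by firing the elements of a linear array with a constant time step between neighbours. The sketch below computes such a delay law under simplifying assumptions: direct contact with the test piece (no wedge), a uniform element pitch, and a single sound velocity. The 64-element count comes from the abstract; the 0.6 mm pitch, the 30 degree steering angle, the 5900 m/s velocity (a typical longitudinal value for steel) and the function name are assumed for illustration and are not stated in the source.

```python
import math

def steering_delays(n_elements, pitch_m, angle_deg, velocity_m_s):
    """Per-element firing delays in seconds for steering a linear array.

    Neighbouring elements differ by pitch * sin(angle) / velocity.
    The result is shifted so that the earliest element fires at t = 0,
    which keeps every delay non-negative for either steering direction.
    """
    step = pitch_m * math.sin(math.radians(angle_deg)) / velocity_m_s
    raw = [i * step for i in range(n_elements)]
    offset = min(raw)
    return [d - offset for d in raw]

# Assumed probe and material values; only the element count is from the abstract.
delays = steering_delays(n_elements=64, pitch_m=0.6e-3,
                         angle_deg=30.0, velocity_m_s=5900.0)
```

    A sectorial scan repeats this calculation over a range of angles; focusing adds a per-element term that depends on the focal depth, which is not covered here.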

  11. Should the diagnosis of COPD be based on a single spirometry test?

    Science.gov (United States)

    Schermer, Tjard R; Robberts, Bas; Crockett, Alan J; Thoonen, Bart P; Lucas, Annelies; Grootens, Joke; Smeele, Ivo J; Thamrin, Cindy; Reddel, Helen K

    2016-09-29

    Clinical guidelines indicate that a chronic obstructive pulmonary disease (COPD) diagnosis is made from a single spirometry test. However, long-term stability of diagnosis based on forced expiratory volume in 1 s over forced vital capacity (FEV1/FVC) ratio has not been reported. In primary care subjects at risk for COPD, we investigated shifts in diagnostic category (obstructed/non-obstructed). The data were from symptomatic 40+ years (ex-)smokers referred for diagnostic spirometry, with three spirometry tests, each 12±2 months apart. The obstruction was based on post-bronchodilator FEV1/FVC smokers or SABA users at year 1. Change from non-obstructed to obstructed was more likely for males, older subjects, current smokers and patients with lower baseline FEV1 % predicted, and less likely for those with higher baseline BMI. Up to one-third of symptomatic (ex-)smokers with baseline obstruction on diagnostic spirometry had shifted to non-obstructed when routinely re-tested after 1 or 2 years. Given the implications for patients and health systems of a diagnosis of COPD, it should not be based on a single spirometry test.

  12. A closed-form analytical solution for thermal single-well injection-withdrawal tests

    Science.gov (United States)

    Jung, Yoojin; Pruess, Karsten

    2012-03-01

    Thermal single-well injection-withdrawal (SWIW) tests entail pumping cold water into a hot and usually fractured reservoir, and monitoring the temperature recovery during subsequent backflow. Such tests have been proposed as a potential means to characterize properties of enhanced geothermal systems (EGS), such as fracture spacing, connectivity, and porosity. In this paper we develop an analytical solution for thermal SWIW tests, using an idealized model of a single vertical fracture with linear flow geometry embedded in impermeable conductive wall rocks. The analytical solution shows that the time dependence of temperature recovery is dominated by the heat exchange between fracture and matrix rock, but strong thermal diffusivities of rocks as compared to typical solute diffusivities are not necessarily advantageous for characterizing fracture-matrix interactions. The effect of fracture aperture on temperature recovery during backflow is weak, particularly when the fracture aperture is smaller than 0.1 cm. The solution also shows that temperature recovery during backflow is independent of the applied injection and backflow rates. This surprising result implies that temperature recovery is independent of the height of the fracture, or the specific fracture-matrix interface areas per unit fracture length, suggesting that thermal SWIW tests will not be able to characterize fracture growth that may be achieved by stimulation treatments.

  13. The use of an item response theory-based disability item bank across diseases: accounting for differential item functioning.

    Science.gov (United States)

    Weisscher, Nadine; Glas, Cees A; Vermeulen, Marinus; De Haan, Rob J

    2010-05-01

    There is not a single universally accepted activity of daily living (ADL) instrument available to compare disability assessments across different patient groups. We developed a generic item bank of ADL items using item response theory, the Academic Medical Center Linear Disability Scale (ALDS). When comparing outcomes of the ALDS between patient groups, item characteristics of the ALDS should be comparable across groups. The aim of the study was to assess the differential item functioning (DIF) in a group of patients with various disorders to investigate the comparability across these groups. Cross-sectional, multicenter study including 1,283 in- and outpatients with a variety of disorders and disability levels. The sample was divided into two groups: (1) mainly neurological patients (n=497; vascular medicine, Parkinson's disease and neuromuscular disorders) and (2) patients from internal medicine (n=786; pulmonary diseases, chronic pain, rheumatoid arthritis, and geriatric patients). Eighteen of 72 ALDS items showed statistically significant DIF (P<0.01). However, the DIF could effectively be modeled by the introduction of disease-specific parameters. In the subgroups studied, DIF could be modeled in such a way that the ensemble of the items comprised a scale applicable in both groups.

  14. An analysis of differential item functioning by gender in the Learning Disability Screening Questionnaire (LDSQ).

    Science.gov (United States)

    Murray, Aja Louise; Booth, Tom; McKenzie, Karen

    2015-04-01

    The Learning Disability Screening Questionnaire (LDSQ; McKenzie & Paxton, 2006) was developed as a brief screen for intellectual disability. Although several previous studies have evaluated the LDSQ with respect to its utility as a clinical and research tool, no studies have considered the fairness of the test across males and females. In the current study we therefore used a multi-group item response theory approach to assess differential item functioning across gender in a sample of 211 males and 132 females assessed in clinical and forensic settings. Although the test did not show evidence of differential item functioning by gender, it was necessary to exclude one item due to estimation problems and to combine two very highly related items (concerning reading and writing ability) into a single literacy item. Thus, in addition to being generally supportive of the utility of the LDSQ, our results also highlight possible areas of weakness in the tool and suggest possible amendments that could be made to test content to improve the test in future revisions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. A Study on Detecting of Differential Item Functioning of PISA 2006 Science Literacy Items in Turkish and American Samples

    Science.gov (United States)

    Çikirikçi Demirtasli, Nükhet; Ulutas, Seher

    2015-01-01

    Problem Statement: Item bias occurs when individuals from different groups (different gender, cultural background, etc.) have different probabilities of responding correctly to a test item despite having the same skill levels. It is important that tests or items do not have bias in order to ensure the accuracy of decisions taken according to test…

  16. Single-dose Intravenous Toxicology Testing of Daebohwalryeok Pharmacopuncture in Sprague-Dawley Rats.

    Science.gov (United States)

    Sun, Seung-Ho; Park, Sunju; Jeong, Jong-Jin; Lee, Kwang-Ho; Yu, Jun-Sang; Seo, Hyung-Sik; Kwon, Ki-Rok

    2015-06-01

    The aims of the study were to test the single-dose intravenous toxicity of Daebohwalryeok pharmacopuncture (DHRP) in Sprague-Dawley (SD) rats and to estimate the crude lethal dose. The experiments were conducted at Biotoxtech Co., a Good Laboratory Practice (GLP) laboratory, according to the GLP regulation and were approved by the Institutional Animal Care and Use Committee of Biotoxtech Co. (Approval no: 110156). The rats were divided into three groups: DHRP was injected into the rats in the two test groups at doses of 10 mL/kg and 20 mL/kg, respectively, and normal saline solution was injected into the rats in the control group. Single doses of DHRP were injected intravenously into 6-week-old SD rats (5 male and 5 female rats per group). General symptoms were observed and weights were measured during the 14-day observation period after the injection. After the observation period, necropsies were done. Then, histopathological tests were performed. Weight data were analyzed with a one-way analysis of variance (ANOVA) by using statistical analysis system (SAS, version 9.2). No deaths and no statistically significant weight changes were observed for either male or female SD rats in either the control or the test groups during the observation period. In addition, no treatment-related general symptoms or necropsy abnormalities were observed. Histopathological results showed no DHRP-related effects in the 20 mL/kg DHRP group for either male or female rats. Under the conditions of this study, the results from single-dose intravenous injections of DHRP showed that estimated lethal doses for both male and female rats were above 20 mL/kg.

  17. Single-dose Intravenous Toxicology Testing of Daebohwalryeok Pharmacopuncture in Sprague-Dawley Rats

    Directory of Open Access Journals (Sweden)

    Seung-Ho Sun

    2015-06-01

    Objectives: The aims of the study were to test the single-dose intravenous toxicity of Daebohwalryeok pharmacopuncture (DHRP) in Sprague-Dawley (SD) rats and to estimate the crude lethal dose. Methods: The experiments were conducted at Biotoxtech Co., a Good Laboratory Practice (GLP) laboratory, according to the GLP regulation and were approved by the Institutional Animal Care and Use Committee of Biotoxtech Co. (Approval no: 110156). The rats were divided into three groups: DHRP was injected into the rats in the two test groups at doses of 10 mL/kg and 20 mL/kg, respectively, and normal saline solution was injected into the rats in the control group. Single doses of DHRP were injected intravenously into 6-week-old SD rats (5 male and 5 female rats per group). General symptoms were observed and weights were measured during the 14-day observation period after the injection. After the observation period, necropsies were done. Then, histopathological tests were performed. Weight data were analyzed with a one-way analysis of variance (ANOVA) by using statistical analysis system (SAS, version 9.2). Results: No deaths and no statistically significant weight changes were observed for either male or female SD rats in either the control or the test groups during the observation period. In addition, no treatment-related general symptoms or necropsy abnormalities were observed. Histopathological results showed no DHRP-related effects in the 20 mL/kg DHRP group for either male or female rats. Conclusion: Under the conditions of this study, the results from single-dose intravenous injections of DHRP showed that estimated lethal doses for both male and female rats were above 20 mL/kg.

  18. Are single-well "push-pull" tests suitable tracer methods for aquifer characterization?

    Science.gov (United States)

    Hebig, Klaus; Zeilfelder, Sarah; Ito, Narimitsu; Machida, Isao; Scheytt, Traugott; Marui, Atsunao

    2013-04-01

    Recently, investigations were conducted for geological and hydrogeological characterisation of the sedimentary coastal basin of Horonobe (Hokkaido, Japan). Coastal areas are typical geological settings in Japan, which are less tectonically active than the mountain ranges. In Asia, and especially in Japan, these areas are often densely populated. Therefore, it is important to investigate the behaviour of solutes in such unconsolidated aquifers. In such settings sometimes only single boreholes or groundwater monitoring wells are available for aquifer testing for various reasons, e.g. depths of more than 100 m below ground level and slow groundwater velocities due to density driven flow. A standard tracer test involving several groundwater monitoring wells is generally very difficult or even impossible at these depths. One of the most important questions in our project was how we can obtain information about chemical and hydraulic properties in such aquifers. Is it possible to characterize solute transport behaviour parameters with only one available groundwater monitoring well or borehole? A so-called "push-pull" test may be one suitable method for aquifer testing with only one available access point. In a push-pull test a known amount of several solutes including a conservative tracer is injected into the aquifer ("push") and afterwards extracted ("pull"). The breakthrough curve measured during the pumping back phase can then be analysed. This method has already been used previously with various aims, also in the recent project (e.g. Hebig et al. 2011, Zeilfelder et al. 2012). However, different test setups produced different tracer breakthrough curves. As no systematic evaluation of this aquifer tracer test method has been done so far, nothing is known about its repeatability. Do the injection and extraction rates influence the shape of the breakthrough curve? What role is played by the often applied "chaser", which is used to push the test solution out from the

  19. Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

    Science.gov (United States)

    He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei

    2013-01-01

    Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

  20. Use of an Inclusive Option and the Optimal Number of Options for Multiple-Choice Items.

    Science.gov (United States)

    Crehan, Kevin D.; And Others

    1993-01-01

    Studies with 220 college students found that multiple-choice test items with 3 options are more difficult than those with 4 options, and items with the none-of-these option are more difficult than those without this option. Neither format manipulation affected item discrimination. Implications for test construction are discussed. (SLD)

  1. Firefly Optimization and Mathematical Modeling of a Vehicle Crash Test Based on Single-Mass

    Directory of Open Access Journals (Sweden)

    Andreas Klausen

    2014-01-01

    In this paper mathematical modeling of a vehicle crash test based on a single-mass is studied. The model under consideration consists of a single-mass coupled with a spring and/or a damper. The parameters for the spring and damper are obtained by analyzing the measured acceleration in the center of gravity of the vehicle during a crash. A model with a nonlinear spring and damper is also proposed and the parameters will be optimized with different damper and spring characteristics and optimization algorithms. The optimization algorithms used are the interior-point and firefly algorithms. The objective of this paper is to compare different methods used to establish a simple model of a car crash and validate the results against real crash data.
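
    To make the single-mass idea concrete, the following is a minimal sketch and not the authors' model: it assumes a linear spring and a linear damper in parallel (often called a Kelvin model), so that the crush x obeys m*x'' + c*x' + k*x = 0 with x(0) = 0 and x'(0) equal to the impact speed. The function names and all parameter values are hypothetical, and the sketch only simulates the response; it does not identify parameters from measured acceleration as the paper does.

```python
import math

def kelvin_crash_response(mass, damping, stiffness, impact_speed, t):
    """Return (crush, velocity) at time t for m*x'' + c*x' + k*x = 0,
    x(0) = 0, x'(0) = impact_speed. Only the underdamped case is handled."""
    omega_n = math.sqrt(stiffness / mass)                 # natural frequency [rad/s]
    zeta = damping / (2.0 * math.sqrt(stiffness * mass))  # damping ratio [-]
    if zeta >= 1.0:
        raise ValueError("sketch covers only the underdamped case (zeta < 1)")
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)        # damped frequency [rad/s]
    decay = math.exp(-zeta * omega_n * t)
    crush = impact_speed / omega_d * decay * math.sin(omega_d * t)
    velocity = impact_speed * decay * (
        math.cos(omega_d * t) - zeta * omega_n / omega_d * math.sin(omega_d * t)
    )
    return crush, velocity

def time_of_max_crush(mass, damping, stiffness):
    """Time at which the velocity first reaches zero (maximum dynamic crush)."""
    omega_n = math.sqrt(stiffness / mass)
    zeta = damping / (2.0 * math.sqrt(stiffness * mass))
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    # Velocity vanishes when tan(omega_d * t) = omega_d / (zeta * omega_n).
    return math.atan2(omega_d, zeta * omega_n) / omega_d

# Hypothetical parameters, not taken from the paper.
M, C, K, V0 = 1000.0, 10000.0, 400000.0, 10.0   # kg, N*s/m, N/m, m/s
t_peak = time_of_max_crush(M, C, K)
peak_crush, peak_velocity = kelvin_crash_response(M, C, K, V0, t_peak)
```

    In a fitting workflow of the kind the abstract describes, an optimizer would adjust the damping and stiffness values until the simulated response matches the measured crash pulse; that step is left out here.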

  2. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  3. Single-Event Effect Testing of the Linear Technology LTC6103HMS8#PBF Current Sense Amplifier

    Science.gov (United States)

    Yau, Ka-Yen; Campola, Michael J.; Wilcox, Edward

    2016-01-01

    The LTC6103HMS8#PBF (henceforth abbreviated as LTC6103) current sense amplifier from Linear Technology was tested for both destructive and non-destructive single-event effects (SEE) using the heavy-ion cyclotron accelerator beam at the Lawrence Berkeley National Laboratory (LBNL) Berkeley Accelerator Space Effects (BASE) facility. During testing, the input voltages and output currents were monitored to detect single-event latch-up (SEL) and single-event transients (SETs).

  4. Fractal and Morphological Characteristics of Single Marble Particle Crushing in Uniaxial Compression Tests

    Directory of Open Access Journals (Sweden)

    Yidong Wang

    2015-01-01

    Crushing of rock particles is a phenomenon commonly encountered in geotechnical engineering practice. It is however difficult to study the crushing of rock particles using classical theory because the physical structure of the particles is complex and irregular. This paper aims at evaluating fractal and morphological characteristics of single rock particles. A large number of particle crushing tests are conducted on single rock particles. The force-displacement curves and the particle size distributions (PSD) of crushed particles are analysed based on particle crushing tests. Particle shape plays an important role in both the micro- and macroscale responses of a granular assembly. The PSDs of an assortment of rocks are analysed by fractal methods, and the fractal dimension is obtained. A theoretical formula for particle crushing strength is derived, utilising the fractal model, and a simple method is proposed for predicting the probability of particle survival based on the Weibull statistics. Based on a few physical assumptions, simple equations are derived for determining particle crushing energy. The results of applying these equations are tested against the actual experimental data and prove to be very consistent. Fractal theory is therefore applicable for analysis of particle crushing.
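
    The abstract does not give its survival formula. As a hedged illustration only, the sketch below uses the standard two-parameter Weibull survival function, P_s = exp(-(stress / sigma0)^m), where sigma0 is the characteristic stress and m the Weibull modulus. Whether the paper uses exactly this form (for example, without a particle-size term) is an assumption, and the function name and numbers are hypothetical.

```python
import math

def survival_probability(stress, sigma0, modulus):
    """Two-parameter Weibull survival probability of a particle under `stress`.

    sigma0  : characteristic stress at which about 37% of particles survive
    modulus : Weibull modulus m; larger values mean less scatter in strength
    """
    return math.exp(-((stress / sigma0) ** modulus))

# Hypothetical values, not taken from the paper: sigma0 = 20 MPa, m = 3.
probabilities = [survival_probability(s, 20.0, 3.0) for s in (5.0, 20.0, 40.0)]
```

    At a stress equal to sigma0 the function returns exp(-1), which is how sigma0 is usually read off an experimental survival curve.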

  5. Exploratory Item Classification Via Spectral Graph Clustering.

    Science.gov (United States)

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2017-01-01

    Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as hierarchical clustering, the K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire.
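
    The two steps named in the abstract (build an item graph, then read clusters off its structure) can be sketched as follows. This is a generic two-cluster spectral sketch, not the authors' algorithm: the similarity measure (absolute Pearson correlation), the restriction to two clusters, the power iteration, and all names are assumptions made for illustration. It also assumes complete data and non-constant items, so it does not show the missing-data handling the abstract mentions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long score sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spectral_bipartition(responses, iterations=500):
    """Split the items (columns) of a persons-by-items score table into two clusters."""
    items = list(zip(*responses))          # one tuple of scores per item
    n = len(items)
    # Step 1: item graph; edge weight = absolute inter-item correlation.
    w = [[abs(pearson(items[i], items[j])) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Step 2: normalized affinity S = D^-1/2 W D^-1/2.
    # Its leading eigenvector is proportional to sqrt(degree).
    root_deg = [math.sqrt(sum(row)) for row in w]
    s = [[w[i][j] / (root_deg[i] * root_deg[j]) for j in range(n)] for i in range(n)]
    norm = math.sqrt(sum(r * r for r in root_deg))
    top = [r / norm for r in root_deg]
    # Step 3: power iteration on S + I with the leading eigenvector projected
    # out, which converges to the second eigenvector of S.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        v = [sum(s[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Step 4: items whose entries share a sign are placed in the same cluster.
    return [1 if a > 0 else 0 for a in v]

# Hypothetical toy data: 8 respondents, 4 items; items 0-1 and items 2-3
# are constructed to move together, with no correlation across the pairs.
toy_responses = [
    [0, 1, 0, 2], [0, 1, 1, 4], [1, 3, 0, 2], [1, 3, 1, 4],
    [0, 1, 0, 2], [0, 1, 1, 4], [1, 3, 0, 2], [1, 3, 1, 4],
]
labels = spectral_bipartition(toy_responses)
```

    On the toy data, items 0 and 1 receive one label and items 2 and 3 the other. For more than two clusters, a common extension is to keep several eigenvectors and cluster their rows, for instance with K-means.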

  6. Tidal Volume Single Breath Washout of Two Tracer Gases - A Practical and Promising Lung Function Test

    Science.gov (United States)

    Singer, Florian; Stern, Georgette; Thamrin, Cindy; Fuchs, Oliver; Riedel, Thomas; Gustafsson, Per; Frey, Urs; Latzin, Philipp

    2011-01-01

    Background: Small airway disease frequently occurs in chronic lung diseases and may cause ventilation inhomogeneity (VI), which can be assessed by washout tests of inert tracer gas. Using two tracer gases with unequal molar mass (MM) and diffusivity increases specificity for VI in different lung zones. Currently washout tests are underutilised due to the time and effort required for measurements. The aim of this study was to develop and validate a simple technique for a new tidal single breath washout test (SBW) of sulfur hexafluoride (SF6) and helium (He) using an ultrasonic flowmeter (USFM). Methods: The tracer gas mixture contained 5% SF6 and 26.3% He, had a total MM similar to that of air, and was applied for a single tidal breath in 13 healthy adults. The USFM measured MM, which was then plotted against expired volume. USFM and mass spectrometer signals were compared in six subjects performing three SBW. Repeatability and reproducibility of SBW, i.e., area under the MM curve (AUC), were determined in seven subjects performing three SBW 24 hours apart. Results: USFM reliably measured MM during all SBW tests (n = 60). MM from USFM reflected SF6 and He washout patterns measured by mass spectrometer. USFM signals were highly associated with mass spectrometer signals, e.g., for MM, linear regression r-squared was 0.98. Intra-subject coefficient of variation of AUC was 6.8%, and coefficient of repeatability was 11.8%. Conclusion: The USFM accurately measured relative changes in SF6 and He washout. SBW tests were repeatable and reproducible in healthy adults. We have developed a fast, reliable, and straightforward USFM based SBW method, which provides valid information on SF6 and He washout patterns during tidal breathing. PMID: 21423739

  7. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    Directory of Open Access Journals (Sweden)

    Yoon Soo Park

    2016-02-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.

  8. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    Science.gov (United States)

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  9. Are Inferential Reading Items More Susceptible to Cultural Bias than Literal Reading Items?

    Science.gov (United States)

    Banks, Kathleen

    2012-01-01

    The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…

  10. Factor Analysis of Multidimensional Polytomous Item Response Data Suffering from Ignorable Item Nonresponse.

    Science.gov (United States)

    Bernaards, Coen A.; Sijtsma, Klaas

    1999-01-01

    Used simulation to study the problem of missing item responses in tests and questionnaires when factor analysis is used to study the structure of the items. Factor loadings based on the EM algorithm best approximated the loading structure, with imputation of the mean per person across the scores for that person being the best alternative. (SLD)

  11. Respiratory Symptoms Items from the COPD Assessment Test Identify Ever-Smokers with Preserved Lung Function at Higher Risk for Poor Respiratory Outcomes. An Analysis of the Subpopulations and Intermediate Outcome Measures in COPD Study Cohort.

    Science.gov (United States)

    Martinez, Carlos H; Murray, Susan; Barr, R Graham; Bleecker, Eugene; Bowler, Russell P; Christenson, Stephanie A; Comellas, Alejandro P; Cooper, Christopher B; Couper, David; Criner, Gerard J; Curtis, Jeffrey L; Dransfield, Mark T; Hansel, Nadia N; Hoffman, Eric A; Kanner, Richard E; Kleerup, Eric; Krishnan, Jerry A; Lazarus, Stephen C; Leidy, Nancy K; O'Neal, Wanda; Martinez, Fernando J; Paine, Robert; Rennard, Stephen I; Tashkin, Donald P; Woodruff, Prescott G; Han, MeiLan K

    2017-05-01

    Ever-smokers without airflow obstruction who score greater than or equal to 10 on the COPD Assessment Test (CAT) still have frequent acute respiratory disease events (exacerbation-like), impaired exercise capacity, and imaging abnormalities. Identification of these subjects could provide new opportunities for targeted interventions. We hypothesized that the four respiratory-related items of the CAT might be useful for identifying such individuals, with discriminative ability similar to CAT, which is an eight-item questionnaire used to assess chronic obstructive pulmonary disease impact, including nonrespiratory questions, with scores ranging from 0 to 40. We evaluated ever-smoker participants in the Subpopulations and Intermediate Outcomes in COPD Study without airflow obstruction (FEV1/FVC ≥0.70; FVC above the lower limit of normal). Using the area under the receiver operating characteristic curve, we compared responses to both CAT and the respiratory symptom-related CAT items (cough, phlegm, chest tightness, and breathlessness) and their associations with longitudinal exacerbations. We tested agreement between the two strategies (κ statistic), and we compared demographics, lung function, and symptoms among subjects identified as having high symptoms by each strategy. Among 880 ever-smokers with normal lung function (mean age, 61 yr; 52% women) and using a CAT cutpoint greater than or equal to 10, we classified 51.8% of individuals as having high symptoms, 15.3% of whom experienced at least one exacerbation during 1-year follow-up. After testing sensitivity and specificity of different scores for the first four questions to predict any 1-year follow-up exacerbation, we selected cutpoints of 0-6 as representing a low burden of symptoms versus scores of 7 or higher as representing a high burden of symptoms for all subsequent comparisons. The four respiratory-related items with cutpoint greater than or equal to 7 selected 45.8% of participants, 15.6% of whom

  12. Summary of Grout Development and Testing for Single Shell Tank Closure at Hanford

    Energy Technology Data Exchange (ETDEWEB)

    Harbour, John R.

    2005-04-28

    This report is a summary of the bench-scale and large scale experimental studies performed by Savannah River National Laboratory for CH2M HILL to develop grout design mixes for possible use in producing fill materials as a part of Tank Closure of the Single-Shell Tanks at Hanford. The grout development data provided in this report demonstrates that these design mixes will produce fill materials that are ready for use in Hanford single shell tank closure. The purpose of this report is to assess the ability of the proposed grout specifications to meet the current requirements for successful single shell tank closure which will include the contracting of services for construction and operation of a grout batch plant. The research and field experience gained by SRNL in the closure of Tanks 17F and 20F at the Savannah River Site was leveraged into the grout development efforts for Hanford. It is concluded that the three Hanford grout design mixes provide fill materials that meet the current requirements for successful placement. This conclusion is based on the completion of recommended testing using Hanford area materials by the operators of the grout batch plant. This report summarizes the regulatory drivers and the requirements for grout mixes as tank fill material. It is these requirements for both fresh and cured grout properties that drove the development of the grout formulations for the stabilization, structural and capping layers.

  13. Experimental study on the single event effects in pulse width modulators by laser testing

    International Nuclear Information System (INIS)

    Zhao Wen; Guo Xiaoqiang; Chen Wei; Guo Hongxia; Lin Dongsheng; Luo Yinhong; Ding Lili; Wang Yuanming; Wang Hanning

    2015-01-01

    This paper presents single event effect (SEE) characteristics of UC1845AJ pulse width modulators (PWMs) by laser testing. In combination with analysis to map PWM circuitry in the microchip dies, the typical SEE response waveforms for laser pulses located in different circuit blocks of UC1845AJ are obtained and the SEE mechanisms are analyzed. The laser SEE test results show that there are some differences in the SEE mechanisms of different circuit blocks, and phase shifts or changes in the duty cycles of a few output pulses are the main SEE behaviors for UC1845AJ. In addition, a new SEE behavior which manifests as changes in the duty cycles of many output pulses is revealed. This means that an SEE hardened design should be considered. (paper)

  14. Identification of candidate children for maturity-onset diabetes of the young type 2 (MODY2) gene testing: a seven-item clinical flowchart (7-iF).

    Directory of Open Access Journals (Sweden)

    Michele Pinelli

    MODY2 is the most prevalent monogenic form of diabetes in Italy, with an estimated prevalence of about 0.5-1.5%. MODY2 is potentially indistinguishable from other forms of diabetes; however, its identification affects patients' quality of life and healthcare resources. Unfortunately, direct DNA sequencing as a diagnostic test is expensive and not readily accessible. In addition, current guidelines, which aim to establish when the test should be performed, have shown a poor detection rate. The aim of this study is to propose a reliable and easy-to-use tool to identify candidate patients for MODY2 genetic testing. We designed and validated a diagnostic flowchart in an attempt to improve the detection rate and to increase the number of properly requested tests. The flowchart, called 7-iF, consists of 7 binary "yes or no" questions, and its unequivocal output is an indication of whether or not to test. We tested the 7-iF to estimate its clinical utility in comparison to clinical suspicion alone. The 7-iF, in a prospective 2-year study (921 diabetic children), showed a precision of about 76%. Using retrospective data, the 7-iF showed a precision in identifying MODY2 patients of about 80%, compared to 40% for clinical suspicion. On the other hand, despite a relatively high number of missed MODY2 patients, the 7-iF would not suggest the test for 90% of the non-MODY2 patients, demonstrating that a wide application of this method might (1) help less experienced clinicians in suspecting MODY2 patients and (2) reduce the number of unnecessary tests. With the 7-iF, a clinician can feel confident of identifying a potential case of MODY2 and suggest the molecular test without fear of wasting time and money. A QALY-type analysis estimated an increase in the patients' quality of life and savings for the health care system of about 9 million euros per year.

  15. Evaluation of canine adverse food reactions by patch testing with single proteins, single carbohydrates and commercial foods.

    Science.gov (United States)

    Johansen, Cornelia; Mariani, Claire; Mueller, Ralf S

    2017-10-01

    Adverse food reaction (AFR) is an important differential diagnosis for the pruritic dog. It is usually diagnosed by feeding an elimination diet with a novel protein and carbohydrate source for eight weeks followed by subsequent food provocation. A previous study demonstrated that patch testing dogs with foods had a high sensitivity and negative predictability for selection of elimination diet ingredients. The aim of this study was to investigate patch testing with proteins, carbohydrates and dry commercial dog food in dogs to determine whether there was value in patch testing to aid the diagnosis of canine adverse food reaction. Twenty-five privately owned dogs, with confirmed AFR, underwent provocation trials with selected food antigens and patch testing. For proteins, carbohydrates and dry dog food the sensitivity of patch testing was 100%, 70% and 22.2%, respectively; the negative predictive values of patch testing were 100%, 79% and 72%. The positive predictive values of patch testing for proteins and carbohydrates were 75% and 74%, respectively. This study confirmed that patch testing may be useful for the selection of a suitable protein source for an elimination diet in dogs with suspected AFR, but not as a diagnostic tool for canine AFR. Results for proteins are more reliable than for carbohydrates and the majority of positive patch test reactions were observed with raw protein. Patch testing with commercial dog food does not seem to be useful. © 2017 ESVD and ACVD.
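
    For readers less familiar with the reported measures, the sketch below shows how sensitivity and the predictive values follow from a 2x2 table of patch-test result against provocation outcome. The function name is hypothetical, and the counts are invented purely so that the ratios come out at 100%, 75% and 100%; they are not the study's data, which the abstract does not report.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures.

    tp: test positive, reaction on provocation   fp: test positive, no reaction
    fn: test negative, reaction on provocation   tn: test negative, no reaction
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true reactions that test positive
        "specificity": tn / (tn + fp),   # share of non-reactions that test negative
        "ppv": tp / (tp + fp),           # share of positive tests that are true reactions
        "npv": tn / (tn + fn),           # share of negative tests that are true non-reactions
    }

# Invented counts for illustration only (not from the study).
example = diagnostic_metrics(tp=9, fp=3, fn=0, tn=13)
```

    With no false negatives, both sensitivity and negative predictive value equal 1.0, which is the pattern behind using a negative patch test to pick elimination-diet ingredients.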

  16. Design of a single-borehole hydraulic test programme allowing for interpretation-based errors

    International Nuclear Information System (INIS)

    Black, J.H.

    1987-07-01

    Hydraulic testing using packers in single boreholes is one of the most important sources of data to safety assessment modelling in connection with the disposal of radioactive waste. It is also one of the most time-consuming and expensive. It is important that the results are as reliable as possible and as accurate as necessary for the use that is made of them. There are many causes of possible error and inaccuracy ranging from poor field practice to inappropriate interpretation procedure. The report examines and attempts to quantify the size of error arising from the accidental use of an inappropriate or inadequate interpretation procedure. In doing so, it can be seen which interpretation procedure or combination of procedures results in least error. Lastly, the report attempts to use the previous conclusions from interpretation to propose forms of field test procedure where interpretation-based errors will be minimised. Hydraulic tests (sometimes known as packer tests) come in three basic forms: slug/pulse, constant flow and constant head. They have different characteristics, some measuring a variable volume of rock (dependent on hydraulic conductivity) and some having a variable duration (dependent on hydraulic conductivity). A combination of different tests in the same interval is seen as desirable. For the purposes of assessing interpretation-based errors, slug and pulse tests are considered together as are constant flow and constant head tests. The same method is used in each case to assess errors. The method assumes that the simplest analysis procedure (cylindrical flow in homogeneous isotropic porous rock) will be used on each set of field data. The error is assessed by calculating synthetic data for alternative configurations (e.g. fissured rock, anisotropic rock, inhomogeneous rock - i.e. skin - etc.) and then analyzing this data using the simplest analysis procedure. 28 refs., 26 figs

  17. SHIPPING OF RADIOACTIVE ITEMS

    CERN Document Server

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  18. The Long-Term Conditions Questionnaire: conceptual framework and item development.

    Science.gov (United States)

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A'Court, Christine; Fitzpatrick, Ray

    2016-01-01

    To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey.

  19. Exploring Differential Effects across Two Decoding Treatments on Item-Level Transfer in Children with Significant Word Reading Difficulties: A New Approach for Testing Intervention Elements

    Science.gov (United States)

    Steacy, Laura M.; Elleman, Amy M.; Lovett, Maureen W.; Compton, Donald L.

    2016-01-01

    In English, gains in decoding skill do not map directly onto increases in word reading. However, beyond the Self-Teaching Hypothesis, little is known about the transfer of decoding skills to word reading. In this study, we offer a new approach to testing specific decoding elements on transfer to word reading. To illustrate, we modeled word-reading…

  20. Testing the Fundamental Difference Hypothesis: L2 Adult, L2 Child, and L1 Child Comparisons in the Acquisition of Korean "Wh"-Constructions with Negative Polarity Items

    Science.gov (United States)

    Song, Hyang Suk; Schwartz, Bonnie D.

    2009-01-01

    The fundamental difference hypothesis (FDH; Bley-Vroman, 1989, 1990) contends that the nature of language in natives is fundamentally different from the nature of language in adult nonnatives. This study tests the FDH in two ways: (a) via second language (L2) poverty-of-the-stimulus (POS) problems (e.g., Schwartz & Sprouse, 2000) and (b) via a…