WorldWideScience

Sample records for survey items measuring

  1. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    Science.gov (United States)

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.

  2. Surveillance indicators for potential reduced exposure products (PREPs): developing survey items to measure awareness

    Directory of Open Access Journals (Sweden)

    McNeill Ann

    2009-10-01

    Background: Over the past decade, tobacco companies have introduced cigarettes and smokeless tobacco products (known as Potential Reduced Exposure Products, PREPs) with purportedly lower levels of some toxins than conventional cigarettes and smokeless products. It is essential that public health agencies monitor awareness, interest, use, and perceptions of these products so that their impact on population health can be detected at the earliest stages. Methods: This paper reviews and critiques existing strategies for measuring awareness of PREPs from 16 published and unpublished studies. From these measures, we developed new surveillance items and subjected them to two rounds of cognitive testing, a common and accepted method for evaluating questionnaire wording. Results: Our review suggests that high levels of awareness of PREPs reported in some studies are likely to be inaccurate. Two likely sources of inaccuracy in awareness measures were identified: (1) the tendency of respondents to misclassify "no additive" and "natural" cigarettes as PREPs and (2) the tendency of respondents to mistakenly report awareness as a result of confusion between PREP brands and similarly named familiar products, for example, Eclipse chewing gum and Accord automobiles. Conclusion: After evaluating new measures with cognitive interviews, we conclude that as of winter 2006, awareness of reduced exposure products among U.S. smokers was likely to be between 1% and 8%, with the higher estimates for some products occurring in test markets. Recommended measurement strategies for future surveys are presented.

  3. Making Meaningful Measurement in Survey Research: The Use of Person and Item Maps

    Science.gov (United States)

    Royal, Kenneth D.

    2009-01-01

    Quality measurement is essential in every form of research, including institutional research and assessment. Unfortunately, most survey research today (both published and unpublished) is lacking with regards to quality measurement. Reporting means and standard deviations based on ordinal measures is an inappropriate, yet widespread practice in the…

  4. A preference-based measure of health: the VR-6D derived from the veterans RAND 12-Item Health Survey.

    Science.gov (United States)

    Selim, Alfredo J; Rogers, William; Qian, Shirley X; Brazier, John; Kazis, Lewis E

    2011-10-01

    The Veterans RAND 12-Item Health Survey (VR-12) is currently the major endpoint used in the Medicare managed care outcomes measure in the Healthcare Effectiveness Data and Information Set (HEDIS®), referred to as the Health Outcomes Survey (HOS). The purpose of this study is to adapt the Brazier SF-6D utility measure to the VR-12 to generate a single utility index. We used the HOS cohorts 2 and 3 for SF-36 data and 9 for VR-12 data. We calculated SF-6D scores from the SF-36 using the algorithms developed by Brazier and colleagues. The values of the Brazier SF-6D were used to estimate utility scores from the VR-12 using a 2-stage mapping procedure; the resulting measure is named the VR-6D. The VR-6D derived from the VR-12 has distributional properties similar to those of the SF-6D. The change in VR-6D showed significant variations across disease groups with different levels of morbidity and mortality. This study produced a utility measure for the VR-12 that is comparable to the SF-6D and responsive to change. The VR-6D can be used in evaluations of health care plans and cost-effectiveness analysis to compare the health gains that health care interventions can achieve.

  5. Development and Reliability of Items Measuring the Nonmedical Use of Prescription Drugs for the Youth Risk Behavior Survey: Results From an Initial Pilot Test

    Science.gov (United States)

    Howard, Melissa M.; Weiler, Robert M.; Haddox, J. David

    2009-01-01

    Background: The purpose of this study was to develop and test the reliability of self-report survey items designed to monitor the nonmedical use of prescription drugs among adolescents. Methods: Eighteen nonmedical prescription drug items designed to be congruent with the substance abuse items in the US Centers for Disease Control and Prevention's…

  6. Unfair items detection in educational measurement

    CERN Document Server

    Bakman, Yefim

    2012-01-01

    Measurement professionals cannot come to an agreement on the definition of the term 'item fairness'. In this paper a continuous measure of item unfairness is proposed. The more the unfairness measure deviates from zero, the less fair the item is. If the measure exceeds the cutoff value, the item is identified as definitely unfair. The new approach can identify unfair items that would not be identified with conventional procedures. The results are in accord with experts' judgments on the item qualities. Because no assumptions are made about score distributions and/or correlations, the method is applicable to any educational test. Its performance is illustrated through application to scores of a real test.

  7. A continuous-scale measure of child development for population-based epidemiological surveys: a preliminary study using Item Response Theory for the Denver Test.

    Science.gov (United States)

    Drachler, Maria de Lourdes; Marshall, Tom; de Carvalho Leite, José Carlos

    2007-03-01

    A method for translating research data from the Denver Test into individual scores of developmental status measured in a continuous scale is presented. It was devised using the Denver Developmental Screening Test (DDST) but can be used for Denver II. The DDST was applied in a community-based survey of 3389 under-5-year-olds in Porto Alegre, Brazil. The items of success were standardised by logistic regression on log chronological age. Each child's ability age was then estimated by maximum likelihood as the age in this reference population corresponding to the child's success and failures in the test. The score of developmental status is the natural logarithm of this ability age divided by chronological age and thus measures the delay or advance in the child's ability age compared with chronological age. This method estimates development status using both difficulty and discriminating power of each item in the reference population, an advantage over scores based on total number of items correctly performed or failed, which depend on difficulty only. The score corresponds with maternal opinion of child developmental status and with the 3-category scale of the DDST. It shows good construct validity, indicated by symmetrical and homogeneous variability from 3 months upwards, and reasonable results in describing gender differences in development by age, the mean score increasing with socio-economic conditions and diminishing among low-birthweight children. If a standardised measure of development status (z-scores) is required, this can be obtained by dividing the score by its standard deviation. Concurrent and discriminant validity of the score must be examined in further studies.
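
    The score described above can be sketched in a few lines. This is a minimal illustration, assuming the phrase "natural logarithm of this ability age divided by chronological age" means ln(ability age / chronological age); the function names and the example ages are hypothetical, and the maximum-likelihood step that estimates the ability age is not reproduced here.

```python
import math


def developmental_score(ability_age: float, chronological_age: float) -> float:
    """Assumed reading of the abstract: ln(ability age / chronological age).

    Zero means ability age equals chronological age; negative values
    indicate delay and positive values indicate advance.
    """
    if ability_age <= 0 or chronological_age <= 0:
        raise ValueError("ages must be positive")
    return math.log(ability_age / chronological_age)


def standardised_score(score: float, score_sd: float) -> float:
    """z-score as described in the abstract: the score divided by its SD."""
    if score_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return score / score_sd


# Hypothetical child: ability age 18 months at chronological age 24 months.
example = developmental_score(18.0, 24.0)  # negative, i.e. a delay
```

    Both ages must be in the same unit (months here); the ratio makes the score unit-free.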

  8. Measuring student learning with item response theory

    Directory of Open Access Journals (Sweden)

    Young-Jin Lee

    2008-01-01

    We investigate short-term learning from hints and feedback in a Web-based physics tutoring system. Both the skill of students and the difficulty and discrimination of items were determined by applying item response theory (IRT) to the first answers of students who are working on for-credit homework items in an introductory Newtonian physics course. We show that after tutoring a shifted logistic item response function with lower discrimination fits the students’ second responses to an item previously answered incorrectly. Student skill decreased by 1.0 standard deviation when students used no tutoring between their (incorrect) first and second attempts, which we attribute to “item-wrong bias.” On average, using hints or feedback increased students’ skill by 0.8 standard deviation. A skill increase of 1.9 standard deviation was observed when hints were requested after viewing, but prior to attempting to answer, a particular item. The skill changes measured in this way will enable the use of IRT to assess students based on their second attempt in a tutoring environment.

  9. Item response theory for measurement validity.

    Science.gov (United States)

    Yang, Frances M; Kao, Solon T

    2014-06-01

    Item response theory (IRT) is an important method of assessing the validity of measurement scales that is underutilized in the field of psychiatry. IRT describes the relationship between a latent trait (e.g., the construct that the scale proposes to assess), the properties of the items in the scale, and respondents' answers to the individual items. This paper introduces the basic premise, assumptions, and methods of IRT. To help explain these concepts we generate a hypothetical scale using three items from a modified, binary (yes/no) response version of the Center for Epidemiological Studies-Depression scale that was administered to 19,399 respondents. We first conducted a factor analysis to confirm the unidimensionality of the three items and then proceeded with Mplus software to construct the 2-Parameter Logistic (2-PL) IRT model of the data, a method which allows for estimates of both item discrimination and item difficulty. The utility of this information both for clinical purposes and for scale construction purposes is discussed.
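
    For readers unfamiliar with the 2-PL model named above, the standard item response function can be sketched as follows. The parameter values are hypothetical and are not estimates from the paper; the paper fitted its model in Mplus, which this sketch does not attempt to reproduce.

```python
import math


def two_pl_probability(theta: float, discrimination: float, difficulty: float) -> float:
    """Probability of endorsing a binary item under the 2-PL model.

    theta          -- respondent's latent trait level
    discrimination -- item slope (how sharply the item separates respondents)
    difficulty     -- item location (trait level with a 50% endorsement chance)
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical item: when theta equals the difficulty, the probability is 0.5
# regardless of the discrimination value.
p_at_difficulty = two_pl_probability(theta=0.0, discrimination=1.5, difficulty=0.0)
```

    A larger discrimination makes the curve steeper around the difficulty, which is why the two parameters carry different information for scale construction.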

  10. Measuring response styles in Likert items.

    Science.gov (United States)

    Böckenholt, Ulf

    2017-03-01

    The recently proposed class of item response tree models provides a flexible framework for modeling multiple response processes. This feature is particularly attractive for understanding how response styles may affect answers to attitudinal questions. Facilitating the disassociation of response styles and attitudinal traits, item response tree models can provide powerful process tests of how different response formats may affect the measurement of substantive traits. In an empirical study, 3 response formats were used to measure the 2-dimensional Personal Need for Structure traits. Different item response tree models are proposed to capture the response styles for each of the response formats. These models show that the response formats give rise to similar trait measures but different response-style effects. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  11. Controlling for rater effects when comparing survey items with incomplete Likert data.

    Science.gov (United States)

    Schulz, E M; Sun, A

    2001-01-01

    The rating scale model (Andrich, 1978) was applied to data from a survey that directed students to rate their satisfaction with college services on a five point Likert scale. Because students used different services, and students were directed to rate only the services they used, the items were differentially exposed to a person factor that we call "pleasability." Differential exposure to pleasability makes items' average rating a biased measure of their performance. In contrast, item parameter estimates in the rating scale model corrected for differential exposure to pleasability. Compared to items' average ratings, item parameter estimates in the rating scale model did a better job of predicting which item received the higher rating when any two items were rated by the same rater.
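
    The category probabilities of the rating scale model can be sketched as below. This is an illustration of the standard Andrich formulation, not the authors' estimation code; the person parameter plays the role of "pleasability", and all numeric values are hypothetical.

```python
import math
from typing import List


def rating_scale_probabilities(
    person: float, item_location: float, thresholds: List[float]
) -> List[float]:
    """Category probabilities (categories 0..M) under the rating scale model.

    thresholds holds tau_1..tau_M, shared by every item; a five-point
    Likert scale therefore needs four thresholds.
    """
    # Cumulative sums of (person - item_location - tau_j); category 0 is fixed at 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (person - item_location - tau))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


def expected_category(person: float, item_location: float, thresholds: List[float]) -> float:
    """Model-expected category (0..M) for one person on one item."""
    probabilities = rating_scale_probabilities(person, item_location, thresholds)
    return sum(k * p for k, p in enumerate(probabilities))


# Hypothetical symmetric thresholds for a five-point scale.
probs = rating_scale_probabilities(0.0, 0.0, [-1.5, -0.5, 0.5, 1.5])
```

    Because the expected category depends on both the person and the item, two items rated by different sets of people can have different raw averages even when their item locations are equal, which is the bias the abstract describes.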

  12. Development of the Quantitative Reasoning Items on the National Survey of Student Engagement

    Directory of Open Access Journals (Sweden)

    Amber D. Dumford

    2015-01-01

    As society’s needs for quantitative skills become more prevalent, college graduates require quantitative skills regardless of their career choices. Therefore, it is important that institutions assess students’ engagement in quantitative activities during college. This study chronicles the process taken by the National Survey of Student Engagement (NSSE) to develop items that measure students’ participation in quantitative reasoning (QR) activities. On the whole, findings across the quantitative and qualitative analyses suggest good overall properties for the developed QR items. The items show great promise to explore and evaluate the frequency with which college students participate in QR-related activities. Each year, hundreds of institutions across the United States and Canada participate in NSSE, and, with the addition of these new items on the core survey, every participating institution will have information on this topic. Our hope is that these items will spur conversations on campuses about students’ use of quantitative reasoning activities.

  13. Measurement Assurance for End-Item Users

    Science.gov (United States)

    Mimbs, Scott M.

    2008-01-01

    The goal of a Quality Management System (QMS) as specified in ISO 9001 and AS9100 is to assure the end product meets specifications and customer requirements. Measuring devices, often called measuring and test equipment (MTE), provide the evidence of product conformity to the prescribed requirements. Therefore the processes which employ MTE can become a weak link to the overall QMS if proper attention is not given to development and execution of these processes. Traditionally, calibration of MTE is given more focus in industry standards and process control efforts than the equally important proper usage of the same equipment. It is a common complaint of calibration laboratory personnel that MTE users are only interested in "a sticker." If the QMS requires the MTE "to demonstrate conformity of the product," then the quality of the measurement process must be adequate for the task. This leads to an ad hoc definition: measurement assurance is a discipline that assures that all processes, activities, environments, standards, and procedures involved in making a measurement produce a result that can be rigorously evaluated for validity and accuracy. To evaluate whether the existing measurement processes are providing an adequate level of quality to support the decisions based upon this measurement data, an understanding of measurement assurance basics is essential. This topic is complementary to the calibration standard, ANSI/NCSL Z540.3-2006, which targets the calibration of MTE at the organizational level. This paper will discuss general measurement assurance when MTE is used to provide evidence of product conformity; therefore, the target audience of this paper is end item users of MTE. A central focus of the paper will be the verification of tolerances and the associated risks, so calibration professionals may find the paper useful in communication with their customers, MTE users.

  14. The Feasibility of Single-Item Measures for Organizational Justice

    Science.gov (United States)

    Jordan, Jeremy S.; Turner, Brian A.

    2008-01-01

    Researchers in a number of disciplines have examined the utility of single-item measures for both affective and cognitive constructs. While these authors have indicated that, under certain circumstances, the use of single-item measures is appropriate, there remains concern regarding the reliability and validity of single-item measures. This study…

  15. A single-item self-rated health measure correlates with objective health status in the elderly: a survey in suburban Beijing

    Directory of Open Access Journals (Sweden)

    Qinqin Meng

    2014-04-01

    Introduction: The measurement of health status of the elderly remains an important topic. Self-rated health status (SRH) is considered to be a simple indicator to measure the health status of the old population. But some researchers still take a skeptical view about its reliability. This study aims to investigate the association between the self-rated health indicator and health status of the elderly and discuss its subsequent public health implications. Methods: In total, 1096 people who were 60 years of age or older from 1784 households from a suburban area of Beijing were interviewed using multistage stratified cluster sampling. SRH was measured by a single question: "Please choose one point in this 0-100 scale which can best represent your health today." The disease status and physical functional status were also obtained. A multiple linear regression was conducted to test the association between SRH and the individual’s disease/functional status. Results: The average of SRH scores of the elderly was 72.49±15.64 (on a 1 to 100 scale). The SRH scores declined not only with the severity of self-reported mental/disease status, but also with the decrease of physical functional status. Multiple linear regression showed that after adjustment for other variables, two-week sickness, chronic diseases, hospitalization, and ability of self-care (washing and dressing) were able to explain 35% of the variation in SRH among the elderly. Among them, disease status and self-care ability were the most powerful predictors of SRH. After adjusting for other variables, physical functional status could explain only 5% of the variation in SRH. Conclusion: SRH reflects the disease/functional health status of the elderly. It is an easy-to-implement variable and it can reduce both recall bias and investigator bias, thus being widely used in health surveys. It is a cost-effective means of measuring the health status. However, the comparability of SRH in different populations should be studied

  16. Survey Page Length and Progress Indicators: What Are Their Relationships to Item Nonresponse?

    Science.gov (United States)

    Bowman, Nicholas A.; Herzog, Serge; Sarraf, Shimon; Tukibayeva, Malika

    2014-01-01

    The popularity of online student surveys has been associated with greater item nonresponse. This chapter presents research aimed at exploring what factors might help minimize item nonresponse, such as altering online survey page length and using progress indicators.

  17. Detecting measurement disturbance effects: the graphical display of item characteristics.

    Science.gov (United States)

    Schumacker, Randall E

    2015-01-01

    Traditional identification of misfitting items in Rasch measurement models has relied on interpreting the Infit and Outfit z-standardized statistics. A more recent approach made possible by Winsteps is to specify "group = 0" in the control file and subsequently view the item characteristic curve for each item against the true probability curve. The graphical display reveals whether an item follows the true probability curve or deviates substantially, thus indicating measurement disturbance. Probability of item response and logit ability are easily copied into data vectors in R software and then graphed. An example control file, output item data, and subsequent preparation of an overlay graph for misfit items are presented using Winsteps and R software. For comparison purposes the data are also analyzed using a multi-dimensional (MD) mapping procedure.

  18. Refine test items for accurate measurement: six valuable tips.

    Science.gov (United States)

    Siroky, Karen; Di Leonardi, Bette Case

    2015-01-01

    Nursing Professional Development (NPD) specialists frequently design test items to assess competence, to measure learning outcomes, and to create active learning experiences. This article presents six valuable tips for improving test items and using test results to strengthen validity of measurement. NPD specialists can readily apply these tips and examples to measure knowledge with greater accuracy.

  19. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Directory of Open Access Journals (Sweden)

    Joseph P. Eimicke

    2009-06-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item, "I felt I had no energy," was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals.

  20. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach

    Science.gov (United States)

    Teresi, Jeanne A.; Ocepek-Welikson, Katja; Kleinman, Marjorie; Eimicke, Joseph P.; Crane, Paul K.; Jones, Richard N.; Lai, Jin-shei; Choi, Seung W.; Hays, Ron D.; Reeve, Bryce B.; Reise, Steven P.; Pilkonis, Paul A.; Cella, David

    2009-01-01

    The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses of gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether or not they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items that are recommended for exclusion or for separate calibration were “I felt like crying” and “I had trouble enjoying things that I used to enjoy.” The item, “I felt I had no energy,” was also flagged as evidencing DIF, and recommended for additional review. On the one hand, false DIF detection (Type 1 error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered. In this case the overall magnitude and impact of DIF was small for the groups studied, although impact was relatively large for some individuals. PMID:20336180

  1. Refinement of the Brazilian Household Food Insecurity Measurement Scale: Recommendation for a 14-item EBIA

    Directory of Open Access Journals (Sweden)

    Ana Maria Segall-Corrêa

    2014-04-01

    OBJECTIVE: To review and refine Brazilian Household Food Insecurity Measurement Scale structure. METHODS: The study analyzed the impact of removing the item "adult lost weight" and one of two possibly redundant items on Brazilian Household Food Insecurity Measurement Scale psychometric behavior using the one-parameter logistic (Rasch) model. Brazilian Household Food Insecurity Measurement Scale psychometric behavior was analyzed with respect to acceptable adjustment values ranging from 0.7 to 1.3, and to severity scores of the items with theoretically expected gradients. The socioeconomic and food security indicators came from the 2004 National Household Sample Survey, which obtained complete answers to Brazilian Household Food Insecurity Measurement Scale items from 112,665 households. RESULTS: Removing the items "adult reduced amount..." followed by "adult ate less..." did not change the infit of the remaining items, except for "adult lost weight", whose infit increased from 1.21 to 1.56. The internal consistency and item severity scores did not change when "adult ate less" and one of the two redundant items were removed. CONCLUSION: Brazilian Household Food Insecurity Measurement Scale reanalysis reduced the number of scale items from 16 to 14 without changing its internal validity. Its use as a nationwide household food security measure is strongly recommended.
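
    The infit values quoted above can be related to a formula with a short sketch. It assumes the usual information-weighted mean-square definition of infit for a dichotomous Rasch item and takes the household measures and item severity as given, whereas in practice they are estimated; the function names are hypothetical and this is not the authors' software.

```python
import math
from typing import Sequence


def rasch_probability(household_measure: float, item_severity: float) -> float:
    """Probability of an affirmative response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(household_measure - item_severity)))


def infit_mean_square(
    responses: Sequence[int],
    household_measures: Sequence[float],
    item_severity: float,
) -> float:
    """Information-weighted mean square: sum of squared residuals / sum of variances."""
    squared_residuals = 0.0
    information = 0.0
    for response, measure in zip(responses, household_measures):
        p = rasch_probability(measure, item_severity)
        squared_residuals += (response - p) ** 2
        information += p * (1.0 - p)
    return squared_residuals / information


def within_acceptable_range(infit: float, low: float = 0.7, high: float = 1.3) -> bool:
    """Range of acceptable adjustment values stated in the abstract."""
    return low <= infit <= high
```

    Under this check, the reported infit of 1.21 for "adult lost weight" falls inside the 0.7 to 1.3 range and the reported 1.56 falls outside it.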

  2. Item Banking Enables Stand-Alone Measurement of Driving Ability.

    Science.gov (United States)

    Khadka, Jyoti; Fenwick, Eva K; Lamoureux, Ecosse L; Pesudovs, Konrad

    2016-12-01

    To explore whether large item sets, as used in item banking, enable important latent traits, such as driving, to form stand-alone measures. The 88-item activity limitation (AL) domain of the glaucoma module of the Eye-tem Bank was interviewer-administered to patients with glaucoma. Rasch analysis was used to calibrate all items in AL domain on the same interval-level scale and test its psychometric properties. Based on Rasch dimensionality metrics, the AL scale was separated into subscales. These subscales underwent separate Rasch analyses to test whether they could form stand-alone measures. Independence of these measures was tested with Bland and Altman (B&A) Limit of Agreement (LOA). The AL scale was completed by 293 patients (median age, 71 years). It demonstrated excellent precision (3.12). However, Rasch analysis dimensionality metrics indicated that the domain arguably had other dimensions which were driving, luminance, and reading. Once separated, the remaining AL items, driving and luminance subscales, were unidimensional and had excellent precision of 4.25, 2.94, and 2.22, respectively. The reading subscale showed poor precision (1.66), so it was not examined further. The luminance subscale demonstrated excellent agreement (mean bias, 0.2 logit; 95% LOA, -2.2 to 3.3 logit); however, the driving subscale demonstrated poor agreement (mean bias, 1.1 logit; 95% LOA, -4.8 to 7.0 logit) with the AL scale. These findings indicate that driving items in the AL domain of the glaucoma module were perceived and responded to differently from the other AL items, but the reading and luminance items were not. Therefore, item banking enables stand-alone measurement of driving ability in glaucoma.
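
    The agreement statistics reported above follow the Bland and Altman approach, which can be sketched as below. The abstract does not say how its limits were computed, so the conventional mean bias plus or minus 1.96 standard deviations of the paired differences is an assumption, and the person measures in the example are made up.

```python
import statistics
from typing import Sequence, Tuple


def limits_of_agreement(
    subscale: Sequence[float], full_scale: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (mean bias, lower 95% limit, upper 95% limit) for paired measures."""
    if len(subscale) != len(full_scale):
        raise ValueError("both measures need one value per person")
    differences = [a - b for a, b in zip(subscale, full_scale)]
    mean_bias = statistics.mean(differences)
    spread = statistics.stdev(differences)  # sample SD of the differences
    return mean_bias, mean_bias - 1.96 * spread, mean_bias + 1.96 * spread


# Hypothetical person measures in logits from a subscale and the full scale.
bias, lower, upper = limits_of_agreement([1.0, 2.0, 3.0], [0.5, 2.5, 2.0])
```

    Limits computed this way are symmetric around the mean bias; wide limits indicate that the subscale orders or spaces people differently from the full scale.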

  3. Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification.

    Directory of Open Access Journals (Sweden)

    Alexander J Millner

    Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single-items. We found that 11.3% of participants that endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide.
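
    One way to see why misclassification reduces power is that false positives dilute the group being studied. The sketch below is a simplified analytic illustration, not the authors' simulation: it assumes the comparison group is uncontaminated and that misclassified respondents resemble that comparison group. Only the 11.3% figure comes from the abstract; the means are hypothetical.

```python
def diluted_group_mean(
    true_case_mean: float, non_case_mean: float, false_positive_share: float
) -> float:
    """Mean of a group labelled as cases when some members are really non-cases."""
    if not 0.0 <= false_positive_share <= 1.0:
        raise ValueError("false_positive_share must be between 0 and 1")
    return (1.0 - false_positive_share) * true_case_mean + false_positive_share * non_case_mean


def attenuated_difference(
    true_case_mean: float, non_case_mean: float, false_positive_share: float
) -> float:
    """Observed group difference after dilution; equals (1 - share) * true difference."""
    diluted = diluted_group_mean(true_case_mean, non_case_mean, false_positive_share)
    return diluted - non_case_mean


# With 11.3% false positives, a true difference of 1.0 shrinks to about 0.887.
observed = attenuated_difference(1.0, 0.0, 0.113)
```

    A smaller observed difference needs a larger sample to detect, which is consistent with the loss of power the abstract reports.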

  4. Using item response theory to measure extreme response style in marketing research: a global investigation

    NARCIS (Netherlands)

    Jong, de Martijn G.; Steenkamp, Jan-Benedict E.M.; Fox, Jean-Paul; Baumgartner, Hans

    2008-01-01

    Extreme response style (ERS) is an important threat to the validity of survey-based marketing research. In this article, the authors present a new item response theory–based model for measuring ERS. This model contributes to the ERS literature in two ways. First, the method improves on existing proc

  5. Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures

    DEFF Research Database (Denmark)

    Jensen, M P; Widerström-Noga, E; Richards, J S;

    2010-01-01

    To evaluate the psychometric properties of a subset of International Spinal Cord Injury Basic Pain Data Set (ISCIBPDS) items that could be used as self-report measures in surveys, longitudinal studies and clinical trials....

  6. Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures

    DEFF Research Database (Denmark)

    2010-01-01

    To evaluate the psychometric properties of a subset of International Spinal Cord Injury Basic Pain Data Set (ISCIBPDS) items that could be used as self-report measures in surveys, longitudinal studies and clinical trials....

  7. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  8. Measurement equivalence of seven selected items of posttraumatic growth between black and white adult survivors of Hurricane Katrina.

    Science.gov (United States)

    Rhodes, Alison M; Tran, Thanh V

    2013-02-01

    This study examined the equivalence or comparability of the measurement properties of seven selected items measuring posttraumatic growth among self-identified Black (n = 270) and White (n = 707) adult survivors of Hurricane Katrina, using data from the Baseline Survey of the Hurricane Katrina Community Advisory Group Study. Internal consistency reliability was equally good for both groups (Cronbach's alphas = .79), as were correlations between individual scale items and their respective overall scale. Confirmatory factor analysis of a congeneric measurement model of seven selected items of posttraumatic growth showed adequate measures of fit for both groups. The results showed only small variation in magnitude of factor loadings and measurement errors between the two samples. Tests of measurement invariance showed mixed results, but overall indicated that factor loading, error variance, and factor variance were similar between the two samples. These seven selected items can be useful for future large-scale surveys of posttraumatic growth.
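
    Internal consistency in the abstract above is reported as Cronbach's alpha. As a rough sketch of how that coefficient is computed from item-level responses, the function below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The data are hypothetical and are not from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents x 3 items (illustration only).
scores = [[1, 2, 2], [2, 2, 3], [3, 3, 3], [4, 4, 5], [5, 4, 5]]

if __name__ == "__main__":
    print(round(cronbach_alpha(scores), 3))
```

    Comparing alpha across groups, as the study does, would mean calling the function once per group on that group's responses.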

  9. Stability of Differential Item Functioning over a Single Population in Survey Data

    Science.gov (United States)

    Dodeen, Hamzeh

    2004-01-01

    This study investigates the stability of differential item functioning (DIF) in survey data. Surveys are conducted periodically, and their results are often reported by aggregating responses. Estimating the stability of DIF across subsets of a survey population can be an important indicator in determining the likelihood of DIF stability over…

  10. Neural measures reveal a fixed item limit in subitizing.

    Science.gov (United States)

    Ester, Edward F; Drew, Trafton; Klee, Daniel; Vogel, Edward K; Awh, Edward

    2012-05-23

    For centuries, it has been known that humans can rapidly and accurately enumerate small sets of items, a process referred to as subitizing. However, there is still active debate regarding the mechanisms that mediate this ability. For example, some have argued that subitizing reflects the operation of a fixed-capacity individuation mechanism that enables concurrent access to a small number of items. However, others have argued that subitizing reflects the operation of a continuous numerical estimation mechanism whose precision varies with numerosity in a manner consistent with Weber's law. Critically, quantitative models based on either of these predictions can provide a reasonable description of subitizing performance, making it difficult to discriminate between these alternatives solely on the basis of subjects' behavioral performance. Here, we attempted to discriminate between fixed-capacity and continuous estimation models of subitizing using neural measures. In two experiments, we recorded EEGs while subjects performed a demanding subitizing task and examined set-size-dependent changes in a neurophysiological marker of visual selection (the N2pc event-related potential component) evoked by an array of to-be-enumerated items. In both experiments, N2pc amplitudes increased monotonically within the subitizing range before reaching an asymptotic limit at approximately three items. Moreover, inter-participant differences in the location of this asymptote were strongly predictive of behavioral estimates of subitizing span derived from a fixed-capacity model. Thus, neural activity linked with subitizing ability shows evidence of an early and discrete limit in the number of items that can be concurrently apprehended, supporting a fixed-capacity model of this process.

  11. Validity and measurement precision of the PROMIS physical function item bank and a content validity-driven 20-item short form in rheumatoid arthritis compared with traditional measures

    NARCIS (Netherlands)

    Voshaar, M.A.; Klooster, P.M. ten; Glas, C.A.; Vonkeman, H.E.; Taal, E.; Krishnan, E.; Bernelot Moens, H.J.; Boers, M.; Terwee, C.B.; Riel, P.L.C.M. van; Laar, M.A. van der

    2015-01-01

    OBJECTIVE: To evaluate the content validity and measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) physical function item bank and a 20-item short form in patients with RA in comparison with the HAQ disability index (HAQ-DI) and 36-item Short Form Health S

  12. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    Science.gov (United States)

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  13. Measurement equivalence in mixed mode surveys

    Directory of Open Access Journals (Sweden)

    Joop J. Hox

    2015-02-01

    Surveys increasingly use mixed mode data collection (e.g., combining face-to-face and web) because this controls costs and helps to maintain good response rates. However, a combination of different survey modes in one study, be it cross-sectional or longitudinal, can lead to different kinds of measurement errors. For example, respondents in a face-to-face survey or a web survey may interpret the same question differently, and might give a different answer, just because of the way the question is presented. This effect of survey mode on the question-answer process is called measurement mode effect. This study develops methodological and statistical tools to identify the existence and size of mode effects in a mixed mode survey. In addition, it assesses the size and importance of mode effects in measurement instruments using a specific mixed mode panel survey (Netherlands Kinship Panel Study). Most measurement instruments in the NKPS are multi-item scales, therefore confirmatory factor analysis (CFA) will be used as the main analysis tool, using propensity score methods to correct for selection effects. The results show that the NKPS scales by and large have measurement equivalence, but in most cases only partial measurement equivalence. Controlling for respondent differences on demographic variables, and on scale scores from the previous uni-mode measurement occasion, tends to improve measurement equivalence, but not for all scales. The discussion ends with a review of the implications of our results for analyses employing these scales.

  14. Bayesian item response theory models for measurement variance

    NARCIS (Netherlands)

    Verhagen, A.J.

    2012-01-01

    Tests, surveys and questionnaires are all around us these days, and there is an increasing interest in comparing the resulting scores: between countries, between males and females, or over measurement occasions. In the design and analysis of such measurement instruments, a major concern is that the

  15. Curriculum, Translation, and Differential Functioning of Measurement and Geometry Items

    Science.gov (United States)

    Emenogu, Barnabas C.; Childs, Ruth A.

    2005-01-01

    A test item exhibits differential item functioning (DIF) if students with the same ability find it differentially difficult. When the item is administered in French and English, differences in language difficulty and meaning are the most likely explanations. However, curriculum differences may also contribute to DIF. The responses of Ontario…

  16. Item response theory-based measure of global disability in multiple sclerosis derived from the Performance Scales and related items.

    Science.gov (United States)

    Chamot, Eric; Kister, Ilya; Cutter, Gary R

    2014-10-03

    The eight Performance Scales and three assimilated scales (PS) used in North American Research Committee on Multiple Sclerosis (NARCOMS) registry surveys cover a broad range of neurologic domains commonly affected by multiple sclerosis (mobility, hand function, vision, fatigue, cognition, bladder/bowel, sensory, spasticity, pain, depression, and tremor/coordination). Each scale consists of a single 6-to-7-point Likert item with response categories ranging from "normal" to "total disability". Relatively little is known about the performances of the summary index of disability derived from these scales (the Performance Scales Sum or PSS). In this study, we demonstrate the value of a combination of classical and modern methods recently proposed by the Patient-Reported Outcome Measurement Information System (PROMIS) network to evaluate the psychometric properties of the PSS and derive an improved measure of global disability from the PS. The study sample included 7,851 adults with MS who completed a NARCOMS intake questionnaire between 2003 and 2011. Factor analysis, bifactor modeling, and item response theory (IRT) analysis were used to evaluate the dimension(s) of disability underlying the PS; calibrate the 11 scales; and generate three alternative summary scores of global disability corresponding to different model assumptions and practical priorities. The construct validity of the three scores was compared by examining the magnitude of their associations with participants' background characteristics, including unemployment. We derived structurally valid measures of global disability from the PS through the proposed methodology that were superior to the PSS. The measure most applicable to clinical practice gives similar weight to physical and mental disability. Overall reliability of the new measure is acceptable for individual comparisons (0.87). Higher scores of global disability were significantly associated with older age at assessment, longer disease duration

  17. Development of an assessment tool to measure students' perceptions of respiratory care education programs: Item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Ghazi Alotaibi

    2013-01-01

    Objectives: Students who perceive their learning environment positively are more likely to develop effective learning strategies and adopt a deep learning approach. Currently, there is no validated instrument for measuring the educational environment of educational programs on respiratory care (RC). The aim of this study was to develop an instrument to measure students' perception of the RC educational environment. Materials and Methods: Based on the literature review and an assessment of content validity by multiple focus groups of RC educationalists, potential items of the instrument relevant to the RC educational environment construct were generated by the research group. The initial 71-item questionnaire was then field-tested on all students from the 3 RC programs in Saudi Arabia and was subjected to multi-trait scaling analysis. Cronbach's alpha was used to assess internal consistency reliabilities. Results: Two hundred and twelve students (100%) completed the survey. The initial instrument of 71 items was reduced to 65 across 5 scales. Convergent and discriminant validity assessment demonstrated that the majority of items correlated more highly with their intended scale than a competing one. Cronbach's alpha exceeded the standard criterion of >0.70 in all scales except one. There was no floor or ceiling effect for scale or overall score. Conclusions: This instrument is the first assessment tool developed to measure the RC educational environment. There was evidence of its good feasibility, validity, and reliability. This first validation of the instrument supports its use by RC students to evaluate educational environment.

  18. Validity of Suicidality Items from the Youth Risk Behavior Survey in a High School Sample

    Science.gov (United States)

    May, Alexis; Klonsky, E. David

    2011-01-01

    The Youth Risk Behavior Survey (YRBS) is used by the United States Centers for Disease Control to estimate rates of suicidal thoughts and behaviors in adolescents. This study investigated the validity of the YRBS suicidality items by examining their relationship to criterion variables including loneliness, anxiety, depression, substance use, and…

  19. Validation of Single-Item Screening Measures for Provider Burnout in a Rural Health Care Network.

    Science.gov (United States)

    Waddimba, Anthony C; Scribani, Melissa; Nieves, Melinda A; Krupa, Nicole; May, John J; Jenkins, Paul

    2016-06-01

    We validated three single-item measures for emotional exhaustion (EE) and depersonalization (DP) among rural physician/nonphysician practitioners. We linked cross-sectional survey data (on provider demographics, satisfaction, resilience, and burnout) with administrative information from an integrated health care network (1 academic medical center, 6 community hospitals, 31 clinics, and 19 school-based health centers) in an eight-county underserved area of upstate New York. In total, 308 physicians and advanced-practice clinicians completed a self-administered, multi-instrument questionnaire (65.1% response rate). Significant proportions of respondents reported high EE (36.1%) and DP (9.9%). In multivariable linear mixed models, scores on EE/DP subscales of the Maslach Burnout Inventory were regressed on each single-item measure. The Physician Work-Life Study's single-item measure (classifying 32.8% of respondents as burning out/completely burned out) was correlated with EE and DP (Spearman's ρ = .72 and .41, p burnout.

  20. Natural language in measuring user emotions: A qualitative approach to quantitative survey-based emotion measurement

    NARCIS (Netherlands)

    Tonetto, L.M.; Desmet, P.M.A.

    2012-01-01

    This paper presents an approach to developing surveys that measure user experiences with the use of natural everyday language. The common approach to develop questionnaires that measure experience is to translate theoretical factors into verbal survey items. This theory-based approach can impair the

  1. Preferred reporting items for studies mapping onto preference-based outcome measures: The MAPS statement.

    Science.gov (United States)

    Petrou, Stavros; Rivero-Arias, Oliver; Dakin, Helen; Longworth, Louise; Oppe, Mark; Froud, Robert; Gray, Alastair

    2015-08-01

    'Mapping' onto generic preference-based outcome measures is increasingly being used as a means of generating health utilities for use within health economic evaluations. Despite publication of technical guides for the conduct of mapping research, guidance for the reporting of mapping studies is currently lacking. The MAPS (MApping onto Preference-based measures reporting Standards) statement is a new checklist, which aims to promote complete and transparent reporting of mapping studies. The primary audiences for the MAPS statement are researchers reporting mapping studies, the funders of the research, and peer reviewers and editors involved in assessing mapping studies for publication. A de novo list of 29 candidate reporting items and accompanying explanations was created by a working group comprised of six health economists and one Delphi methodologist. Following a two-round, modified Delphi survey with representatives from academia, consultancy, health technology assessment agencies and the biomedical journal editorial community, a final set of 23 items deemed essential for transparent reporting, and accompanying explanations, was developed. The items are contained in a user-friendly, 23-item checklist. They are presented numerically and categorised within six sections, namely: (i) title and abstract; (ii) introduction; (iii) methods; (iv) results; (v) discussion; and (vi) other. The MAPS statement is best applied in conjunction with the accompanying MAPS explanation and elaboration document. It is anticipated that the MAPS statement will improve the clarity, transparency and completeness of reporting of mapping studies. To facilitate dissemination and uptake, the MAPS statement is being co-published by eight health economics and quality of life journals, and broader endorsement is encouraged. The MAPS working group plans to assess the need for an update of the reporting checklist in five years' time. This statement was published jointly in Applied Health Economics

  2. The Importance of Item Wording: The Distinction Between Measuring High Standards versus Measuring Perfectionism and Why It Matters

    Science.gov (United States)

    Blasberg, Jonathan S.; Hewitt, Paul L.; Flett, Gordon L.; Sherry, Simon B.; Chen, Chang

    2016-01-01

    In the current research, we illustrate the impact that item wording has on the content of personality scales and how differences in item wording influence empirical results. We present evidence indicating that items in certain scales used to measure "adaptive" perfectionism fail to capture the disabling all-or-nothing approach that is…

  3. Use of Cognitive Interviews to Adapt PROMIS Measurement Items for Spanish Speakers Living with HIV

    Directory of Open Access Journals (Sweden)

    R. Solorio

    2016-01-01

    Purpose. To use cognitive interviewing techniques to assess comprehension of existing Patient-Reported Outcomes Measurement Information System (PROMIS) items among Latinos living with HIV and then refine items based on participant feedback. Methods. Latino monolingual Spanish speakers living with HIV (n=56) participated in cognitive interviews. Items from four PROMIS domains, including depression, anxiety, fatigue, and alcohol use, were assessed for comprehension. Audiotaped interviews and handwritten notes were subjected to content analysis to identify problems specific to each instrument for each domain. Results. The assessments from the cognitive interviews identified areas for improvement in each domain. We present data on the type of items that were difficult to comprehend and provide examples for how items were refined based on participants' and PROMIS Statistical Coordinating Center (PSCC) feedback. Six out of 48 depression items, 7 out of the 61 anxiety items, 18 out of 42 fatigue items, and 7 out of 44 alcohol use items were found to have poor comprehension. These items were refined based on participant feedback; the items were then submitted to the PSCC for additional guidance on linguistics and grammar to improve comprehension. Conclusions. Cognitive interviews may be used to enhance comprehension of PROMIS items among Latinos.

  4. A 6-item scale for overall, emotional and social loneliness: confirmatory tests on survey data

    NARCIS (Netherlands)

    de Jong Gierveld, J.; van Tilburg, T.

    2006-01-01

    Loneliness is an indicator of social well-being and pertains to the feeling of missing an intimate relationship (emotional loneliness) or missing a wider social network (social loneliness). The 11-item De Jong Gierveld Loneliness Scale has proved to be a valid and reliable measurement instrument for

  5. Assessing the Validity of a Single-Item HIV Risk Stage-of-Change Measure

    Science.gov (United States)

    Napper, Lucy E.; Branson, Catherine M.; Fisher, Dennis G.; Reynolds, Grace L.; Wood, Michelle M.

    2008-01-01

    This study examined the validity of a single-item measure of HIV risk stage of change that HIV prevention contractors were required to collect by the California State Office of AIDS. The single-item measure was compared to the more conventional University of Rhode Island Change Assessment (URICA). Participants were members of Los Angeles…

  6. Assessing the validity of single-item life satisfaction measures: results from three large samples.

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E

    2014-12-01

    The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System and a representative German sample (N = 1,312) recruited by the German Socio-Economic Panel were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Consistent across three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042). Single-item life satisfaction measures performed very similarly compared to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.
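
    The abstract above reports both zero-order and disattenuated correlations. As a brief sketch of the usual correction for attenuation (observed r divided by the square root of the product of the two reliabilities), the function below shows the arithmetic. The reliability values in the example are hypothetical, since the abstract does not report the ones the authors used.

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation:
    r_true = r_observed / sqrt(reliability_x * reliability_y)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

if __name__ == "__main__":
    # 0.62 is a zero-order r from the abstract; 0.72 and 0.88 are
    # assumed reliabilities used only for illustration.
    print(round(disattenuate(0.62, 0.72, 0.88), 2))
```

    The lower the assumed reliability of the single-item measure, the larger the gap between the observed and the corrected correlation.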

  7. Employment of Item Response Theory to measure change in Children's Analogical Thinking Modifiability Test

    OpenAIRE

    Queiroz,Odoisa Antunes de; Primi,Ricardo; Carvalho,Lucas de Francisco; Enumo,Sônia Regina Fiorim

    2013-01-01

    Dynamic testing, with an intermediate phase of assistance, measures changes between pretest and post-test assuming a common metric between them. To test this assumption we applied the Item Response Theory in the responses of 69 children to dynamic cognitive testing Children's Analogical Thinking Modifiability Test adapted, with 12 items, totaling 828 responses, with the purpose of verifying if the original scale yields the same results as the equalized scale obtained by Item Response Theory i...

  8. Measurement of Ethnic Background in Cross-national School Surveys

    DEFF Research Database (Denmark)

    Jensen, Helene Nordahl; Krølner, Rikke; Páll, Gabrilla

    2011-01-01

    Indicators such as country of birth and language spoken at home have been used as proxy measures for ethnic background, but the validity of these indicators in surveys among school children remains unclear. This study aimed at comparing item response and student-parent agreement on four questions...

  9. A single-item measure of social identification: reliability, validity, and utility.

    Science.gov (United States)

    Postmes, Tom; Haslam, S Alexander; Jans, Lise

    2013-12-01

    This paper introduces a single-item social identification measure (SISI) that involves rating one's agreement with the statement 'I identify with my group (or category)' followed by a 7-point scale. Three studies provide evidence of the validity (convergent, divergent, and test-retest) of SISI with a broad range of social groups. Overall, the estimated reliability of SISI is good. To address the broader issue of single-item measure reliability, a meta-analysis of 16 widely used single-item measures is reported. The reliability of single-item scales ranges from low to reasonably high. Compared with this field, reliability of the SISI is high. In general, short measures struggle to achieve acceptable reliability because the constructs they assess are broad and heterogeneous. In the case of social identification, however, the construct appears to be sufficiently homogeneous to be adequately operationalized with a single item.

  10. Methodological note: allocation of disability items in the American Community Survey.

    Science.gov (United States)

    Siordia, Carlos; Young, Rebekah

    2013-04-01

    Determining the prevalence and correlates of disability requires the use of sample surveys in data analysis. In an effort to generate complete datasets, allocation procedures (i.e., the assignment of values to missing or illogical responses) are frequently used for missing or inconsistent responses. The goal of this investigation was to explore how six disability-related questions vary in their degree of allocation and how research results may be sensitive to this procedure. This is important because many researchers using large disability information banks are not survey methodologists and may be unaware of how the Census Bureau's editing procedures can influence research findings. We use 2010 1-year Public Use Microdata Sample files from the American Community Survey (ACS). We investigated the allocation rates of the following disability items: self-care; hearing; vision; independent living; ambulatory; and cognitive ability. We also asked how allocation rates varied by demographic characteristics and whether the allocated values could influence multivariate results. Disability item allocation in ACS data has detectable patterns, where the rate of disability allocation is higher for mail surveys, males, older people, groups who speak English not well or not at all, US citizens, Latinos(as), and for people living in or near poverty. Multivariate models may be sensitive to how these allocated values are treated. The rate of allocations varies as a function of demographic variables because of methodological procedures and survey participation behaviors. Because allocation rates may affect research and policy about the disabled population, more research is required. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Development of an item bank for computerized adaptive test (CAT) measurement of pain

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Aaronson, Neil K; Chie, Wei-Chu

    2016-01-01

    PURPOSE: Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured by the EORTC QLQ-C30 questionnaire. METHODS: The development process consisted of four steps: (1) literature search, (2) formulation of new items and expert evaluations, (3) pretesting and (4) field-testing and psychometric analyses for the final selection of items. RESULTS: In step 1, we identified 337 pain ... were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements by 15-25% compared to using the QLQ-C30 pain scale...

  12. Psychometric Properties of the Brazilian 12-Item Short-Form Health Survey Version 2 (SF-12v2)

    Directory of Open Access Journals (Sweden)

    Bruno Figueiredo Damásio

    2015-04-01

    The 12-Item Short-Form Health Survey, in its initial (SF-12) and revised form (SF-12v2), is a widely used measure to evaluate health-related quality of life (HRQoL). The present study evaluates the factor structure and reliability of the Brazilian version of the SF-12v2. Participants were 627 subjects (74.1% women), aged from 18 to 88 years (M = 38.6; SD = 13.16), from 17 Brazilian states. Confirmatory factor analyses suggested two pairs of error terms to be highly correlated (3a-3b; and 4a-4b). A qualitative inspection showed an overlap of content among these items. The respecified model presented adequate fit indices. Convergent validity was also tested with measures of health-related self-care, subjective happiness, life satisfaction, depression and self-efficacy. Expected correlations were found between the SF-12v2 and these measures. Results showed initial evidence in favor of using the SF-12v2 as a measure of physical and mental health in the Brazilian context.

  13. Blooms' separation of the final exam of Engineering Mathematics II: Item reliability using Rasch measurement model

    Science.gov (United States)

    Fuaad, Norain Farhana Ahmad; Nopiah, Zulkifli Mohd; Tawil, Norgainy Mohd; Othman, Haliza; Asshaari, Izamarlina; Osman, Mohd Hanif; Ismail, Nur Arzilah

    2014-06-01

    In engineering studies and research, mathematics is one of the main elements used to express physical, chemical and engineering laws. Therefore, it is essential for engineering students to have strong knowledge of the fundamentals of mathematics in order to apply that knowledge to real-life issues. However, previous results of the Mathematics Pre-Test show that engineering students lack fundamental knowledge of certain topics in mathematics. Because of this, apart from making improvements in the methods of teaching and learning, studies on the construction of questions (items) should also be emphasized. The purpose of this study is to assist lecturers in the process of item development, to monitor the separation of items based on Bloom's Taxonomy, and to measure the reliability of the items themselves using the Rasch Measurement Model as a tool. Using the Rasch Measurement Model, the final exam questions of Engineering Mathematics II (Linear Algebra) for semester 2, session 2012/2013, were analysed, and the results provide details on the extent to which the content of the items provides useful information about students' ability. This study reveals that the items used in the Engineering Mathematics II (Linear Algebra) final exam are well constructed, but the separation of the items raises concern and needs further attention, as there is a big gap between items at several levels of Bloom's cognitive skills.

  14. Psychometric comparison of single-item, short, and comprehensive depression screening measures in Korean young adults.

    Science.gov (United States)

    Kim, Hee-Ju; Abraham, Ivo

    2016-04-01

    Integrating long depression-screening instruments into routine clinical practice and research studies is often impractical, necessitating short-item if not single-item measures with comparable psychometric properties. To examine whether single-item or short depression-screening measures are comparable to a comprehensive screening measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults within a Classical Testing Theory framework. A total of 458 students from six nursing colleges in South Korea completed three depression measures: the 20-item Center for Epidemiologic Studies-Depression screening instrument (CES-D; comprehensive measure); the five-item Profile of Mood States-Brief depression subscale (POMS-B depression subscale; short measure); a single-item Likert measure; and a single-item numeric rating scale. Internal consistency reliability was tested by Cronbach's alpha and item-total correlations; test-retest reliability by intraclass correlation coefficient (ICC); convergent validity by correlation with the CES-D; concurrent validity by the correlation with perceived stress level and sleep quality; and predictive validity by receiver operating characteristic curve to predict the two groups with different depression levels. The POMS-B depression subscale was comparable to the comprehensive CES-D scale in internal consistency reliability (alpha=.85); test-retest reliability (ICC=.76); and convergent (r=.81 with CES-D), concurrent (r=.64 with perceived stress level, r=.34 with sleep quality), and predictive validity (area under the curve=.88). The two single-item options were not comparable to the comprehensive CES-D. The short POMS-B depression subscale shows an acceptable balance between practical clinical and research needs and psychometric quality. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Student Ratings of the Importance of Survey Items, Multiplicative Factor Analysis, and the Validity of the Community of Inquiry Survey

    Science.gov (United States)

    Diaz, Sebastian R.; Swan, Karen; Ice, Philip; Kupczynski, Lori

    2010-01-01

    This research builds upon prior validation studies of the Community of Inquiry (CoI) survey by utilizing multiple rating measures to validate the survey's tripartite structure (teaching presence, social presence, and cognitive presence). In prior studies exploring the construct validity of these 3 subscales, only respondents' course ratings were…

  16. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

    NARCIS (Netherlands)

    Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

    2017-01-01

    Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study

  17. The 18 Household Food Security Survey items provide valid food security classifications for adults and children in the Caribbean

    Directory of Open Access Journals (Sweden)

    Nunes Cheryl

    2006-02-01

    Full Text Available Abstract Background We tested the properties of the 18 Household Food Security Survey (HFSS) items, and the validity of the resulting food security classifications, in an English-speaking middle-income country. Methods Survey of primary school children in Trinidad and Tobago. Parents completed the HFSS. Responses were analysed for the 10 adult-referenced items and the eight child-referenced items. Item response theory models were fitted. Item calibrations and subject scores from a one-parameter logistic (1PL) model were compared with those from either a two-parameter logistic (2PL) model or a model for differential item functioning (DIF) by ethnicity. Results There were 5219 eligible with 3858 (74%) completing at least one food security item. Adult item calibrations (standard error) in the 1PL model ranged from -4.082 (0.019) for the 'worried food would run out' item to 3.023 (0.042) for 'adults often do not eat for a whole day'. Child item calibrations ranged from -3.715 (0.025) for 'relied on a few kinds of low cost food' to 3.088 (0.039) for 'child didn't eat for a whole day'. Fitting either a 2PL model, which allowed discrimination parameters to vary between items, or a differential item functioning model, which allowed item calibrations to vary between ethnic groups, had little influence on interpretation. The classification based on the adult-referenced items showed that there were 19% of respondents who were food insecure without hunger, 10% food insecure with moderate hunger and 6% food insecure with severe hunger. The classification based on the child-referenced items showed that there were 23% of children who were food insecure without hunger and 9% food insecure with hunger. In both children and adults food insecurity showed a strong, graded association with lower monthly household income (P Conclusion These results support the use of 18 HFSS items to classify food security status of adults or children in an English-speaking country where food
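    The difference between the 1PL and 2PL models mentioned in this abstract can be sketched with the standard logistic item response function. The function name is hypothetical, and the sketch assumes the reported calibrations sit on the same logit scale as the respondent trait, which the abstract does not state.

```python
import math


def item_response_prob(theta, b, a=1.0):
    """Probability of affirming an item under a logistic IRT model.

    theta: respondent's latent trait (here, severity of food insecurity)
    b:     item calibration (severity at which affirmation is 50% likely)
    a:     discrimination; fixed at 1.0 for the 1PL model, free in the 2PL model
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Least and most severe adult items reported in the abstract, evaluated for a
# hypothetical respondent at theta = 0 (assumed to share the calibration scale).
p_worried = item_response_prob(0.0, b=-4.082)
p_whole_day = item_response_prob(0.0, b=3.023)
```

    Under these assumptions the low-severity item is affirmed by almost every respondent at theta = 0 and the high-severity item by very few, which is the ordering the calibrations are meant to express.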

  18. Measuring Consumers’ Environmental Responsibility: A Synthesis of Constructs and Measurement Scale Items

    Directory of Open Access Journals (Sweden)

    K. M. R. Taufique

    2014-04-01

    Full Text Available Consumption is central to all production. Without proper management, production and consumption are likely to be the main sources of environmental problems. This reality calls for consumers to be environmentally responsible in their consumption behavior. The objective of this paper is to prepare a synthesis of all the possible factors and measurement scale items to be used for assessing consumers’ environmental responsibility. For this synthesis, all major works in the field have been thoroughly reviewed. The paper arrives at a total of six parameters: knowledge & awareness, attitude, green consumer value, emotional affinity toward nature, willingness to act, and environment-related past behavior. This tentative yet inclusive set of parameters is thought to be useful for guiding the design of future large-scale empirical research aimed at developing a dependable, inclusive set of parameters to test consumers’ environmental responsibility. A conceptual model and possible measurement items are proposed for further empirical research.

  19. Surveying techniques in vibration measurement

    Directory of Open Access Journals (Sweden)

    Kuras Przemyslaw

    2015-01-01

    Full Text Available In order to determine the actual dynamic characteristics of engineering structures, it is necessary to perform direct measurements. The paper focuses on the problem of using various devices to measure vibration, with particular emphasis on surveying instruments. The main tool used in this study is the radar interferometer, which has been compared to: robotic total station, GNSS receivers and sensors (accelerometer and encoder). The results of four dynamic experiments are presented. They were performed on: industrial chimney, drilling tower, railway bridge and pedestrian footbridge. The obtained results have been discussed in terms of the requirements imposed by the standard ISO 4866:2010.

  20. Bayesian randomized item response modeling for sensitive measurements

    NARCIS (Netherlands)

    Avetisyan, M.

    2012-01-01

    In behavioral, health, and social sciences, any endeavor involving measurement is directed at accurate representation of the latent concept with the manifest observation. However, when sensitive topics, such as substance abuse, tax evasion, or felony, are inquired, substantial distortion of reported

  1. Updated U.S. population standard for the Veterans RAND 12-item Health Survey (VR-12).

    Science.gov (United States)

    Selim, Alfredo J; Rogers, William; Fleishman, John A; Qian, Shirley X; Fincke, Benjamin G; Rothendler, James A; Kazis, Lewis E

    2009-02-01

    The purpose of this project was to develop an updated U.S. population standard for the Veterans RAND 12-item Health Survey (VR-12). We used a well-defined and nationally representative sample of the U.S. population from 52,425 responses to the Medical Expenditure Panel Survey (MEPS) collected between 2000 and 2002. We applied modified regression estimates to update the non-proprietary 1990 scoring algorithms. We applied the updated standard to the Medicare Health Outcomes Survey (HOS) to compute the VR-12 physical (PCS((MEPS standard))) and mental (MCS((MEPS standard))) component summaries based on the MEPS. We compared these scores to PCS and MCS based on the 1990 U.S. population standard. Using the updated U.S. population standard, the average VR-12 PCS((MEPS standard)) and MCS((MEPS standard)) scores in the Medicare HOS were 39.82 (standard deviation [SD] = 12.2) and 50.08 (SD = 11.4), respectively. For the same Medicare HOS, the average PCS and MCS scores based on the 1990 standard were 1.40 points higher and 0.99 points lower in comparison to VR-12 PCS and MCS, respectively. Changes in the U.S. population between 1990 and today make the old standard obsolete for the VR-12, so the updated standard developed here is widely available to serve as such a contemporary standard for future applications for health-related quality of life (HRQoL) assessments.

  2. Measuring organizational effectiveness in information and communication technology companies using item response theory.

    Science.gov (United States)

    Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

    2012-01-01

    The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the point of view of the manager, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to build the questionnaire items, which were submitted to specialists for evaluation. The construct proved viable for measuring the organizational effectiveness of ICT companies from the point of view of a manager using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and to place items and respondents on a single scale, which is not possible when using other similar tools.

  3. Measuring Belief in Conspiracy Theories: Validation of a French and English Single-Item Scale

    Directory of Open Access Journals (Sweden)

    Anthony Lantian

    2016-02-01

    Full Text Available We designed, in French and in English, a single-item scale to measure people’s general tendency to believe in conspiracy theories. The validity and reliability of this scale were assessed in 3 studies (total 'N' = 555). In Study 1 ('N' = 152), positive correlations between the single-item scale and 3 other conspiracy belief scales on a French student sample suggested good concurrent validity. In Study 2 ('N' = 292), we replicated these results on a larger and more heterogeneous Internet American sample. Moreover, the scale showed good predictive validity: responses predicted participants’ willingness to receive a bi-monthly newsletter about alleged conspiracy theories. Finally, in Study 3 ('N' = 111), we observed good test-retest reliability and demonstrated both convergent and discriminant validity of the single-item scale. Overall these results suggest that the single-item conspiracy belief scale has good validity and reliability and may be used to measure conspiracy belief in favor of lengthier existing scales. In addition, the validation of the single-item scale led us to develop and start validating French versions of the 'Generic Conspiracist Beliefs scale', the 'Conspiracy Mentality Questionnaire', and a 10-item version (instead of the 15-item original version) of the 'Belief in Conspiracy Theories Inventory'.

  4. Test-retest reliability of selected items of Health Behaviour in School-aged Children (HBSC) survey questionnaire in Beijing, China

    Directory of Open Access Journals (Sweden)

    Liu Yang

    2010-08-01

    Full Text Available Abstract Background Children's health and health behaviour are essential for their development and it is important to obtain abundant and accurate information to understand young people's health and health behaviour. The Health Behaviour in School-aged Children (HBSC) study is among the first large-scale international surveys on adolescent health through self-report questionnaires. So far, more than 40 countries in Europe and North America have been involved in the HBSC study. The purpose of this study is to assess the test-retest reliability of selected items in the Chinese version of the HBSC survey questionnaire in a sample of adolescents in Beijing, China. Methods A sample of 95 male and female students aged 11 or 15 years old participated in a test and retest with a three-week interval. Student identity numbers of respondents were utilized to permit matching of test-retest questionnaires. 23 items concerning physical activity, sedentary behaviour, sleep and substance use were evaluated by using the percentage of response shifts and the single measure Intraclass Correlation Coefficients (ICC) with 95% confidence interval (CI) for all respondents and stratified by gender and age. Items on substance use were only evaluated for school children aged 15 years old. Results The percentage of no response shift between test and retest varied from 32% for the item on computer use at weekends to 92% for the three items on smoking. Of all the 23 items evaluated, 6 items (26%) showed a moderate reliability, 12 items (52%) displayed a substantial reliability and 4 items (17%) indicated almost perfect reliability. No gender and age group difference of the test-retest reliability was found except for a few items on sedentary behaviour. Conclusions The overall findings of this study suggest that most selected indicators in the HBSC survey questionnaire have satisfactory test-retest reliability for the students in Beijing. Further test-retest studies in a large
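    A single-measure intraclass correlation of the kind used here for test-retest reliability can be sketched as follows. The abstract does not say which ICC form was computed, so this sketch assumes the two-way random-effects, absolute-agreement form, often written ICC(2,1); the function name and data are hypothetical.

```python
def icc_2_1(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: list of n subjects, each a list of k measurements
             (for test-retest data, k = 2: test and retest).
    The choice of ICC(2,1) is an assumption made for illustration.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, occasions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test-retest pairs: every retest answer is one point higher.
shifted = [[1, 2], [2, 3], [3, 4]]
icc_shifted = icc_2_1(shifted)
```

    Because this form penalises systematic shifts between occasions, the shifted data score below 1 even though the ranking of respondents is perfectly preserved; a consistency-type ICC would not apply that penalty.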

  5. Factors affecting study efficiency and item non-response in health surveys in developing countries: the Jamaica national healthy lifestyle survey

    Directory of Open Access Journals (Sweden)

    Bennett Franklyn

    2007-02-01

    Full Text Available Abstract Background Health surveys provide important information on the burden and secular trends of risk factors and disease. Several factors including survey and item non-response can affect data quality. There are few reports on efficiency, validity and the impact of item non-response, from developing countries. This report examines factors associated with item non-response and study efficiency in a national health survey in a developing Caribbean island. Methods A national sample of participants aged 15–74 years was selected in a multi-stage sampling design accounting for 4 health regions and 14 parishes using enumeration districts as primary sampling units. Means and proportions of the variables of interest were compared between various categories. Non-response was defined as failure to provide an analyzable response. Linear and logistic regression models accounting for sample design and post-stratification weighting were used to identify independent correlates of recruitment efficiency and item non-response. Results We recruited 2012 15–74 year-olds (66.2% females) at a response rate of 87.6% with significant variation between regions (80.9% to 97.6%; p Conclusion Informative health surveys are possible in developing countries. While survey response rates may be satisfactory, item non-response was high in respect of income and sexual practice. In contrast to developed countries, non-response to questions on income is higher and has different correlates. These findings can inform future surveys.

  6. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Gerardus J.A.; Glas, Cornelis A.W.

    2000-01-01

    This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved

  7. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Jean-Paul; Glas, Cees A.W.

    2000-01-01

    This paper focuses on handling measurement error in predictor variables using item response theory (IRT). Measurement error is of great importance in assessment of theoretical constructs, such as intelligence or the school climate. Measurement error is modeled by treating the predictors as unobserved

  8. Measuring the quality of life in hypertension according to Item Response Theory

    Directory of Open Access Journals (Sweden)

    José Wicto Pereira Borges

    Full Text Available ABSTRACT OBJECTIVE To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL – Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. METHODS This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used the Graded Response Model of Samejima. The analyses were conducted using the free software R with the aid of psych and mirt. RESULTS The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed with less psychometric data in the MINICHAL. CONCLUSIONS We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new sides of this instrument that have not yet been addressed in previous studies.
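    The Samejima model named in this abstract assigns a probability to each ordered response category of an item. The study itself used R with psych and mirt; the Python sketch below only illustrates the model's logistic form, and the function name and parameter values are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta:      respondent's latent trait
    a:          item discrimination
    thresholds: increasing boundary parameters b_1 < ... < b_m; an item with
                m thresholds has m + 1 ordered response categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 below and 0 above.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[i] - cumulative[i + 1] for i in range(len(cumulative) - 1)]


# Hypothetical four-category item with symmetric thresholds.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
```

    A higher discrimination value concentrates the probability mass on fewer categories for a given trait level, which is what is meant when an item is said to discriminate better between respondents.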

  9. Linking Physical and Mental Health Summary Scores from the Veterans RAND 12-Item Health Survey (VR-12) to the PROMIS(®) Global Health Scale.

    Science.gov (United States)

    Schalet, Benjamin D; Rothrock, Nan E; Hays, Ron D; Kazis, Lewis E; Cook, Karon F; Rutsohn, Joshua P; Cella, David

    2015-10-01

    Global health measures represent an attractive option for researchers and clinicians seeking a brief snapshot of a patient's overall perspective on his or her health. Because scores on different global health measures are not comparable, comparative effectiveness research (CER) is challenging. To establish a common reporting metric so that the physical and mental health scores on the Veterans RAND 12-Item Health Survey (VR-12(©)) can be converted into scores on the corresponding Patient Reported Outcomes Measurement Information System (PROMIS(®)) Global Health scores. Following a single-sample linking design, participants from an Internet panel completed items from the PROMIS Global Health and VR-12 Health Survey. A common metric was created using analyses based on item response theory (IRT), producing score cross-walk tables for the mental and physical health components of each measure. The linking relationships were evaluated by calculating the standard deviation of differences between the observed and linked PROMIS scores and estimating confidence intervals by sample size. Participants (N = 2025) were 49% male and 73% white; mean age was 46 years. Mental and physical health subscales of the PROMIS Global Health and the VR-12. The mean VR-12 physical component and mental component scores were 45.2 and 46.6, respectively; the mean PROMIS physical and mental health scores were 48.3 and 48.5, respectively. We found evidence that the combined set of VR-12 and PROMIS items were relatively unidimensional and that we could proceed with linking. Linking worked better between the physical health than mental health scores using VR-12 item responses (vs. linking based on algorithmic scores). For each of the cross-walks, users can minimize the impact of linking error with modest increases in sample sizes. VR-12 scores can be expressed on the PROMIS Global Health metric to facilitate the evaluation of treatment, including CER. Extending these results to other common

  10. An Item Bank to Measure Systems, Services, and Policies: Environmental Factors Affecting People With Disabilities.

    Science.gov (United States)

    Lai, Jin-Shei; Hammel, Joy; Jerousek, Sara; Goldsmith, Arielle; Miskovic, Ana; Baum, Carolyn; Wong, Alex W; Dashner, Jessica; Heinemann, Allen W

    2016-12-01

    To develop a measure of perceived systems, services, and policies facilitators (see Chapter 5 of the International Classification of Functioning, Disability and Health) for people with neurologic disabilities and to evaluate the effect of perceived systems, services, and policies facilitators on health-related quality of life. Qualitative approaches to develop and refine items. Confirmatory factor analysis including 1-factor confirmatory factor analysis and bifactor analysis to evaluate unidimensionality of items. Rasch analysis to identify misfitting items. Correlational and analysis of variance methods to evaluate construct validity. Community-dwelling individuals participated in telephone interviews or traveled to the academic medical centers where this research took place. Participants (N=571) had a diagnosis of spinal cord injury, stroke, or traumatic brain injury. They were 18 years or older and English speaking. Not applicable. An item bank to evaluate environmental access and support levels of services, systems, and policies for people with disabilities. We identified a general factor defined as "access and support levels of the services, systems, and policies at the level of community living" and 3 local factors defined as "health services," "community living," and "community resources." The systems, services, and policies measure correlated moderately with participation measures: Community Participation Indicators (CPI) - Involvement, CPI - Control over Participation, Quality of Life in Neurological Disorders - Ability to Participate, Quality of Life in Neurological Disorders - Satisfaction with Role Participation, Patient-Reported Outcomes Measurement Information System (PROMIS) Ability to Participate, PROMIS Satisfaction with Role Participation, and PROMIS Isolation. The measure of systems, services, and policies facilitators contains items pertaining to health services, community living, and community resources. Investigators and clinicians can measure

  11. Thorndike, Thurstone and Rasch: A Comparison of Their Approaches to Item-Invariant Measurement.

    Science.gov (United States)

    Englehard, George, Jr.

    The methods used by E. L. Thorndike, L. L. Thurstone, and G. Rasch to address issues related to item-invariant measurement and the scoring of individual performance are compared. The analyses highlight the close connection among the three methods, and suggest that progress in measurement theory reflects the movement from essentially ad hoc methods…

  12. A randomised trial and economic evaluation of the effect of response mode on response rate, response bias, and item non-response in a survey of doctors.

    Science.gov (United States)

    Scott, Anthony; Jeon, Sung-Hee; Joyce, Catherine M; Humphreys, John S; Kalb, Guyonne; Witt, Julia; Leahy, Anne

    2011-09-05

    Surveys of doctors are an important data collection method in health services research. Ways to improve response rates, minimise survey response bias and item non-response, within a given budget, have not previously been addressed in the same study. The aim of this paper is to compare the effects and costs of three different modes of survey administration in a national survey of doctors. A stratified random sample of 4.9% (2,702/54,160) of doctors undertaking clinical practice was drawn from a national directory of all doctors in Australia. Stratification was by four doctor types: general practitioners, specialists, specialists-in-training, and hospital non-specialists, and by six rural/remote categories. A three-arm parallel trial design with equal randomisation across arms was used. Doctors were randomly allocated to: online questionnaire (902); simultaneous mixed mode (a paper questionnaire and login details sent together) (900); or, sequential mixed mode (online followed by a paper questionnaire with the reminder) (900). Analysis was by intention to treat, as within each primary mode, doctors could choose either paper or online. Primary outcome measures were response rate, survey response bias, item non-response, and cost. The online mode had a response rate of 12.95%, followed by the simultaneous mixed mode with 19.7%, and the sequential mixed mode with 20.7%. After adjusting for observed differences between the groups, the online mode had a 7 percentage point lower response rate compared to the simultaneous mixed mode, and a 7.7 percentage point lower response rate compared to sequential mixed mode. The difference in response rate between the sequential and simultaneous modes was not statistically significant. Both mixed modes showed evidence of response bias, whilst the characteristics of online respondents were similar to the population. However, the online mode had a higher rate of item non-response compared to both mixed modes. The total cost of the online

  13. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency

    DEFF Research Database (Denmark)

    Rose, Matthias; Bjørner, Jakob; Gandek, Barbara;

    2014-01-01

    OBJECTIVE: To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. STUDY DESIGN AND SETTING: The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. RESULTS: The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living...

  14. Developing energy and momentum conceptual survey (EMCS) with four-tier diagnostic test items

    Science.gov (United States)

    Afif, Nur Faadhilah; Nugraha, Muhammad Gina; Samsudin, Achmad

    2017-05-01

    Students' conceptions of work and energy are important to support the learning process in the classroom. For that reason, a diagnostic test instrument is needed to diagnose students' conceptions of work and energy. The researchers therefore decided to develop the Energy and Momentum Conceptual Survey (EMCS) test instrument into four-tier diagnostic test items. This research is the first step in developing the four-tier EMCS as a diagnostic test instrument on work and energy. The research method used the 4D model (Defining, Designing, Developing and Disseminating). The instrument developed was tested on 39 students in one senior high school. The results showed that the four-tier EMCS is able to diagnose students' level of conception of work and energy. It can be concluded that the four-tier EMCS is a potential diagnostic test instrument that is able to categorize students as understanding the concepts, holding misconceptions, or not understanding the work and energy concepts at all.

  15. The future of outcomes measurement: item banking, tailored short-forms, and computerized adaptive assessment.

    Science.gov (United States)

    Cella, David; Gershon, Richard; Lai, Jin-Shei; Choi, Seung

    2007-01-01

    The use of item banks and computerized adaptive testing (CAT) begins with clear definitions of important outcomes, and references those definitions to specific questions gathered into large and well-studied pools, or "banks" of items. Items can be selected from the bank to form customized short scales, or can be administered in a sequence and length determined by a computer programmed for precision and clinical relevance. Although far from perfect, such item banks can form a common definition and understanding of human symptoms and functional problems such as fatigue, pain, depression, mobility, social function, sensory function, and many other health concepts that we can only measure by asking people directly. The support of the National Institutes of Health (NIH), as witnessed by its cooperative agreement with measurement experts through the NIH Roadmap Initiative known as PROMIS (www.nihpromis.org), is a big step in that direction. Our approach to item banking and CAT is practical; as focused on application as it is on science or theory. From a practical perspective, we frequently must decide whether to re-write and retest an item, add more items to fill gaps (often at the ceiling of the measure), re-test a bank after some modifications, or split up a bank into units that are more unidimensional, yet less clinically relevant or complete. These decisions are not easy, and yet they are rarely unforgiving. We encourage people to build practical tools that are capable of producing multiple short form measures and CAT administrations from common banks, and to further our understanding of these banks with various clinical populations and ages, so that with time the scores that emerge from these many activities begin to have not only a common metric and range, but a shared meaning and understanding across users. In this paper, we provide an overview of item banking and CAT, discuss our approach to item banking and its byproducts, describe testing options, discuss an
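    The idea of administering items "in a sequence ... determined by a computer programmed for precision" can be illustrated with one common selection rule: pick the unused item with the greatest Fisher information at the current trait estimate. The abstract does not say which rule the authors use, so this is a sketch under that assumption, with a hypothetical two-parameter logistic item bank.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a two-parameter logistic item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the index of the most informative item not yet administered.

    bank:         list of (a, b) tuples (discrimination, difficulty); hypothetical
    administered: set of indices already shown to the respondent
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    if not candidates:
        raise ValueError("item bank exhausted")
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


# Hypothetical three-item bank and a respondent currently estimated at theta = 0.
bank = [(1.0, -2.0), (1.0, 0.1), (1.0, 2.5)]
first = select_next_item(0.0, bank, administered=set())
second = select_next_item(0.0, bank, administered={first})
```

    A full adaptive test would re-estimate theta after each response and stop once a precision or length criterion is met; those steps are omitted here.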

  16. Development of the Perceived Nutrition Environment Measures Survey.

    Science.gov (United States)

    Green, Sarah H; Glanz, Karen

    2015-07-01

    Objective, observational measures of nutrition environments are now well established and widely used. Individuals' perceptions of their nutrition environments may be equally or more important, but are less well conceptualized, and comprehensive measures are not available. This paper describes the development of the Perceived Nutrition Environment Measures Survey (NEMS-P), its test-retest reliability, and its ability to discern differences between lower- and higher-SES neighborhoods. This research involved five steps: (1) development of a conceptual model and inventory of items; (2) expert review; (3) pilot testing and cognitive interviews; (4) revising the survey; and (5) administering the revised survey to participants in neighborhoods of high and low SES on two occasions to evaluate neighborhood differences and test-retest reliability. Data were collected in 2010 and 2011 and analyzed in 2011 and 2012. The final survey has 118 items. Fifty-three core items represent three types of perceived nutrition environments: community nutrition environment, consumer nutrition environment, and home food environment. Test-retest reliability for core constructs of perceived nutrition environments was moderate to good (0.52-0.83) for most measured constructs. Residents of higher-SES neighborhoods reported higher availability scores in stores, stronger agreement that healthy options were available in nearby restaurants, and higher scores for accessibility of healthy foods in their homes. The NEMS-P has moderate to good test-retest reliability and can discriminate perceptions of nutrition environments between residents of higher- and lower-SES neighborhoods. This survey is available and ready to be used. Copyright © 2015 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  17. Measuring Integration of Information and Communication Technology in Education: An Item Response Modeling Approach

    Science.gov (United States)

    Peeraer, Jef; Van Petegem, Peter

    2012-01-01

    This research describes the development and validation of an instrument to measure integration of Information and Communication Technology (ICT) in education. After literature research on definitions of integration of ICT in education, a comparison is made between the classical test theory and the item response modeling approach for the…

  18. Inclusion of Community in Self Scale: A Single-Item Pictorial Measure of Community Connectedness

    Science.gov (United States)

    Mashek, Debra; Cannaday, Lisa W.; Tangney, June P.

    2007-01-01

    We developed a single-item pictorial measure of community connectedness, building on the theoretical and methodological traditions of the self-expansion model (Aron & Aron, 1986). The Inclusion of Community in the Self (ICS) Scale demonstrated excellent test-retest reliability, convergent validity, and discriminant validity in a sample of 190…

  19. A single-item measure of social identification : Reliability, validity, and utility

    NARCIS (Netherlands)

    Postmes, Tom; Haslam, S. Alexander; Jans, Lise

    2013-01-01

    This paper introduces a single-item social identification measure (SISI) that involves rating one's agreement with the statement 'I identify with my group (or category)' followed by a 7-point scale. Three studies provide evidence of the validity (convergent, divergent, and test-retest) of SISI with a

  20. Measuring Experiential Avoidance: Reliability and Validity of the Dutch 9-item Acceptance and Action Questionnaire (AAQ)

    NARCIS (Netherlands)

    Boelen, P.A.; Reijntjes, A.H.A.

    2008-01-01

    Three studies evaluated psychometric properties of the Dutch version of the 9-item Acceptance and Action Questionnaire (AAQ), a self-report measure designed to assess experiential avoidance as conceptualized in Acceptance and Commitment Therapy (ACT). Study 1, among bereaved adults, showed that a one-

  1. Bayesian modeling of measurement error in predictor variables using item response theory

    NARCIS (Netherlands)

    Fox, Jean-Paul; Glas, Cees A.W.

    2003-01-01

    It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between t

  2. Writing multiple-choice test items that promote and measure critical thinking.

    Science.gov (United States)

    Morrison, S; Free, K W

    2001-01-01

    Faculties are concerned about measurement of critical thinking especially since the National League for Nursing Accrediting Commission cited such measurement as a requirement for accreditation (NLNAC, 1997). Some writers and researchers (Alfaro-LeFevre, 1995; Blat, 1989; McPeck, 1981, 1990) describe the need to measure critical thinking within the context of a specific discipline. Based on McPeck's position that critical thinking is discipline-specific, guidelines for developing multiple-choice test items as a means of measuring critical thinking within the discipline of nursing are discussed. Specifically, criteria described by Morrison, Smith, and Britt (1996) for writing critical-thinking multiple-choice test items are reviewed and explained for promoting and measuring critical thinking.

  3. Difficulty levels of OSCE items related to examination and measurement skills.

    Science.gov (United States)

    Kanada, Yoshikiyo; Sakurai, Hiroaki; Sugiura, Yoshito

    2015-03-01

    [Purpose] The difficulty levels of level-2 OSCE (examination and measurement skills) items were examined, with a view to providing reference data for the determination of students' skills. [Subjects] A total of 284 graduates of physical (PT) and occupational (OT) therapy classes of 2011 (59 and 40), 2012 (46 and 36), and 2013 (61 and 42, respectively) were studied, with PT or OT faculty members as OSCE examiners and a simulated patient. [Methods] Scores for 11 level-2 OSCE items were compared between before and after clinical training. [Results] Scores markedly increased after clinical training. On comparison among the items, scores for sensory examination were the highest, and those for interviews were the lowest. [Conclusion] The results of this study indicate the necessity of considering an appropriate combination of different difficulty levels when adopting OSCE-based educational approaches.

  4. Further Investigating Method Effects Associated with Negatively Worded Items on Self-Report Surveys

    Science.gov (United States)

    DiStefano, Christine; Motl, Robert W.

    2006-01-01

    This article used multitrait-multimethod methodology and covariance modeling for an investigation of the presence and correlates of method effects associated with negatively worded items on the Rosenberg Self-Esteem (RSE) scale (Rosenberg, 1989) using a sample of 757 adults. Results showed that method effects associated with negative item phrasing…

  6. Disability Items From the Current Population Survey (2008-2015) and Permanent Versus Temporary Disability Status.

    Science.gov (United States)

    Ward, Bryce; Myers, Andrew; Wong, Jennifer; Ravesloot, Craig

    2017-05-01

    To examine longitudinal responses to the disability indicator questions that have been adopted as the standard across national surveys sponsored by the US Department of Health and Human Services. Data from the Current Population Survey between 2008 and 2015 were linked to create a longitudinal sample of 721 178 individual respondents. Responses to the disability questions fluctuated significantly. Although 17% of all respondents reported a disability at some point, only 3% consistently reported the same set of disabilities. Demographic differences were found between people who always reported a consistent set of disabilities and those whose responses fluctuated. The disability questions capture 2 discrete groups: people who experience a permanent disability and those who experience a temporary disability. Demographic differences between these groups suggest that this is not simply due to measurement error.

  7. Reliability and Validity of MOS-36-item Short Form of Health Survey Measuring the Quality of Life among Disabled People

    Institute of Scientific and Technical Information of China (English)

    吴学华; 李小麟

    2011-01-01

    Objective To evaluate the validity and reliability of the MOS 36-item Short Form Health Survey (SF-36) in measuring the quality of life (QOL) of people disabled by injuries sustained in an earthquake. Methods A total of 201 disabled people injured in the earthquake in a town of Mianzhu city were investigated via a self-administered questionnaire combined with a face-to-face interview. The reliability of the SF-36 was assessed by test-retest reliability and Cronbach's α coefficient. The validity was assessed through factor analysis. Results The test-retest reliability of the SF-36 domains was: physical functioning (PF) 0.78, role limitation due to physical problems (RP) 0.85, bodily pain (BP) 0.92, general health (GH) 0.82, vitality (VT) 0.77, social functioning (SF) 0.71, role limitation due to emotional problems (RE) 0.79, and mental health (MH) 0.66. The Cronbach's α coefficients were as follows: PF 0.89, RP 0.75, BP 0.84, GH 0.86, VT 0.78, SF 0.72, RE 0.86, and MH 0.50. Six principal components were extracted by factor analysis; they broadly reflected the eight dimensions of the scale and were broadly consistent with its intended structure. Conclusion Administered by self-completion combined with face-to-face interviews, the SF-36 showed good reliability and validity for measuring the quality of life of people disabled by the earthquake in this area.

  8. Realizing a Rasch measurement through instructionally- sequenced domains of test items.

    Science.gov (United States)

    Schulz, E. Matthew

    2016-11-01

    This paper presents results from a project in which instructionally-sequenced domains were defined for purposes of constructing measures that conform to an ideal in Guttman scaling and Rasch measurement. A fundamental idea in these measurement systems is that every person higher on the measurement scale can do everything that lower-level persons can do, plus at least one more thing. This idea has had limited application in educational measurement due to the stochastic nature of item response data and the sheer number of items needed to obtain reliable measures. However, it has been shown by Schulz, Lee, and Mullen [1] that this ideal can be realized at a higher level of abstraction - when items within a content strand are aggregated into a small number of domains that are ordered in instructional timing and difficulty. The present paper shows how this was done, and the results, in an achievement level setting project for the 2007 Grade 12 NAEP Economics Assessment.

  9. Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests.

    Science.gov (United States)

    Fox, Mark C; Berry, Jane M; Freeman, Sara P

    2014-12-01

    Relatively high vocabulary scores of older adults are generally interpreted as evidence that older adults possess more of a common ability than younger adults. Yet, this interpretation rests on empirical assumptions about the uniformity of item-response functions between groups. In this article, we test item response models of differential responding against datasets containing younger-, middle-aged-, and older-adult responses to three popular vocabulary tests (the Shipley, Ekstrom, and WAIS-R) to determine whether members of different age groups who achieve the same scores have the same probability of responding in the same categories (e.g., correct vs. incorrect) under the same conditions. Contrary to the null hypothesis of measurement invariance, datasets for all three tests exhibit substantial differential responding. Members of different age groups who achieve the same overall scores exhibit differing response probabilities in relation to the same items (differential item functioning) and appear to approach the tests in qualitatively different ways that generalize across items. Specifically, younger adults are more likely than older adults to leave items unanswered for partial credit on the Ekstrom, and to produce 2-point definitions on the WAIS-R. Yet, older adults score higher than younger adults, consistent with most reports of vocabulary outcomes in the cognitive aging literature. In light of these findings, the most generalizable conclusion to be drawn from the cognitive aging literature on vocabulary tests is simply that older adults tend to score higher than younger adults, and not that older adults possess more of a common ability.

  10. The Laboratory Course Assessment Survey: A Tool to Measure Three Dimensions of Research-Course Design

    Science.gov (United States)

    Corwin, Lisa A.; Runyon, Christopher; Robinson, Aspen; Dolan, Erin L.

    2015-01-01

    Course-based undergraduate research experiences (CUREs) are increasingly being offered as scalable ways to involve undergraduates in research. Yet few if any design features that make CUREs effective have been identified. We developed a 17-item survey instrument, the Laboratory Course Assessment Survey (LCAS), that measures students' perceptions…

  11. Measuring the quality of life in hypertension according to Item Response Theory.

    Science.gov (United States)

    Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; Andrade, Dalton Francisco de; Barbetta, Pedro Alberto; Souza, Ana Célia Caetano de; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia

    2017-05-04

    To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL - Mini-questionnaire of Quality of Life in Hypertension) using Item Response Theory. This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the Item Response Theory analysis were: evaluation of dimensionality, estimation of item parameters, and construction of the scale. Dimensionality was studied using the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used Samejima's Graded Response Model. The analyses were conducted using the free software R with the aid of the psych and mirt packages. The analysis allowed the visualization of item parameters and their individual contributions to the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the progressive worsening of quality of life in five levels. Regarding the item parameters, the items related to the somatic state performed well, as they showed better power to discriminate individuals with worse quality of life. The items related to mental state were those that contributed the least psychometric information to the MINICHAL. We conclude that the instrument is suitable for identifying the worsening of quality of life in hypertension. The analysis of the MINICHAL using Item Response Theory allowed us to identify new aspects of this instrument that have not been addressed in previous studies.
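
    As an illustration of the model named in the abstract above, the following is a minimal sketch of category probabilities under Samejima's graded response model in its usual logistic form. It is not the authors' R code; the function name, the discrimination value and the thresholds are hypothetical and chosen only for demonstration.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait value of the respondent
    a          -- item discrimination (hypothetical value in the example)
    thresholds -- ordered category thresholds b_1 < b_2 < ... (hypothetical)
    """
    # Cumulative probabilities P(X >= k); P(X >= lowest) = 1, P(X > highest) = 0.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-a * (theta - b))))
    cumulative.append(0.0)
    # Probability of each category is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# A four-category item with symmetric, made-up thresholds.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

    A larger discrimination value makes the category probabilities change more sharply with the latent trait, which is the sense in which the abstract describes some items as better at discriminating between respondents.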

  12. Exhaustive measurement of food items in the home using a universal product code scanner

    Science.gov (United States)

    Stevens, June; Bryant, Maria; Wang, Lily; Borja, Judith; Bentley, Margaret E

    2011-01-01

    Objective We aimed to develop, test and describe the Exhaustive Home Food Inventory (EHFI), which measures foods in the home using scanning of the universal product code (UPC) and EHFI software to link codes to food identities and energy values. Design Observational design with up to three repeated measures in each household yielded a total of 218 inventories. Setting Eighty private households in North Carolina. Subjects Low-income African-American women with an infant between the ages of 12 and 18 months. Recruitment rate was 71%. Results Approximately 12 200 different food items were successfully recorded using the EHFI method. The average number of food items within a household was 147. The time required for the first measurement in a home declined from 157 to 136 min (P<0·05) for the first third compared to the last third of homes measured. In the sixty-four households in which three assessments were performed, the time required decreased from 145 to 97 min as did the time per item from 1·10 to 0·73 min. Conclusions It is feasible to record all foods and drinks in the home using UPC scanning. Further development and enhancement of databases linking UPC to food identification, nutrients and other information are needed. PMID:20602866

  13. Development of a Microsoft Excel tool for one-parameter Rasch model of continuous items: an application to a safety attitude survey

    Directory of Open Access Journals (Sweden)

    Tsair-Wei Chien

    2017-01-01

    Abstract Background Many continuous item responses (CIRs) are encountered in healthcare settings, but no one uses item response theory's (IRT) probabilistic modeling to produce graphical presentations for interpreting CIR results. A computer module that is programmed to deal with CIRs is required. We aimed to present a computer module, validate it, and verify its usefulness in dealing with CIR data, and then to apply the model to real healthcare data to show how CIRs can be applied in healthcare settings, using a safety attitude survey as an example. Methods Using Microsoft Excel VBA (Visual Basic for Applications), we designed a computer module that minimizes the residuals and calculates the model's expected scores according to person responses across items. Rasch models based on a Wright map and on KIDMAP were demonstrated to interpret results of the safety attitude survey. Results The author-made CIR module yielded OUTFIT mean square (MNSQ) and person measures equivalent to those yielded by professional Rasch Winsteps software. The probabilistic modeling of the CIR module provides messages that are much more valuable to users and shows the CIR advantage over classic test theory. Conclusions Because of advances in computer technology, healthcare users who are familiar with MS Excel can easily apply the study CIR module to deal with continuous variables to benefit comparisons of data with a logistic distribution and model fit statistics.

  14. A photographic method to measure food item intake. Validation in geriatric institutions.

    Science.gov (United States)

    Pouyet, Virginie; Cuvelier, Gérard; Benattar, Linda; Giboreau, Agnès

    2015-01-01

    From both a clinical and research perspective, measuring food intake is an important issue in geriatric institutions. However, weighing food in this context can be complex, particularly when the items remaining on a plate (side dish, meat or fish and sauce) need to be weighed separately following consumption. A method based on photography that involves taking photographs after a meal to determine food intake consequently seems to be a good alternative. This method enables the storage of raw data so that unhurried analyses can be performed to distinguish the food items present in the images. Therefore, the aim of this paper was to validate a photographic method to measure food intake in terms of differentiating food item intake in the context of a geriatric institution. Sixty-six elderly residents took part in this study, which was performed in four French nursing homes. Four dishes of standardized portions were offered to the residents during 16 different lunchtimes. Three non-trained assessors then independently estimated both the total and specific food item intakes of the participants using images of their plates taken after the meal (photographic method) and a reference image of one plate taken before the meal. Total food intakes were also recorded by weighing the food. To test the reliability of the photographic method, agreements between different assessors and agreements among various estimates made by the same assessor were evaluated. To test the accuracy and specificity of this method, food intake estimates for the four dishes were compared with the food intakes determined using the weighed food method. To illustrate the added value of the photographic method, food consumption differences between the dishes were explained by investigating the intakes of specific food items. 
    Although the assessors were not specifically trained for this purpose, the results demonstrated that their estimates agreed between assessors and among various estimates made by the same

  15. Item Banks for Measuring Emotional Distress from the Patient-Reported Outcomes Measurement Information System (PROMIS[R]): Depression, Anxiety, and Anger

    Science.gov (United States)

    Pilkonis, Paul A.; Choi, Seung W.; Reise, Steven P.; Stover, Angela M.; Riley, William T.; Cella, David

    2011-01-01

    The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS[R]). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and…

  16. Measuring Galaxy Environments with Deep Redshift Surveys

    CERN Document Server

    Cooper, Michael C.; Newman, Jeffrey A.; Madgwick, Darren S.; Gerke, Brian F.; Yan, Renbin; Davis, Marc

    2005-01-01

    We study the applicability of several galaxy environment measures (n^th-nearest-neighbor distance, counts in an aperture, and Voronoi volume) within deep redshift surveys. Mock galaxy catalogs are employed to mimic representative photometric and spectroscopic surveys at high redshift (z ~ 1). We investigate the effects of survey edges, redshift precision, redshift-space distortions, and target selection upon each environment measure. We find that even optimistic photometric redshift errors (σ_z = 0.02) smear out the line-of-sight galaxy distribution irretrievably on small scales; this significantly limits the application of photometric redshift surveys to environment studies. Edges and holes in a survey field dramatically affect the estimation of environment, with the impact of edge effects depending upon the adopted environment measure. These edge effects considerably limit the usefulness of smaller survey fields (e.g. the GOODS fields) for studies of galaxy environment. In even the poorest groups and c...
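
    The first environment measure listed in the abstract above, the n-th-nearest-neighbor distance, can be sketched in a few lines. This is a simplified illustration on flat projected coordinates, not the authors' pipeline: the function names and sample positions are hypothetical, the conversion to a surface density n / (π D_n²) is one common convention assumed here, and the edge effects and redshift windows the abstract warns about are ignored.

```python
import math


def nth_neighbor_distance(points, index, n):
    """Projected distance from points[index] to its n-th nearest neighbor."""
    x0, y0 = points[index]
    distances = sorted(
        math.hypot(x - x0, y - y0)
        for i, (x, y) in enumerate(points)
        if i != index
    )
    return distances[n - 1]


def surface_density(points, index, n):
    """Assumed convention: n objects inside a circle of radius D_n."""
    d_n = nth_neighbor_distance(points, index, n)
    return n / (math.pi * d_n ** 2)


# Hypothetical projected positions in arbitrary length units.
galaxies = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(nth_neighbor_distance(galaxies, 0, 3))
print(surface_density(galaxies, 0, 2))
```

    An object near a survey edge has part of its search circle outside the field, so its distance to the n-th neighbor is overestimated and its density underestimated; this is one way the edge effects described in the abstract arise.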

  17. Individual Social Capital and Its Measurement in Social Surveys

    Directory of Open Access Journals (Sweden)

    Keming Yang

    2007-01-01

    With its popularity has come an unresolved issue about social capital: is it an individual or a collective property, or both? Many researchers take it for granted that social capital is collective, but most social surveys implicitly measure social capital at the individual level. After reviewing the definitions by Bourdieu, Coleman, and Putnam, I have come to agree with Portes that social capital can be an individual asset and should first be analyzed as such; if social capital is to be analyzed as a collective property, then the analysis should explicitly draw on a clear definition of individual social capital. I thus define individual social capital as the features of social groups or networks that each individual member can access and use for obtaining further benefits. Four types of features are identified (basic, specific, generalized, and structural), and example formulations of survey questions are proposed. Following this approach, I then assess some survey questions organized under five themes commonly found in social surveys for measuring social capital: participation in organizations, social networks, trust, civic participation, and perceptions of local area. I conclude that most of these themes and questions only weakly or indirectly measure individual social capital; therefore, they should be strengthened with the conceptual framework proposed in this paper and complemented with the items used in independent surveys on social networks.

  18. Validity of temporal measures as proxies for measuring acculturation in Asian Indian survey respondents.

    Science.gov (United States)

    Bharmal, Nazleen; Hays, Ron D; McCarthy, William J

    2014-10-01

    There are few validated acculturation measures for Asian Indians in the U.S. We used the 2004 California Asian Indian Tobacco Survey to examine the relationship between temporal measures and eleven self-reported measures of acculturation. These items were combined to form an acculturation scale. We performed psychometric analysis of scale properties. Greater duration of residence in the U.S., greater percentage of lifetime in the U.S., and younger age at immigration were associated with more acculturated responses to the items for Asian Indians. Item-scale correlations for the 11-item acculturation scale ranged from 0.28-0.55 and internal consistency reliability was 0.73. Some support was found for a two-factor solution; one factor corresponding to cultural activities (α = 0.70) and the other to social behaviors (α = 0.59). Temporal measures only partially capture the full dimensions of acculturation. Our scale captured several domains and possibly two dimensions of acculturation.
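
    The internal consistency reliabilities quoted in the abstract above (for example α = 0.73) are Cronbach's alpha coefficients. The following is a minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores); the function name and the tiny response matrices are hypothetical and are not data from the study.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a matrix with one row per respondent
    and one column per item."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    # Sample variance of each respondent's summed scale score.
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Two made-up extremes: identical items, and items that do not covary.
consistent = [[1, 1], [2, 2], [3, 3], [4, 4]]
unrelated = [[1, 2], [2, 1], [1, 1], [2, 2]]
print(cronbach_alpha(consistent))
print(cronbach_alpha(unrelated))
```

    The two toy matrices bracket the usual range: alpha rises toward 1 as items covary more strongly and falls toward 0 when they share no variance.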

  19. Cognitive interviewing methodology in the development of a pediatric item bank: a patient reported outcomes measurement information system (PROMIS) study

    Directory of Open Access Journals (Sweden)

    DeWalt Darren A

    2009-01-01

    Abstract Background The evaluation of patient-reported outcomes (PROs) in health care has seen greater use in recent years, and methods to improve the reliability and validity of PRO instruments are advancing. This paper discusses the cognitive interviewing procedures employed by the Patient Reported Outcomes Measurement Information System (PROMIS) pediatrics group for the purpose of developing a dynamic, electronic item bank for field testing with children and adolescents using novel computer technology. The primary objective of this study was to conduct cognitive interviews with children and adolescents to gain feedback on items measuring physical functioning, emotional health, social health, fatigue, pain, and asthma-specific symptoms. Methods A total of 88 cognitive interviews were conducted with 77 children and adolescents across two sites on 318 items. From this initial item bank, 25 items were deleted and 35 were revised and underwent a second round of cognitive interviews. A total of 293 items were retained for field testing. Results Children as young as 8 years of age were able to comprehend the majority of items, response options, directions, and the recall period, and to identify problems with language that was difficult for them to understand. Cognitive interviews indicated issues with item comprehension on several items, which led to alternative wording for these items. Conclusion Children ages 8–17 years were able to comprehend most item stems and response options in the present study. Field testing with the resulting items and response options is presently being conducted as part of the PROMIS Pediatric Item Bank development process.

  20. Measuring and exposures from National Media Surveys

    DEFF Research Database (Denmark)

    Mortensen, Peter Stendahl

    2000-01-01

    National media surveys report the number and kind of people exposed to the media in question. This paper discusses to what extent these numbers may be used as measures of exposure to ads in those media. In this context attention is also focused on elements in the media surveys themselves that might invalidate the measures or make them unreliable, both when measuring a single exposure and accumulated exposures. Four media types will be discussed: TV, radio, print and the internet.

  1. A one-item workability measure mediates work demands, individual resources and health in the prediction of sickness absence

    DEFF Research Database (Denmark)

    Thorsen, Sannie Vester; Burr, Hermann; Diderichsen, Finn

    2012-01-01

    OBJECTIVES: The study tested the hypothesis that a one-item workability measure represented an assessment of the fit between resources (the individuals' physical and mental health and functioning) and workplace demands and that this resource/demand fit was a mediator in the prediction of sickness absence. We also estimated the relative importance of health and work environment for workability and sickness absence. METHODS: Baseline data were collected within a Danish work and health survey (3,214 men and 3,529 women) and followed up in a register of sickness absence. Probit regression analysis ... and quantitative demands. RESULTS: High age, poor health and ergonomic exposures were associated with low workability and mediated by workability to sickness absence for both genders. Low social class and low quantitative demands were associated with low workability and mediated to sickness absence among men ...

  2. Development and validation of a survey instrument to measure children's advertising literacy

    NARCIS (Netherlands)

    Rozendaal, E.; Opree, S.J.; Buijzen, M.A.

    2016-01-01

    The aim of this study was to develop and validate a survey measurement instrument for children's advertising literacy. Based on the multidimensional conceptualization of advertising literacy by Rozendaal, Lapierre, Van Reijmersdal, and Buijzen (2011), 39 items were created to measure two d

  4. Computing poverty measures with survey data

    OpenAIRE

    Philippe Van Kerm

    2009-01-01

    I discuss estimation of poverty measures from household survey data in Stata and show how to derive analytic standard errors that take into account survey design features. Where needed, standard errors are adjusted for the estimation of the poverty line as a fraction of the mean or median income. The linearization approach based on influence functions is generally applicable to many estimators.
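
    The abstract above concerns Stata; purely as an illustration of the point estimates involved, here is a minimal Python sketch of a weighted poverty measure with a poverty line set as a fraction of median income. The Foster-Greer-Thorbecke family is used as an assumed example of a poverty measure; the function names, sample incomes, weights and the 60% fraction are hypothetical. The analytic, design-based standard errors that are the actual subject of the abstract are not implemented.

```python
def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value


def fgt(incomes, weights, line, alpha=0):
    """Foster-Greer-Thorbecke index: alpha=0 is the headcount ratio,
    alpha=1 the average normalized poverty gap."""
    total_weight = sum(weights)
    acc = 0.0
    for income, weight in zip(incomes, weights):
        if income < line:
            acc += weight * ((line - income) / line) ** alpha
    return acc / total_weight


# Hypothetical survey incomes with equal sampling weights.
incomes = [10.0, 20.0, 30.0, 40.0, 100.0]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
poverty_line = 0.6 * weighted_median(incomes, weights)  # assumed 60% of median
print(poverty_line, fgt(incomes, weights, poverty_line, alpha=0))
```

    Because the poverty line here is itself estimated from the sample, its sampling variability feeds into the poverty measure, which is why the abstract notes that standard errors need adjusting in that case.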

  5. Assessment of Fatigue in Rheumatoid Arthritis: A Psychometric Comparison of Single-item, Multiitem, and Multidimensional Measures

    NARCIS (Netherlands)

    Oude Voshaar, M.A.H.; Klooster, P.M. ten; Bode, C.; Vonkeman, H.E.; Glas, C.A.; Jansen, T.L.Th.A.; Albada-Kuipers, I. van; Riel, P.L.C.M. van; Laar, M.A. van der

    2015-01-01

    OBJECTIVE: To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). METHODS: Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to ev

  6. Developing Items to Measure Theory of Planned Behavior Constructs for Opioid Administration for Children: Pilot Testing.

    Science.gov (United States)

    Vincent, Catherine; Riley, Barth B; Wilkie, Diana J

    2015-12-01

    The Theory of Planned Behavior (TpB) is useful to direct nursing research aimed at behavior change. As proposed in the TpB, individuals' attitudes, perceived norms, and perceived behavior control predict their intentions to perform a behavior and subsequently predict their actual performance of the behavior. Our purpose was to apply Fishbein and Ajzen's guidelines to begin development of a valid and reliable instrument for pediatric nurses' attitudes, perceived norms, perceived behavior control, and intentions to administer PRN opioid analgesics when hospitalized children self-report moderate to severe pain. Following Fishbein and Ajzen's directions, we were able to define the behavior of interest and specify the research population, formulate items for direct measures, elicit salient beliefs shared by our target population and formulate items for indirect measures, and prepare and test our questionnaire. For the pilot testing of internal consistency of measurement items, Cronbach alphas were between 0.60 and 0.90 for all constructs. Test-retest reliability correlations ranged from 0.63 to 0.90. Following Fishbein and Ajzen's guidelines was a feasible and organized approach for instrument development. In these early stages, we demonstrated good reliability for most subscales, showing promise for the instrument and its use in pain management research. Better understanding of the TpB constructs will facilitate the development of interventions targeted toward nurses' attitudes, perceived norms, and/or perceived behavior control to ultimately improve their pain behaviors toward reducing pain for vulnerable children. Copyright © 2015 American Society for Pain Management Nursing. Published by Elsevier Inc. All rights reserved.

  7. Assessing the Equivalence of Paper, Mobile Phone, and Tablet Survey Responses at a Community Mental Health Center Using Equivalent Halves of a 'Gold-Standard' Depression Item Bank.

    Science.gov (United States)

    Brodey, Benjamin B; Gonzalez, Nicole L; Elkin, Kathryn Ann; Sasiela, W Jordan; Brodey, Inger S

    2017-09-06

    The computerized administration of self-report psychiatric diagnostic and outcomes assessments has risen in popularity. If results are similar enough across different administration modalities, then new administration technologies can be used interchangeably and the choice of technology can be based on other factors, such as convenience in the study design. An assessment based on item response theory (IRT), such as the Patient-Reported Outcomes Measurement Information System (PROMIS) depression item bank, offers new possibilities for assessing the effect of technology choice upon results. To create equivalent halves of the PROMIS depression item bank and to use these halves to compare survey responses and user satisfaction among administration modalities (paper, mobile phone, or tablet) with a community mental health care population. The 28 PROMIS depression items were divided into 2 halves based on content and simulations with an established PROMIS response data set. A total of 129 participants were recruited from an outpatient public sector mental health clinic based in Memphis. All participants took both nonoverlapping halves of the PROMIS IRT-based depression items (Part A and Part B): once using paper and pencil, and once using either a mobile phone or tablet. An 8-cell randomization was done on technology used, order of technologies used, and order of PROMIS Parts A and B. Both Parts A and B were administered as fixed-length assessments and both were scored using published PROMIS IRT parameters and algorithms. All 129 participants received either Part A or B via paper assessment. Participants were also administered the opposite assessment, 63 using a mobile phone and 66 using a tablet. There was no significant difference in item response scores for Part A versus B. All 3 of the technologies yielded essentially identical assessment results and equivalent satisfaction levels. Our findings show that the PROMIS depression assessment can be divided into 2 equivalent

  8. Alternate item types: continuing the quest for authentic testing.

    Science.gov (United States)

    Wendt, Anne; Kenny, Lorraine E

    2009-03-01

    Many test developers suggest that multiple-choice items can be used to evaluate critical thinking if the items are focused on measuring higher order thinking ability. The literature supports the use of alternate item types to assess additional competencies, such as higher level cognitive processing and critical thinking, as well as ways to allow examinees to demonstrate their competencies differently. This research study surveyed nurses after taking a test composed of alternate item types paired with multiple-choice items. The participants were asked to provide opinions regarding the items and the item formats. Demographic information was also requested. In addition, information was collected as the participants responded to the items. The results of this study reveal that the participants thought that, in general, the items were more authentic and allowed them to demonstrate their competence better than multiple-choice items did. Further investigation into the optimal blend of alternate items and multiple-choice items is needed.

  9. Usefulness of a single-item measure of depression to predict mortality: the GAZEL prospective cohort study

    Science.gov (United States)

    Lefèvre, Thomas; Singh-Manoux, Archana; Stringhini, Silvia; Dugravot, Aline; Lemogne, Cédric; Consoli, Silla M.; Goldberg, Marcel; Zins, Marie

    2012-01-01

    Background: It remains unknown whether short measures of depression perform as well as long measures in predicting adverse outcomes such as mortality. The present study aims to examine the predictive value of a single-item measure of depression for mortality. Methods: A total of 14 185 participants of the GAZEL cohort completed the 20-item Center-for-Epidemiologic-Studies-Depression (CES-D) scale in 1996. One of these items (I felt depressed) was used as a single-item measure of depression. All-cause mortality data were available until 30 September 2009, a mean follow-up period of 12.7 years with a total of 650 deaths. Results: In a Cox regression model adjusted for baseline socio-demographic characteristics, a one-unit increase in the single-item score (range 0–3) was associated with a 25% higher risk of all-cause mortality (95% CI: 13–37%, P < 0.001). Further adjustment for health-related behaviours and physical chronic diseases reduced this risk by 36% and 8%, respectively. After adjustment for all these variables, every one-unit increase in the single-item score predicted a 15% increased risk of death (95% CI: 5–27%, P < 0.01). There is also evidence of a dose–response relationship between response scores on the single-item measure of depression and mortality. Conclusion: This study shows that a single-item measure of depression is associated with an increased risk of death. Given its simplicity and ease of administration, a very simple single-item measure of depression might be useful for identifying middle-aged adults at risk for elevated depressive symptoms in large epidemiological studies and clinical settings. PMID:21840893

  10. Adaptation of an Instrument for Measuring the Cognitive Complexity of Organic Chemistry Exam Items

    Science.gov (United States)

    Raker, Jeffrey R.; Trate, Jaclyn M.; Holme, Thomas A.; Murphy, Kristen

    2013-01-01

    Experts use their domain expertise and knowledge of examinees' ability levels as they write test items. The expert test writer can then estimate the difficulty of the test items subjectively. However, an objective method for assigning difficulty to a test item would capture the cognitive demands imposed on the examinee as well as be…

  11. The Single-Item Math Anxiety Scale: An Alternative Way of Measuring Mathematical Anxiety

    Science.gov (United States)

    Núñez-Peña, M. Isabel; Guilera, Georgina; Suárez-Pellicioni, Macarena

    2014-01-01

    This study examined whether the Single-Item Math Anxiety Scale (SIMA), based on the item suggested by Ashcraft, provided valid and reliable scores of mathematical anxiety. A large sample of university students (n = 279) was administered the SIMA and the 25-item Shortened Math Anxiety Rating Scale (sMARS) to evaluate the relation between the scores…

  12. An item response theory analysis of self-report measures of adult attachment.

    Science.gov (United States)

    Fraley, R C; Waller, N G; Brennan, K A

    2000-02-01

    Self-report measures of adult attachment are typically scored in ways (e.g., averaging or summing items) that can lead to erroneous inferences about important theoretical issues, such as the degree of continuity in attachment security and the differential stability of insecure attachment patterns. To determine whether existing attachment scales suffer from scaling problems, the authors conducted an item response theory (IRT) analysis of 4 commonly used self-report inventories: Experiences in Close Relationships scales (K. A. Brennan, C. L. Clark, & P. R. Shaver, 1998), Adult Attachment Scales (N. L. Collins & S. J. Read, 1990), Relationship Styles Questionnaire (D. W. Griffin & K. Bartholomew, 1994) and J. Simpson's (1990) attachment scales. Data from 1,085 individuals were analyzed using F. Samejima's (1969) graded response model. The authors' findings indicate that commonly used attachment scales can be improved in a number of important ways. Accordingly, the authors show how IRT techniques can be used to develop new attachment scales with desirable psychometric properties.

  13. U.S. Naval Unit Behavioral Health Needs Assessment Survey, Overview of Survey Items and Measures

    Science.gov (United States)

    2014-05-20

    other psychological distress, please seek help immediately. We encourage you to contact your unit’s chaplain or a mental health professional. If you...The prompt reads, “Have you ever been told by a doctor that you have the following conditions?” • Diabetes • Asthma • Overweight/ obesity ... psychological health of service members. Consequently, preserving the psychological health of U.S. service members is of paramount concern to military leaders

  14. Behavioral Health Needs Assessment Survey (BHNAS): Overview of Survey Items and Measures

    Science.gov (United States)

    2013-02-12

    are not opioids (including Celebrex, Vioxx, Bextra, topical lidocaine) • Prescription opioid/narcotic painkiller (including OxyContin, Percocet...Service Members (reversed) (1 = Never to 5 = Always). • Exhibit clear thinking and reasonable action under stress (1 = Never to 5 = Always). • Show...make evidence-based decisions and policy changes impacting the psychological health of Navy expeditionary sailors. The results constitute actionable

  15. Cultural Resources Survey of Three Iberville Parish Levee Enlargement and Revetment Construction Items

    Science.gov (United States)

    1993-09-22

    and four feet front, and forty arpents in depth, and bounded on one side by land of Bonaventura Leblanc, and on the other by Juan Hebert. It appears...and on the lower by land of Bonaventura Forest. This land was surveyed by Don Luis Andry, in the year 1772, in favor of the claimant, who obtained a...1772, In favor of Bonaventura Forest, who obtained a complete grant for the same In the year 1774, from Governor Unzaga; under which grant the

  16. Use and Misuse of the Likert Item Responses and Other Ordinal Measures.

    Science.gov (United States)

    Bishop, Phillip A; Herron, Robert L

    Likert, Likert-type, and ordinal-scale responses are very popular psychometric item scoring schemes for attempting to quantify people's opinions, interests, or perceived efficacy of an intervention and are used extensively in Physical Education and Exercise Science research. However, these numbered measures are generally considered ordinal and violate some statistical assumptions needed to evaluate them as normally distributed, parametric data. This is an issue because parametric statistics are generally perceived as being more statistically powerful than non-parametric statistics. To avoid possible misinterpretation, care must be taken in analyzing these types of data. The use of visual analog scales may be equally efficacious and provide somewhat better data for analysis with parametric statistics.

  17. The Development, Validation and Application of an External Criterion Measure of Achievement Test Item Bias.

    Science.gov (United States)

    Harms, Robert A.

    Based on John Rawls' theory of justice as fairness, a nine-item rating scale was developed to serve as a criterion in studies of test item bias. Two principles underlie the scale: (1) Within a defined usage, test items should not affect students so that they are unable to do as well as their abilities would indicate; and (2) within the domain of a…

  19. A survey of anatomical items relevant to the practice of rheumatology: upper extremity, head, neck, spine, and general concepts.

    Science.gov (United States)

    Villaseñor-Ovies, Pablo; Navarro-Zarza, José Eduardo; Saavedra, Miguel Ángel; Hernández-Díaz, Cristina; Canoso, Juan J; Biundo, Joseph J; Kalish, Robert A; de Toro Santos, Francisco Javier; McGonagle, Dennis; Carette, Simon; Alvarez-Nemegyei, José

    2016-12-01

    This study aimed to identify the anatomical items of the upper extremity and spine that are potentially relevant to the practice of rheumatology. Ten rheumatologists interested in clinical anatomy who published, taught, and/or participated as active members of Clinical Anatomy Interest groups (six seniors, four juniors), participated in a one-round relevance Delphi exercise. An initial, 560-item list that included 45 (8.0 %) general concepts items; 138 (24.8 %) hand items; 100 (17.8 %) forearm and elbow items; 147 (26.2 %) shoulder items; and 130 (23.2 %) head, neck, and spine items was compiled by 5 of the participants. Each item was graded for importance with a Likert scale from 1 (not important) to 5 (very important). Thus, scores could range from 10 (1 × 10) to 50 (5 × 10). An item score of ≥40 was considered most relevant to competent practice as a rheumatologist. Mean item Likert scores ranged from 2.2 ± 0.5 to 4.6 ± 0.7. A total of 115 (20.5 %) of the 560 initial items reached relevance. Broken down by categories, this final relevant item list was composed of 7 (6.1 %) general concepts items; 32 (27.8 %) hand items; 20 (17.4 %) forearm and elbow items; 33 (28.7 %) shoulder items; and 23 (17.6 %) head, neck, and spine items. In this Delphi exercise, a group of practicing academic rheumatologists with an interest in clinical anatomy compiled a list of anatomical items that were deemed important to the practice of rheumatology. We suggest these items be considered curricular priorities when training rheumatology fellows in clinical anatomy skills and in programs of continuing rheumatology education.
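    Note on the scoring rule in this record: each item receives one 1-to-5 Likert rating from each of the 10 panelists, the ratings are summed to a score between 10 and 50, and a score of 40 or more marks the item as relevant. A minimal sketch of that rule follows; the function names and the example ratings are hypothetical and not taken from the study.

```python
N_RATERS = 10             # panel size stated in the abstract
RELEVANCE_THRESHOLD = 40  # item score at or above this counts as relevant


def item_score(ratings):
    """Sum the panel's Likert ratings (each 1..5) for one anatomical item."""
    if len(ratings) != N_RATERS:
        raise ValueError(f"expected {N_RATERS} ratings, got {len(ratings)}")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(ratings)


def is_relevant(ratings):
    """Apply the >= 40 cut-off, equivalent to a mean rating of at least 4.0."""
    return item_score(ratings) >= RELEVANCE_THRESHOLD


if __name__ == "__main__":
    # Hypothetical ratings for two items.
    example_items = {
        "item A": [5, 4, 4, 5, 4, 4, 5, 3, 4, 4],
        "item B": [3, 2, 4, 3, 3, 2, 3, 4, 3, 2],
    }
    for name, ratings in example_items.items():
        print(name, item_score(ratings), is_relevant(ratings))
```

    With 10 raters, the threshold of 40 corresponds to a mean Likert rating of 4.0, which is how the summed scores relate to the mean item scores also reported in the abstract.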

  20. Cosmological measurements with forthcoming radio continuum surveys

    CSIR Research Space (South Africa)

    Raccanelli

    2012-08-01

    Full Text Available –819 (2012) doi:10.1111/j.1365-2966.2012.20634.x Cosmological measurements with forthcoming radio continuum surveys Alvise Raccanelli,1 (e-mail: alvise.raccanelli@port.ac.uk) Gong-Bo Zhao,1 David J. Bacon,1 Matt J. Jarvis,2,3 Will J. Percival,1 Ray P. Norris,4 Huub Röttgering,5 Filipe B. Abdalla... of Universe – radio continuum: galaxies. 1 INTRODUCTION Radio surveys for cosmology are entering a new phase with the construction of the Low Frequency Array (LOFAR) for radio astronomy (Röttgering 2003...

  1. A single-item global job satisfaction measure is associated with quantitative blood immune indices in white-collar employees.

    Science.gov (United States)

    Nakata, Akinori; Irie, Masahiro; Takahashi, Masaya

    2013-01-01

    Although a single-item job satisfaction measure has been shown to be as reliable and inclusive as multiple-item scales in relation to health, studies including immunological data are few. The purpose of this study was to evaluate the validity of single-item job and family life satisfaction based on its association with immune indices. A total of 189 white-collar employees (70% men) underwent a blood draw for the measurement of natural killer (NK), total T, and B cell counts as well as plasma immunoglobulin (Ig) G concentrations and completed single-item job and family life satisfaction measures, respectively. The response options for satisfaction measures were 'dissatisfied' (coded 1) to 'satisfied' (coded 4). Spearman's partial correlations controlling for cofactors revealed that increased job satisfaction was positively associated with NK cells (rsp=0.201, p=0.007) and IgG (rsp=0.178, p=0.018), while family life satisfaction was unrelated to immune indices. Those who reported a combination of low job/low family life satisfaction had significantly lower NK and higher B cell counts than those with a high job/high family life satisfaction. Our study suggests that the single-item summary measure of job satisfaction, but not family life satisfaction, may be a valid tool to evaluate immune status in healthy white-collar employees.

  2. Measures of language outcomes using the Aboriginal Children's Survey.

    Science.gov (United States)

    Findlay, Leanne C; Kohen, Dafna E

    2013-01-01

    Speech and language skills are an important developmental milestone for all children, and one of the most prevalent forms of developmental delay among Aboriginal children. However, population-based indicators of Aboriginal children's language outcomes are limited. Data from the Aboriginal Children's Survey (ACS) were used to examine measures of language for Aboriginal children who were 2 to 5 years of age. Responses to ACS questions on ability in any language were examined in exploratory factor analyses to determine possible language indicators. Construct validity was examined by regressing language outcomes onto socio-demographic characteristics known to be associated with children's language. Four language outcomes were identified and labelled: expressive language, mutual understanding, story-telling, and speech and language difficulties. The conceptualization of items from the ACS into separate language indicators can be used by researchers examining young Aboriginal children's language outcomes.

  3. Use of Item Response Theory to Examine a Cardiovascular Health Knowledge Measure for Adolescents with Elevated Blood Pressure

    Directory of Open Access Journals (Sweden)

    Stephanie L. Fitzpatrick

    2012-10-01

    Full Text Available The purpose of this study was to assess the psychometric properties of a cardiovascular health knowledge measure for adolescents using item response theory. The measure was developed in the context of a cardiovascular lifestyle intervention for adolescents with elevated blood pressure. The sample consisted of 167 adolescents (mean age = 16.2 years) who completed the Cardiovascular Health Knowledge Assessment (CHKA), a 34-item multiple choice test, at baseline and post-intervention. The CHKA was unidimensional and internal consistency was .65 at pretest and .74 at posttest. Rasch analysis results indicated that at pretest the items targeted adolescents with variable levels of health knowledge. However, based on results at posttest, additional hard items are needed to account for the increase in level of cardiovascular health knowledge at post-intervention. Change in knowledge scores was examined using Rasch analysis. Findings indicated there was significant improvement in health knowledge over time [t(119) = -10.3, p < .0001]. In summary, the CHKA appears to contain items that are good approximations of the construct cardiovascular health knowledge and items that target adolescents with moderate levels of knowledge. DOI: 10.2458/azu_jmmss.v3i1.16111

  4. Measurement invariance across educational levels and gender in 12-item Zarit Burden Interview (ZBI) on caregivers of people with dementia.

    Science.gov (United States)

    Lin, Chung-Ying; Ku, Li-Jung Elizabeth; Pakpour, Amir H

    2017-08-01

    The Zarit Burden Interview (ZBI) is a commonly used self-report to assess caregiver burden. A 12-item short form of the ZBI has been developed; however, its measurement invariance has not been examined across some different demographics. It is unclear whether different genders and educational levels of a population interpret the ZBI items similarly. Therefore, this study aimed to examine the measurement invariance of the 12-item ZBI across gender and educational levels in a Taiwanese sample. Caregivers who had a family member with dementia (n = 270) completed the ZBI through telephone interviews. Three confirmatory factor analysis (CFA) models were conducted: Model 1 was the configural model, Model 2 constrained all factor loadings, Model 3 constrained all factor loadings and item intercepts. Multiple group CFAs and the differential item functioning (DIF) contrast under Rasch analyses were used to detect measurement invariance across males (n = 100) and females (n = 170) and across educational levels of junior high schools and below (n = 86) and senior high schools and above (n = 183). The fit index differences between models supported the measurement invariance across gender and across educational levels (∆ comparative fit index (CFI) = -0.010 and 0.003; ∆ root mean square error of approximation (RMSEA) = -0.006 to 0.004). No substantial DIF contrast was found across gender and educational levels (value = -0.36 to 0.29). The ZBI is appropriate for combined use and for comparisons in caregivers across gender and different educational levels in Taiwan.

  5. Creating a Screening Measure of Health Literacy for the Health Information National Trends Survey.

    Science.gov (United States)

    Champlin, Sara; Mackert, Michael

    2016-03-01

    Create a screening measure of health literacy for use with the Health Information National Trends Survey (HINTS). Participants completed a paper-based survey. Items from the survey were used to construct a health literacy screening measure. A population-based survey conducted in geographic areas of high and low minority frequency and in Central Appalachia. Two thousand nine hundred four English-speaking participants were included in this study: 66% white, 93% completed high school, mean age = 52.53 years (SD = 16.24). A health literacy screening measure was created using four items included in the HINTS survey. Scores could range from 0 (no questions affirmative/correct) to 4 (all questions answered affirmatively/correctly). Multiple regression analysis was used to determine whether demographic variables known to predict health literacy were indeed associated with the constructed health literacy screening measure. The weighted average health literacy score was 2.63 (SD = 1.00). Those who were nonwhite (p = .0005), were older (p literacy screening measure scores. This study highlights the need to assess health literacy in national surveys, but also serves as evidence that screening measures can be created within existing datasets to give researchers the ability to consider the impact of health literacy. © The Author(s) 2016.

  6. Measuring Employee Engagement in Hotels: Survey Effectiveness

    OpenAIRE

    Kolomiets, Arina

    2016-01-01

    The objectives of this thesis are to understand the nature of employee engagement, analyse popular theories used to explain this phenomenon, and comprehend the importance of employee engagement for organizations; as well as to analyse the Employee Engagement Survey conducted by a leading hospitality organization, question its effectiveness, and propose suggestions for improving the process of measuring employee engagement to ensure live, valid data, so that managers can understand where their employees stand in ...

  7. Survey of emissivity measurement by radiometric methods.

    Science.gov (United States)

    Honner, M; Honnerová, P

    2015-02-01

    A survey of the state of the art in the field of spectral directional emissivity measurements by using radiometric methods is presented. Individual quantity types such as spectral, band, or total emissivity are defined. Principles of emissivity measurement by various methods (direct and indirect, and calorimetric and radiometric) are discussed. The paper is focused on direct radiometric methods. An overview of experimental setups is provided, including the design of individual parts such as the applied reference sources of radiation, systems of sample clamping and heating, detection systems, methods for the determination of surface temperature, and procedures for emissivity evaluation.

  8. Developing multiple-choices test items as tools for measuring the scientific-generic skills on solar system

    Science.gov (United States)

    Bhakti, Satria Seto; Samsudin, Achmad; Chandra, Didi Teguh; Siahaan, Parsaoran

    2017-05-01

    The aim of this research is to develop multiple-choice test items as tools for measuring scientific generic skills on the solar system. To achieve this aim, the researchers used the ADDIE model, consisting of Analysis, Design, Development, Implementation, and Evaluation, as the research method. The scientific generic skills studied were limited to five indicators: (1) indirect observation, (2) awareness of scale, (3) logical inference, (4) causal relation, and (5) mathematical modeling. The participants were 32 students at a junior high school in Bandung. The results show that the constructed multiple-choice test items were declared valid by the expert validator, and that, after testing, the developed items are able to measure scientific generic skills on the solar system.

  9. An item response theory analysis of DSM-IV diagnostic criteria for personality disorders: findings from the national epidemiologic survey on alcohol and related conditions.

    Science.gov (United States)

    Harford, Thomas C; Chen, Chiung M; Saha, Tulshi D; Smith, Sharon M; Hasin, Deborah S; Grant, Bridget F

    2013-01-01

    The purpose of this study was to evaluate the psychometric properties of DSM-IV symptom criteria for assessing personality disorders (PDs) in a national population and to compare variations in proposed symptom coding for social and/or occupational dysfunction. Data were obtained from a total sample of 34,653 respondents from Waves 1 and 2 of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). For each personality disorder, confirmatory factor analysis (CFA) established a 1-factor latent factor structure for the respective symptom criteria. A 2-parameter item response theory (IRT) model was applied to the symptom criteria for each PD to assess the probabilities of symptom item endorsements across different values of the underlying trait (latent factor). Findings were compared with a separate IRT model using an alternative coding of symptom criteria that requires distress/impairment to be related to each criterion. The CFAs yielded a good fit for a single underlying latent dimension for each PD. Findings from the IRT indicated that DSM-IV PD symptom criteria are clustered in the moderate to severe range of the underlying latent dimension for each PD and are peaked, indicating high measurement precision only within a narrow range of the underlying trait and lower measurement precision at lower and higher levels of severity. Compared with the NESARC symptom coding, the IRT results for the alternative symptom coding are shifted toward the more severe range of the latent trait but generally have lower measurement precision for each PD. The IRT findings provide support for a reliable assessment of each PD for both NESARC and alternative coding for distress/impairment. The use of symptom dysfunction for each criterion, however, raises a number of issues and implications for the DSM-5 revision currently proposed for Axis II disorders (American Psychiatric Association, 2010).
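    Note on the "peaked" precision described in this record: under a two-parameter IRT model, each symptom criterion has a discrimination a and a severity (location) b. The sketch below assumes the common logistic form of the model, which the abstract does not specify; the parameter values are hypothetical and are not estimates from NESARC. It shows why an item is informative only near its own severity level.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at latent trait level theta.

    a: discrimination (slope); b: severity/location on the trait scale.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P).

    It peaks at theta == b, where it equals a^2 / 4.
    """
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    a, b = 2.0, 1.5  # hypothetical: a discriminating item located in the severe range
    for theta in (-1.0, 0.0, 1.5, 3.0):
        print(theta, round(p_endorse(theta, a, b), 3),
              round(item_information(theta, a, b), 3))
```

    When most criteria for a disorder have similar, high b values, their information curves stack up in the same narrow region, which matches the abstract's description of high precision in the moderate to severe range and lower precision elsewhere.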

  10. Measurement equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference short form items: Application to ethnically diverse cancer and palliative care populations

    Directory of Open Access Journals (Sweden)

    Jeanne A. Teresi

    2016-06-01

    Full Text Available Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse socio-demographic groups. Methods: DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results: The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons. No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in

  11. Quality of life assessed with the medical outcomes study short form 36-item health survey of patients on renal replacement therapy: A systematic review and meta-analysis

    NARCIS (Netherlands)

    Y.S. Liem (Ylian Serina); J.L. Bosch (Johanna); L.R. Arends (Lidia); M.H. Heijenbrok-Kal (Majanka); M.G.M. Hunink (Myriam)

    2007-01-01

    Objectives: The Medical Outcomes Study Short Form 36-Item Health Survey (SF-36) is the most widely used generic instrument to estimate quality of life of patients on renal replacement therapy. The purpose of this study was to summarize and compare the published literature on quality of life

  13. Developing item banks for measuring pediatric generic health-related quality of life: an application of the International Classification of Functioning, Disability and Health for Children and Youth and item response theory.

    Science.gov (United States)

    Gandhi, Pranav K; Thompson, Lindsay A; Tuli, Sanjeev Y; Revicki, Dennis A; Shenkman, Elizabeth; Huang, I-Chan

    2014-01-01

    The purpose of this study was to develop item banks by linking items from three pediatric health-related quality of life (HRQoL) instruments using a mixed methodology. Secondary data were collected from 469 parents of children aged 8-16 years. The International Classification of Functioning, Disability and Health-Children and Youth (ICF-CY) served as a framework to compare the concepts of items from three HRQoL instruments. The structural validity of the individual domains was examined using confirmatory factor analyses. Samejima's Graded Response Model was used to calibrate items from different instruments. The known-groups validity of each domain was examined using the status of children with special health care needs (CSHCN). Concepts represented by the items in the three instruments were linked to 24 different second-level categories of the ICF-CY. Eight item banks representing eight unidimensional domains were created based on the linkage of the concepts measured by the items of the three instruments to the ICF-CY. The HRQoL results of CSHCN in seven out of eight domains (except personality) were significantly lower compared with children without special health care needs (p<0.05). This study demonstrates a useful approach to compare the item concepts from the three instruments and to generate item banks for a pediatric population.

  14. The 7-Item Generalized Anxiety Disorder Scale as a Tool for Measuring Generalized Anxiety in Multiple Sclerosis

    OpenAIRE

    Terrill, Alexandra L.; Hartoonian, Narineh; Beier, Meghan; Salem, Rana; Alschuler, Kevin

    2015-01-01

    Background: Generalized anxiety disorder (GAD) is common in multiple sclerosis (MS) but understudied. Reliable and valid measures are needed to advance clinical care and expand research in this area. The objectives of this study were to examine the psychometric properties of the 7-item Generalized Anxiety Disorder Scale (GAD-7) in individuals with MS and to analyze correlates of GAD.

  15. IDENTIFICATION OF MEASUREMENT ITEMS OF DESIGN REQUIREMENTS FOR LEAN AND AGILE SUPPLY CHAIN-CONFIRMATORY FACTOR ANALYSIS

    Directory of Open Access Journals (Sweden)

    D.Venkata Ramana

    2013-06-01

    This study uses confirmatory factor analysis to assess the construct validity, convergent validity, construct reliability and internal consistency of the items of strategic design requirements. The design requirements include use of information technology, sourcing procedures, new product development, flexible manufacturing functions and demand management supply chain network design, management, commitment and inventory management policies among manufacturers of volatile and unforeseeable products in Andhra Pradesh, India. This study suggested that the seven-factor model with 20 items of the leagile supply chain design requirements had a good fit. Further, the study showed a valid and reliable measurement to identify critical items among the design requirements of leagile supply chains.

  16. The development and validation of a novel outcome measure to quantify mobility in the dysvascular lower extremity amputee: the amputee single item mobility measure

    Science.gov (United States)

    Norvell, Daniel C; Williams, Rhonda M; Turner, Aaron P; Czerniecki, Joseph M

    2016-01-01

    Objective: This study describes the development and psychometric evaluation of a novel patient-reported single-item mobility measure. Design: Prospective cohort study. Setting: Four Veteran’s Administration Medical Centers. Subjects: Individuals undergoing their first major unilateral lower extremity amputation; 198 met inclusion criteria; of these, 113 (57%) enrolled. Interventions: None. Main measures: The Amputee Single Item Mobility Measure, a single item measure with scores ranging from 0 to 6, was developed by an expert panel, and concurrently administered with the Locomotor Capabilities Index-5 (LCI-5) and other outcome measures at six weeks, four months, and 12 months post-amputation. Criterion and construct validity, responsiveness, and floor/ceiling effects were evaluated. Responsiveness was assessed using the standardized response mean. Results: The overall mean 12-month Amputee Single Item Mobility Measure score was 3.39 ±1.4. Scores for transmetatarsal, transtibial, and transfemoral amputees were 4.2 (±1.3), 3.2 (±1.5), and 2.9 (±1.1), respectively. Amputee Single Item Mobility Measure scores demonstrated “large” and statistically significant correlations with the LCI-5 scores at six weeks (r = 0.72), four months (r = 0.81), and 12 months (r = 0.86). At four months and 12 months, the correlation between Amputee Single Item Mobility Measure scores and hours of prosthetic use were r = 0.69 and r = 0.66, respectively, and between Amputee Single Item Mobility Measure scores and Trinity Amputation and Prosthesis Experience Scales functional restriction scores were r = 0.45 and r = 0.67, respectively. Amputee Single Item Mobility Measure scores increased significantly from six weeks to 12 months post-amputation. Minimal floor/ceiling effects were demonstrated. Conclusions: In the unilateral dysvascular amputee, the Amputee Single Item Mobility Measure has strong criterion and construct validity, excellent

  17. Testing the ruler with item response theory: increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index.

    Science.gov (United States)

    Funk, Janette L; Rogge, Ronald D

    2007-12-01

    The present study took a critical look at a central construct in couples research: relationship satisfaction. Eight well-validated self-report measures of relationship satisfaction, including the Marital Adjustment Test (MAT; H. J. Locke & K. M. Wallace, 1959), the Dyadic Adjustment Scale (DAS; G. B. Spanier, 1976), and an additional 75 potential satisfaction items, were given to 5,315 online participants. Using item response theory, the authors demonstrated that the MAT and DAS provided relatively poor levels of precision in assessing satisfaction, particularly given the length of those scales. Principal-components analysis and item response theory applied to the larger item pool were used to develop the Couples Satisfaction Index (CSI) scales. Compared with the MAT and the DAS, the CSI scales were shown to have higher precision of measurement (less noise) and correspondingly greater power for detecting differences in levels of satisfaction. The CSI scales demonstrated strong convergent validity with other measures of satisfaction and excellent construct validity with anchor scales from the nomological net surrounding satisfaction, suggesting that they assess the same theoretical construct as do prior scales. Implications for research are discussed.

  18. Assessing Testlet Effect, Impact, Differential Testlet, and Item Functioning Using Cross-Classified Multilevel Measurement Modeling

    Directory of Open Access Journals (Sweden)

    Hamdollah Ravand

    2015-05-01

    The present study used the two-level testlet response model (MMMT-2) to assess impact, differential item functioning (DIF), and differential testlet functioning (DTLF) in a reading comprehension test. The data came from 21,641 applicants to English Masters’ programs at Iranian state universities. Testlet effects were estimated, and items and testlets that were functioning differentially for test takers of different genders and majors were identified. Also, parameter estimates obtained under MMMT-2 and those obtained under the two-level hierarchical generalized linear model (HGLM-2) were compared. The results indicated that ability estimates obtained under the two models were significantly different at the lower and upper ends of the ability distribution. In addition, it was found that ignoring local item dependence (LID) would result in overestimation of the precision of the ability estimates. As for the difficulty of the items, the estimates obtained under the two models were almost the same, but standard errors were significantly different.

  19. What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling.

    Science.gov (United States)

    Koller, Ingrid; Levenson, Michael R; Glück, Judith

    2017-01-01

    The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis.

  20. [Measuring workplace climate: reliability and validity of the 12-item Organizational Climate Scale (OCS-12)].

    Science.gov (United States)

    Fukui, Satoe; Haratani, Takashi; Toshima, Yutaka; Shima, Satoru; Takahashi, Masaya; Nakata, Akinori; Fukasawa, Kenji; Ohba, Sayo; Sato, Emi; Hirota, Yasuko

    2004-11-01

    In order to investigate the reliability and validity of the short version of the 30-item Organizational Climate Scale (OCS-30; Toshima and Matsuda, 1992, 1995), a self-administered questionnaire survey was conducted in a sample of 819 employees of two medium-sized private companies in Japan by using the OCS-30, the Generic Job Stress Questionnaire (GJSQ), and the 12-item General Health Questionnaire (GHQ-12). The OCS has two subscales, i.e., the Tradition Scale (TS) and the Organizational Environment Scale (OES). The organizational climate perceived by each worker can be grouped into four categories based on the subscale scores: low TS and high OES (Active), high TS and high OES (Governed), low TS and low OES (Disorganized), and high TS and low OES (Reluctant). Principal component analysis was performed on the OCS-30 (varimax rotation, the number of factors = 2), and 6 items for each factor, with factor loadings greater than 0.50, were selected for the short version, which constituted the 12-item Organizational Climate Scale (OCS-12). Cronbach's alpha reliability coefficients of the two subscales of the OCS-12 were acceptable: 0.63 for the TS and 0.71 for the OES. Both subscales of the OCS-12 were significantly correlated with the GHQ-12 and many subscales of the GJSQ, which indicated the good construct validity of the OCS-12. Among the 4 types of organizational climate categorized by the OCS-12, the "Active" group showed the lowest job stress scores. It is suggested that the OCS-12 could be a reliable and valid instrument for assessing workers' perception of workplace climate.
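
    The reliability coefficient reported in the abstract above, Cronbach's alpha, can be sketched as follows. This is a minimal illustration of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the toy data are assumptions and have no connection to the OCS-12 data set.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical responses of five people to a three-item scale.
    toy = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 2], [3, 3, 4]]
    print(round(cronbach_alpha(toy), 3))
```

    Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as a measure of internal consistency.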

  1. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form

    Science.gov (United States)

    Kalpakjian, Claire Z.; Tate, Denise G.; Kisala, Pamela A.; Tulsky, David S.

    2015-01-01

    Objective: To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Design: Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory- (IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. Participants: A total of 717 individuals with SCI completed the self-esteem items. Results: A unidimensional model was observed (CFI = 0.946; RMSEA = 0.087) and measurement precision was good (theta range between −2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. Conclusion: This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010972

  2. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    Science.gov (United States)

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory-(IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  3. A test of the International Personality Item Pool representation of the Revised NEO Personality Inventory and development of a 120-item IPIP-based measure of the five-factor model.

    Science.gov (United States)

    Maples, Jessica L; Guan, Li; Carter, Nathan T; Miller, Joshua D

    2014-12-01

    There has been a substantial increase in the use of personality assessment measures constructed using items from the International Personality Item Pool (IPIP) such as the 300-item IPIP-NEO (Goldberg, 1999), a representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992). The IPIP-NEO is free to use and can be modified to accommodate its users' needs. Despite the substantial interest in this measure, there is still a dearth of data demonstrating its convergence with the NEO PI-R. The present study represents an investigation of the reliability and validity of scores on the IPIP-NEO. Additionally, we used item response theory (IRT) methodology to create a 120-item version of the IPIP-NEO. Using an undergraduate sample (n = 359), we examined the reliability, as well as the convergent and criterion validity, of scores from the 300-item IPIP-NEO, a previously constructed 120-item version of the IPIP-NEO (Johnson, 2011), and the newly created IRT-based IPIP-120 in comparison to the NEO PI-R across a range of outcomes. Scores from all 3 IPIP measures demonstrated strong reliability and convergence with the NEO PI-R and a high degree of similarity with regard to their correlational profiles across the criterion variables (rICC = .983, .972, and .976, respectively). The replicability of these findings was then tested in a community sample (n = 757), and the results closely mirrored the findings from Sample 1. These results provide support for the use of the IPIP-NEO and both 120-item IPIP-NEO measures as assessment tools for measurement of the five-factor model. (c) 2014 APA, all rights reserved.

  4. A survey report: how hospitals measure liquidity.

    Science.gov (United States)

    Cleverley, W O; Massar, G S

    1983-11-01

    Liquidity is an important financial concept that is widely understood although not authoritatively defined. In many situations the actual assessment of liquidity is based on the relationship of current assets and current liabilities. Nationally, a decline in traditional measures of liquidity such as current and quick ratios has occurred for both general industry and the hospital industry. There are a variety of possible explanations for this trend, but one of special interest in this article was the effect of financial reporting practices. A recent Principles & Practices Board survey of Financial Analysis Service subscribers indicated that there is a potential for underreporting working capital (current assets less current liabilities) in the hospital industry. However, this does not necessarily imply that the recent decline in liquidity measures is in any way due to reporting practices. No information about changes in reporting practices was obtained in this study. Finally, the results of the study do suggest that examination of more than one liquidity indicator is useful. Specifically, restricting attention to just the current ratio could be misleading. In this vein, it is interesting to note that six measures of liquidity are used in the FAS. All may provide insight into an accurate assessment of liquidity.
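
    The liquidity indicators mentioned in the abstract above can be sketched in a few lines. Working capital follows the definition given in the abstract; the current ratio and the quick ratio use common textbook definitions, and the quick ratio in particular is an assumption, since the abstract does not say how the survey computed it. The function name and figures are hypothetical.

```python
def liquidity_measures(current_assets, inventories, current_liabilities):
    """Return three common liquidity indicators as a dictionary."""
    if current_liabilities <= 0:
        raise ValueError("current liabilities must be positive")
    return {
        # Definition stated in the abstract: current assets less current liabilities.
        "working_capital": current_assets - current_liabilities,
        # Current assets per unit of current liabilities.
        "current_ratio": current_assets / current_liabilities,
        # Assumed definition: excludes inventories as the least liquid current asset.
        "quick_ratio": (current_assets - inventories) / current_liabilities,
    }


if __name__ == "__main__":
    # Hypothetical hospital balance-sheet figures, in thousands.
    print(liquidity_measures(200.0, 50.0, 100.0))
```

    Two organizations with the same current ratio can differ in quick ratio when their inventories differ, which illustrates the abstract's point that a single indicator can mislead.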

  5. Development and Validation of the 34-Item Disability Screening Questionnaire (DSQ-34) for Use in Low and Middle Income Countries Epidemiological and Development Surveys.

    Directory of Open Access Journals (Sweden)

    Jean-François Trani

    Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In the absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach's Alpha and within each domain using a standardized Cronbach's Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach's Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling well above the accepted cutoff of 0.40 for India (0.82) and Nepal (0

  6. Development and Validation of the 34-Item Disability Screening Questionnaire (DSQ-34) for Use in Low and Middle Income Countries Epidemiological and Development Surveys

    Science.gov (United States)

    Trani, Jean-François; Babulal, Ganesh Muneshwar; Bakhshi, Parul

    2015-01-01

    Background: Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In the absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. Methods and Findings: The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach’s Alpha and within each domain using a standardized Cronbach’s Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach’s Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling well above the accepted cutoff of 0.40 for

  7. Average vs item response theory scores: an illustration using neighbourhood measures in relation to physical activity in adults with arthritis.

    Science.gov (United States)

    Mielenz, T J; Callahan, L F; Edwards, M C

    2017-01-01

    Our study had two main objectives: 1) to determine whether perceived neighbourhood physical features are associated with physical activity levels in adults with arthritis; and 2) to determine whether the conclusions are more precise when item response theory (IRT) scores are used instead of average scores for the perceived neighbourhood physical features scales. Information on health outcomes, neighbourhood characteristics, and physical activity levels was collected using a telephone survey of 937 participants with self-reported arthritis. Neighbourhood walkability and aesthetic features and physical activity levels were measured by self-report. Adjusted proportional odds models were constructed separately for each neighbourhood physical features scale. We found that among adults with arthritis, poorer perceived neighbourhood physical features (both walkability and aesthetics) are associated with decreased physical activity level compared to better perceived neighbourhood features. This association was only observed in our adjusted models when IRT scoring was employed with the neighbourhood physical feature scales (walkability scale: odds ratio [OR] 1.20, 95% confidence interval [CI] 1.02, 1.41; aesthetics scale: OR 1.32, 95% CI 1.09, 1.62), not when average scoring was used (walkability scale: OR 1.14, 95% CI 1.00, 1.30; aesthetics scale: OR 1.16, 95% CI 1.00, 1.36). In adults with arthritis, those reporting poorer walking and aesthetics features were found to have decreased physical activity levels compared to those reporting better features when IRT scores were used, but not when using average scores. This study may inform public health physical environmental interventions implemented to increase physical activity, especially since arthritis prevalence is expected to be close to 20% of the population in 2020. Based on NIH initiatives, future health research will utilize IRT scores. The differences found in this study may be a precursor for research on how past

  8. Testing whether the DSM-5 personality disorder trait model can be measured with a reduced set of items: An item response theory investigation of the Personality Inventory for DSM-5.

    Science.gov (United States)

    Maples, Jessica L; Carter, Nathan T; Few, Lauren R; Crego, Cristina; Gore, Whitney L; Samuel, Douglas B; Williamson, Rachel L; Lynam, Donald R; Widiger, Thomas A; Markon, Kristian E; Krueger, Robert F; Miller, Joshua D

    2015-12-01

    The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) includes an alternative model of personality disorders (PDs) in Section III, consisting in part of a pathological personality trait model. To date, the 220-item Personality Inventory for DSM-5 (PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2012) is the only extant self-report instrument explicitly developed to measure this pathological trait model. The present study used item response theory-based analyses in a large sample (n = 1,417) to investigate whether a reduced set of 100 items could be identified from the PID-5 that could measure the 25 traits and 5 domains. This reduced set of PID-5 items was then tested in a community sample of adults currently receiving psychological treatment (n = 109). Across a wide range of criterion variables including NEO PI-R domains and facets, DSM-5 Section II PD scores, and externalizing and internalizing outcomes, the correlational profiles of the original and reduced versions of the PID-5 were nearly identical (rICC = .995). These results provide strong support for the hypothesis that an abbreviated set of PID-5 items can be used to reliably, validly, and efficiently assess these personality disorder traits. The ability to assess the DSM-5 Section III traits using only 100 items has important implications in that it suggests these traits could still be measured in settings in which assessment-related resources (e.g., time, compensation) are limited.

  9. Mortality and health-related quality of life in prevalent dialysis patients: Comparison between 12-items and 36-items short-form health survey

    Directory of Open Access Journals (Sweden)

    Østhus Tone Brit

    2012-05-01

    Background: To assess health-related quality of life (HRQOL) with SF-12 and SF-36 and compare their abilities to predict mortality in chronic dialysis patients, after adjusting for traditional risk factors. Methods: The Short-Form Health Survey (SF-36) with the embedded SF-12 was applied in 301 dialysis patients cross-sectionally. Physical and mental component summary (PCS-36, MCS-36, PCS-12, and MCS-12) scores were calculated. Clinical and demographic data were collected. Mortality (followed for up to 4.5 years) was analyzed with Kaplan Meier plots and Cox proportional hazards, after censoring for renal transplantation. Exclusion factors were observation time [...] Results: In 252 patients (60.2 ± 15.5 years, 65.9% males, dialysis vintage 9.0, IQR 5.0-23.0 months), mortality during follow-up was 33.7% (85 deaths). Significant correlations were observed between PCS-36 and PCS-12 (ρ = 0.93, p [...] ρ = 0.95, p [...] χ2 = 15.3, p = 0.002) and PCS-36 (χ2 = 16.7, p = 0.001). MCS was not associated with mortality. Adjusted hazard ratios for mortality were 2.5 (95% CI 1.0-6.3, PCS-12) and 2.7 (1.1-6.4, PCS-36) for the lowest compared with the highest (“best perceived”) quartile of PCS. Conclusion: Compromised HRQOL is an independent predictor of poor outcome in dialysis patients. The SF-12 provided similar predictions of mortality as SF-36, and may serve as an applicable clinical tool because it requires less time to complete.

  10. Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form

    Science.gov (United States)

    Kisala, Pamela A.; Victorson, David; Pace, Natalie; Heinemann, Allen W.; Choi, Seung W.; Tulsky, David S.

    2015-01-01

    Objective: To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Design: Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. Participants: A total of 716 individuals with SCI completed the trauma items. Results: The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. Conclusion: The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available. PMID:26010967

  11. Toolbox of multiple-item measures aligning with the ICF Core Sets for children and youth with cerebral palsy.

    Science.gov (United States)

    Schiariti, Verónica; Tatla, Sandy; Sauve, Karen; O'Donnell, Maureen

    2017-03-01

    Selecting appropriate measure(s) for clinical and/or research applications for children and youth with Cerebral Palsy (CP) poses many challenges. The newly developed International Classification of Functioning, Disability and Health (ICF) Core Sets for children and youth with CP serve as universal guidelines for assessment, intervention and follow-up. The aims of this study were: 1) to identify valid and reliable measures used in studies with children and youth with CP, 2) to characterize the content of each measure using the ICF Core Sets for children and youth with CP as a framework, and finally 3) to create a toolbox of psychometrically sound measures covering the content of each ICF Core Set for children and youth with CP. All clearly defined multiple-item measures used in studies with CP between 1998 and 2015 were identified. Psychometric properties were extracted when available. Constructs of the measures were linked to the ICF Core Sets. Overall, 83 multiple-item measures were identified. Of these, 68 measures (80%) included reliability and validity testing. The majority of the measures were discriminative, generic and designed for school-aged children. The degree to which measures with proven psychometric properties represented the ICF Core Sets for children and youth with CP varied considerably. Finally, 25 valid and reliable measures aligned highly with the content of the ICF Core Sets, and as such, these measures are proposed as a novel ICF Core Sets-based toolbox of measures for CP. Our results will guide professionals seeking appropriate measures to meet their research and clinical needs worldwide. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  12. The Climate Change Attitude Survey: Measuring Middle School Student Beliefs and Intentions to Enact Positive Environmental Change

    Science.gov (United States)

    Christensen, Rhonda; Knezek, Gerald

    2015-01-01

    The Climate Change Attitude Survey is composed of 15 Likert-type attitudinal items selected to measure students' beliefs and intentions toward the environment with a focus on climate change. This paper describes the development of the instrument and psychometric performance characteristics including reliability and validity. Data were gathered…

  13. What range of trait levels can the Autism-Spectrum Quotient (AQ) measure reliably? An item response theory analysis.

    Science.gov (United States)

    Murray, Aja Louise; Booth, Tom; McKenzie, Karen; Kuenssberg, Renate

    2016-06-01

    It has previously been noted that inventories measuring traits that originated in a psychopathological paradigm can often reliably measure only a very narrow range of trait levels that are near and above clinical cutoffs. Much recent work has, however, suggested that autism spectrum disorder traits are on a continuum of severity that extends well into the nonclinical range. This implies a need for inventories that can capture individual differences in autistic traits from very high levels all the way to the opposite end of the continuum. The Autism-Spectrum Quotient (AQ) was developed based on a closely related rationale, but there has, to date, been no direct test of the range of trait levels that the AQ can reliably measure. To assess this, we fit a bifactor item response theory model to the AQ. Results suggested that the AQ measures moderately low to moderately high levels of a general autistic trait with good measurement precision. The reliable range of measurement was significantly improved by scoring the instrument using its 4-point response scale, rather than dichotomizing responses. These results support the use of the AQ in nonclinical samples, but suggest that items measuring very low and very high levels of autistic traits would be beneficial additions to the inventory.

  14. Application of multidimensional item response theory models to longitudinal data

    NARCIS (Netherlands)

    Marvelde, te Janneke M.; Glas, Cees A.W.; Van Landeghem, Georges; Van Damme, Jan

    2006-01-01

    The application of multidimensional item response theory (IRT) models to longitudinal educational surveys where students are repeatedly measured is discussed and exemplified. A marginal maximum likelihood (MML) method to estimate the parameters of a multidimensional generalized partial credit model

  15. The Effect of Answering in a Preferred Versus a Non-Preferred Survey Mode on Measurement

    Directory of Open Access Journals (Sweden)

    Jolene Smyth

    2014-12-01

    Previous research has shown that offering respondents their preferred mode can increase response rates, but the effect of doing so on how respondents process and answer survey questions (i.e., measurement) is unclear. In this paper, we evaluate whether changes in question format have different effects on data quality for those responding in their preferred mode than for those responding in a non-preferred mode for three question types (multiple answer, open-ended, and grid). Respondents were asked about their preferred mode in a 2008 survey and were recontacted in 2009. In the recontact survey, respondents were randomly assigned to one of two modes such that some responded in their preferred mode and others did not. They were also randomly assigned to one of two questionnaire forms in which the format of individual questions was varied. On the multiple answer and open-ended items, those who answered in a non-preferred mode seemed to take advantage of opportunities to satisfice when the question format allowed or encouraged it (e.g., selecting fewer items in the check-all than the forced-choice format and being more likely to skip the open-ended item when it had a larger answer box), while those who answered in a preferred mode did not. There was no difference on a grid formatted item across those who did and did not respond by their preferred mode, but results indicate that a fully labeled grid reduced item missing rates vis-à-vis a grid with only column heading labels. Results provide insight into the effect of tailoring to mode preference on commonly used questionnaire design features.

  16. Is a single-item visual analogue scale as valid, reliable and responsive as multi-item scales in measuring quality of life?

    NARCIS (Netherlands)

    Boer, A.G.E.M. de; Lanschot, J.J.B. van; Stalmeier, P.F.M.; Sandick, J.W. van; Hulscher, J.B.F.; Haes, J.C.J.M. de; Sprangers, M.A.G.

    2004-01-01

    PURPOSE: To compare the validity, reliability and responsiveness of a single, global quality of life question to multi-item scales. METHOD: Data were obtained from 83 consecutive patients with oesophageal adenocarcinoma undergoing either transhiatal or transthoracic oesophagectomy. Quality of life w

  17. Measuring Bulk Flows in Large Scale Surveys

    CERN Document Server

    Feldman, Hume A.; Watkins, Richard

    1993-01-01

    We follow a formalism presented by Kaiser to calculate the variance of bulk flows in large scale surveys. We apply the formalism to a mock survey of Abell clusters à la Lauer & Postman and find the variance in the expected bulk velocities in a universe with CDM, MDM and IRAS-QDOT power spectra. We calculate the velocity variance as a function of the 1-D velocity dispersion of the clusters and the size of the survey.

  18. 2007 Merit Principles Survey: Performance Management Measures

    Data.gov (United States)

    Merit Systems Protection Board — 2007 survey responses from selected Federal agencies summarizing the existence of positive performance management practices and employee engagement scores for each...

  19. Face Validity of the Single Work Ability Item: Comparison with Objectively Measured Heart Rate Reserve over Several Days

    Directory of Open Access Journals (Sweden)

    Nidhi Gupta

    2014-05-01

    Purpose: The purpose of this study was to investigate the face validity of the self-reported single item work ability with objectively measured heart rate reserve (%HRR) among blue-collar workers. Methods: We utilized data from 127 blue-collar workers (Female = 53; Male = 74) aged 18–65 years from the cross-sectional “New method for Objective Measurements of physical Activity in Daily living (NOMAD)” study. The workers reported their single item work ability and completed an aerobic capacity cycling test and objective measurements of heart rate reserve monitored with Actiheart for 3–4 days with a total of 5,810 h, including 2,640 working hours. Results: A significant moderate correlation between work ability and %HRR was observed among males (R = −0.33, P = 0.005), but not among females (R = 0.11, P = 0.431). In a gender-stratified multi-adjusted logistic regression analysis, males with high %HRR were more likely to report a reduced work ability compared to males with low %HRR [OR = 4.75, 95% confidence interval (95% CI) = 1.31 to 17.25]. However, this association was not found among females (OR = 0.26, 95% CI 0.03 to 2.16), and a significant interaction between work ability, %HRR and gender was observed (P = 0.03). Conclusions: The observed association between work ability and objectively measured %HRR over several days among male blue-collar workers supports the face validity of the single work ability item. It is a useful and valid measure of the relation between physical work demands and resources among male blue-collar workers. The contrasting association among females needs to be further investigated.

  20. Measurement equivalence in mixed mode surveys

    NARCIS (Netherlands)

    Hox, Joop; de Leeuw, Edith; Zijlmans, Eva

    2015-01-01

    Surveys increasingly use mixed mode data collection (e.g., combining face-to-face and web) because this controls costs and helps to maintain good response rates. However, a combination of different survey modes in one study, be it cross-sectional or longitudinal, can lead to different kinds of measu

  1. Measuring our Universe from Galaxy Redshift Surveys

    Directory of Open Access Journals (Sweden)

    Lahav Ofer

    2004-07-01

    Galaxy redshift surveys have achieved significant progress over the last couple of decades. Those surveys tell us in the most straightforward way what our local Universe looks like. While the galaxy distribution traces the bright side of the Universe, detailed quantitative analyses of the data have even revealed the dark side of the Universe dominated by non-baryonic dark matter as well as more mysterious dark energy (or Einstein's cosmological constant). We describe several methodologies of using galaxy redshift surveys as cosmological probes, and then summarize the recent results from the existing surveys. Finally we present our views on the future of redshift surveys in the era of precision cosmology.

  2. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    Science.gov (United States)

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  4. Item Construction Using Reflective, Formative, or Rasch Measurement Models: Implications for Group Work

    Science.gov (United States)

    Peterson, Christina Hamme; Gischlar, Karen L.; Peterson, N. Andrew

    2017-01-01

    Measures that accurately capture the phenomenon are critical to research and practice in group work. The vast majority of group-related measures were developed using the reflective measurement model rooted in classical test theory (CTT). Depending on the construct definition and the measure's purpose, the reflective model may not always be the…

  5. Evaluation of item candidates: the PROMIS qualitative item review.

    Science.gov (United States)

    DeWalt, Darren A; Rothrock, Nan; Yount, Susan; Stone, Arthur A

    2007-05-01

    One of the PROMIS (Patient-Reported Outcome Measurement Information System) network's primary goals is the development of a comprehensive item bank for patient-reported outcomes of chronic diseases. For its first set of item banks, PROMIS chose to focus on pain, fatigue, emotional distress, physical function, and social function. An essential step for the development of an item pool is the identification, evaluation, and revision of extant questionnaire items for the core item pool. In this work, we also describe the systematic process wherein items are classified for subsequent statistical processing by the PROMIS investigators. Six phases of item development are documented: identification of extant items, item classification and selection, item review and revision, focus group input on domain coverage, cognitive interviews with individual items, and final revision before field testing. Identification of items refers to the systematic search for existing items in currently available scales. Expert item review and revision was conducted by trained professionals who reviewed the wording of each item and revised as appropriate for conventions adopted by the PROMIS network. Focus groups were used to confirm domain definitions and to identify new areas of item development for future PROMIS item banks. Cognitive interviews were used to examine individual items. Items successfully screened through this process were sent to field testing and will be subjected to innovative scale construction procedures.

  6. Impact of different scoring algorithms applied to multiple-mark survey items on outcome assessment: an in-field study on health-related knowledge.

    Science.gov (United States)

    Domnich, A; Panatto, D; Arata, L; Bevilacqua, I; Apprato, L; Gasparini, R; Amicizia, D

    2015-01-01

    Health-related knowledge is often assessed through multiple-choice tests. Among the different types of formats, researchers may opt to use multiple-mark items, i.e. with more than one correct answer. Although multiple-mark items have long been used in the academic setting - sometimes with scant or inconclusive results - little is known about the implementation of this format in research on in-field health education and promotion. A study population of secondary school students completed a survey on nutrition-related knowledge, followed by a single-lecture intervention. Answers were scored by means of eight different scoring algorithms and analyzed from the perspective of classical test theory. The same survey was re-administered to a sample of the students in order to evaluate the short-term change in their knowledge. In all, 286 questionnaires were analyzed. Partial scoring algorithms displayed better psychometric characteristics than the dichotomous rule. In particular, the algorithm proposed by Ripkey and the balanced rule showed greater internal consistency and relative efficiency in scoring multiple-mark items. A penalizing algorithm in which the proportion of marked distracters was subtracted from that of marked correct answers was the only one that highlighted a significant difference in performance between natives and immigrants, probably owing to its slightly better discriminatory ability. This algorithm was also associated with the largest effect size in the pre-/post-intervention score change. The choice of an appropriate rule for scoring multiple-mark items in research on health education and promotion should consider not only the psychometric properties of single algorithms but also the study aims and outcomes, since scoring rules differ in terms of bias, reliability, difficulty, sensitivity to guessing and discrimination.
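
    The contrast between the dichotomous rule and the penalizing rule described in this abstract can be sketched as follows. This is only an illustrative reading of the one-sentence description above (proportion of marked distracters subtracted from proportion of marked correct answers); the function names, argument layout and the example item are assumptions, not the authors' implementation, and the other six algorithms (including the Ripkey and balanced rules) are not reproduced.

```python
def dichotomous_score(marked, correct):
    """All-or-nothing rule: 1.0 only when the marked set equals the answer key."""
    return 1.0 if set(marked) == set(correct) else 0.0


def penalized_score(marked, correct, options):
    """Proportion of correct answers marked minus proportion of distracters marked.

    Hypothetical helper: `options` is every choice offered by the item,
    `correct` the answer key, `marked` the respondent's selections.
    """
    marked, correct, options = set(marked), set(correct), set(options)
    if not correct:
        raise ValueError("a multiple-mark item needs at least one correct answer")
    distracters = options - correct
    hit_rate = len(marked & correct) / len(correct)
    # An item without distracters cannot be penalized.
    false_rate = len(marked & distracters) / len(distracters) if distracters else 0.0
    return hit_rate - false_rate


# Illustrative item with five options, two of them correct.
options, key = "ABCDE", "AC"
print(dichotomous_score("AD", key))         # one hit, one distracter: no credit
print(penalized_score("AD", key, options))  # partial credit: 1/2 - 1/3
```

    Under this reading the penalized score ranges from -1 (only distracters marked, all of them) to 1 (exactly the key marked), so it separates partially informed respondents that the dichotomous rule scores identically as 0.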

  7. Differential item functioning in Patient Reported Outcomes Measurement Information System® (PROMIS®) Physical Functioning short forms: Analyses across ethnically diverse groups

    Directory of Open Access Journals (Sweden)

    Richard N. Jones

    2016-06-01

    We analyzed physical functioning short form items derived from the PROMIS® item bank (PF16) using data from more than 5,000 recently diagnosed, ethnically diverse cancer patients. Our goal was to determine if the short form items demonstrated evidence of differential item functioning (DIF) according to sociodemographic characteristics in this clinical sample. We evaluated responses for evidence of unidimensionality, local independence (given a single common factor), differential item functioning, and DIF impact. DIF was evaluated attributable to sex, age (middle aged vs. younger and older), race/ethnicity (White vs. Black or African-American, Asian/Pacific Islander, Hispanic), and level of education. We used a multiple group confirmatory factor analysis with covariates approach, a multiple indicators multiple causes (MIMIC) model. We confirmed essential unidimensionality but some evidence for multidimensionality is present, particularly for basic activities of daily living items, and many instances of local dependence. The presence of local dependence calls for further review of the meaning and measurement of the physical functioning domain among cancer patients. Nearly every item demonstrated statistically significant DIF. In all group comparisons the impact of DIF was negligible. However, the Hispanic subgroup comparison revealed an impact estimate just below an arbitrary threshold for small impact. Within the limitations of local dependency violations, we conclude that items from a static short form derived from the PROMIS physical functioning item bank displayed trivial and ignorable DIF attributable to sex, race, ethnicity, age, and education among cancer patients.

  8. Development of an instrument to measure behavioral health function for work disability: item pool construction and factor analysis.

    Science.gov (United States)

    Marfeo, Elizabeth E; Ni, Pengsheng; Haley, Stephen M; Jette, Alan M; Bogusz, Kara; Meterko, Mark; McDonough, Christine M; Chan, Leighton; Brandt, Diane E; Rasch, Elizabeth K

    2013-09-01

    Objective: To develop a broad set of claimant-reported items to assess behavioral health functioning relevant to the Social Security disability determination processes, and to evaluate the underlying structure of behavioral health functioning for use in development of a new functional assessment instrument. Design: Cross-sectional. Setting: Community. Participants: Item pools of behavioral health functioning were developed, refined, and field tested in a sample of persons applying for Social Security disability benefits (N=1015) who reported difficulties working because of mental or both mental and physical conditions. Interventions: None. Main outcome measure: Social Security Administration Behavioral Health (SSA-BH) measurement instrument. Results: Confirmatory factor analysis (CFA) specified that a 4-factor model (self-efficacy, mood and emotions, behavioral control, social interactions) had the optimal fit with the data and was also consistent with our hypothesized conceptual framework for characterizing behavioral health functioning. When the items within each of the 4 scales were tested in CFA, the fit statistics indicated adequate support for characterizing behavioral health as a unidimensional construct along these 4 distinct scales of function. Conclusions: This work represents a significant advance both conceptually and psychometrically in assessment methodologies for work-related behavioral health. The measurement of behavioral health functioning relevant to the context of work requires the assessment of multiple dimensions of behavioral health functioning. Specifically, we identified a 4-factor model solution that represented key domains of work-related behavioral health functioning. These results guided the development and scale formation of a new SSA-BH instrument. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  9. Item-Level and Construct Evaluation of Early Numeracy Curriculum-Based Measures

    Science.gov (United States)

    Lee, Young-Sun; Lembke, Erica; Moore, Douglas; Ginsburg, Herbert P.; Pappas, Sandra

    2012-01-01

    The present study examined the technical adequacy of curriculum-based measures (CBMs) of early numeracy. Six 1-min early mathematics tasks were administered to 137 kindergarten and first-grade students, along with an omnibus test of early mathematics. The CBM measures included Count Out Loud, Quantity Discrimination, Number Identification, Missing…

  10. Test-taker perception of what test items measure: a potential impact of face validity on student learning

    National Research Council Canada - National Science Library

    Sato, Takanori; Ikeda, Naoki

    2015-01-01

    ... agree. University students in Japan and Korea (N = 179) were given past entrance examinations administered in the respective countries and asked to read test items and record what ability they thought each item...

  11. Measuring Sexual Violence on Campus: Climate Surveys and Vulnerable Groups

    Science.gov (United States)

    de Heer, Brooke; Jones, Lynn

    2017-01-01

    Since the 2014 "Not Alone" report on campus sexual assault, the use of climate surveys to measure sexual violence on campuses across the United States has increased considerably. The current study utilizes a quasi meta-analysis approach to examine the utility of general campus climate surveys, which include a measure of sexual violence,…

  12. Item Response Theory Analysis of Two Questionnaire Measures of Arthritis-Related Self-Efficacy Beliefs from Community-Based US Samples

    Directory of Open Access Journals (Sweden)

    Thelma J. Mielenz

    2010-01-01

    Using item response theory (IRT), we examined the Rheumatoid Arthritis Self-efficacy scale (RASE) collected from a People with Arthritis Can Exercise RCT (346 participants) and 2 subscales of the Arthritis Self-efficacy scale (ASE) collected from an Active Living Every Day (ALED) RCT (354 participants) to determine which one better identifies low arthritis self-efficacy in community-based adults with arthritis. The item parameters were estimated in Multilog using the graded response model. The 2 ASE subscales are adequately explained by one factor. There was evidence for 2 locally dependent item pairs; two items from these pairs were removed when we reran the model. The exploratory factor analysis results for RASE showed a multifactor solution which led to a 9-factor solution. In order to perform IRT analysis, one item from each of the 9 subfactors was selected. Both scales were effective at measuring a range of arthritis SE.

  13. Measuring single constructs by single items: Constructing an even shorter version of the “Short Five” personality inventory

    Science.gov (United States)

    Konstabel, Kenn; Lönnqvist, Jan-Erik; Leikas, Sointu; García Velázquez, Regina; Qin, Hiaying; Verkasalo, Markku; Walkowitz, Gari

    2017-01-01

    The aim of this study was to construct a short, 30-item personality questionnaire that would be, in terms of content and meaning of the scores, as comparable as possible with longer, well-established inventories such as NEO PI-R and its clones. To do this, we shortened the formerly constructed 60-item “Short Five” (S5) by half so that each subscale would be represented by a single item. We compared all possibilities of selecting 30 items (preserving balanced keying within each domain of the five-factor model) in terms of correlations with well-established scales, self-peer correlations, and clarity of meaning, and selected an optimal combination for each domain. The resulting shortened questionnaire, XS5, was compared to the original S5 using data from student samples in 6 different countries (Estonia, Finland, UK, Germany, Spain, and China), and a representative Finnish sample. The correlations between XS5 domain scales and their longer counterparts from well-established scales ranged from 0.74 to 0.84; the difference from the equivalent correlations for full version of S5 or from meta-analytic short-term dependability coefficients of NEO PI-R was not large. In terms of prediction of external criteria (emotional experience and self-reported behaviours), there were no important differences between XS5, S5, and the longer well-established scales. Controlling for acquiescence did not improve the prediction of criteria, self-peer correlations, or correlations with longer scales, but it did improve internal reliability and, in some analyses, comparability of the principal component structure. XS5 can be recommended as an economic measure of the five-factor model of personality at the level of domain scales; it has reasonable psychometric properties, fair correlations with longer well-established scales, and it can predict emotional experience and self-reported behaviours no worse than S5. When subscales are essential, we would still recommend using the full version

  15. Development of Scalable Test Items to Measure Thinking Levels of Secondary School Students. Final Report.

    Science.gov (United States)

    Smith, Richard B.

    A workable theory of instruction appears to depend upon the development of measurement instruments, which operationally define concepts such as transfer, analysis, synthesis, and evaluation, to make possible empirical investigation of the relationships of methods of instruction and learner variables to the attainment of different types of…

  16. A one-item question with a Likert or Visual Analog Scale adequately measured current anxiety.

    Science.gov (United States)

    Davey, Heather M; Barratt, Alexandra L; Butow, Phyllis N; Deeks, Jonathan J

    2007-04-01

    To determine whether a single question with a Likert Scale or a Visual Analog Scale (VAS) response adequately measures current anxiety. Consecutive English-speaking adult women attending a dedicated breast clinic in a major Australian city were invited to complete a demographic questionnaire, the State Trait Anxiety Inventory (STAI), and a single question with a five-point Likert Scale response and a VAS in random order. Only women who completed the STAI were included in analyses. Four hundred of 497 (80%) eligible women agreed to participate. Both measures were adequate predictors of the STAI score; correlation with STAI was 0.78 (95% confidence interval [CI] 0.73-0.82) for the VAS and 0.75 (95% CI 0.70-0.79) for the Likert Scale. However, 11% of women incorrectly completed the VAS limiting its usefulness. A single question with either a Likert Scale or VAS response may be an adequate replacement for the STAI. Both measures quickly and easily assess anxiety and may be useful for research purposes when researchers have very limited time or questionnaire space or need to reduce the burden on participants of completing many measures.

  17. Towards the Measurement of EFL Listening Beliefs with Item Response Theory Methods

    Science.gov (United States)

    Nix, John-Michael L.; Tseng, Wen-Ta

    2014-01-01

    The present research aims to identify the underlying English listening belief structure of English-as-a-foreign-language (EFL) learners, thereby informing methodologies for subsequent analysis of beliefs with respect to listening achievement. Development of a measurement model of English listening learning beliefs entailed the creation of an…

  18. [Repeated measurement of memory with valenced test items: verbal memory, working memory and autobiographic memory].

    Science.gov (United States)

    Kuffel, A; Terfehr, K; Uhlmann, C; Schreiner, J; Löwe, B; Spitzer, C; Wingenfeld, K

    2013-07-01

    A large number of questions in clinical and/or experimental neuropsychology require the multiple repetition of memory tests at relatively short intervals. Studies on the impact of the associated exercise and interference effects on the validity of the test results are rare. Moreover, hardly any neuropsychological instruments exist to date to record the memory performance with several parallel versions in which the emotional valence of the test material is also taken into consideration. The aim of the present study was to test whether a working memory test (WST, a digit-span task with neutral or negative distraction stimuli) devised by our workgroup can be used with repeated measurements. This question was also examined in parallel versions of a wordlist learning paradigm and an autobiographical memory test (AMT). Both tests contained stimuli with neutral, positive and negative valence. Twenty-four participants completed the memory testing including the working memory test and three versions of a wordlist and the AMT at intervals of a week apiece (measuring points 1 to 3). The results reveal consistent performances across the three measuring points in the working and autobiographical memory test. The valence of the stimulus material did not influence the memory performance. In the delayed recall of the wordlist an improvement in memory performance over time was seen. The tests on working memory presented and the parallel versions for the declarative and autobiographical memory constitute informal economic instruments within the scope of the measurement repeatability designs. While the WST and AMT are appropriate for study designs with repeated measurements at relatively short intervals, longer intervals might seem more favourable for the use of wordlist learning paradigms. © Georg Thieme Verlag KG Stuttgart · New York.

  19. Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments

    Energy Technology Data Exchange (ETDEWEB)

    Fisher, W P Jr [LivingCapitalMetrics.com, 5252 Annunciation St, New Orleans, Louisiana 70115 (United States)]; Elbaum, B [University of Miami, Florida (United States)]; Coulter, A [Louisiana State University, New Orleans, Louisiana (United States)]

    2010-07-01

    Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.
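
    The conventional view mentioned in this abstract, that reliability depends on both internal consistency and the number of items, can be illustrated with the standardized form of Cronbach's alpha. This is a standard classical test theory formula offered as background, not the probabilistic formulation the authors develop; the symbols k (number of items) and r-bar (mean inter-item correlation) and the numeric values are illustrative assumptions.

```latex
% Standardized Cronbach's alpha for k items with mean inter-item correlation \bar{r}
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}}

% Same consistency (\bar{r} = 0.3), different test lengths:
%   k = 5:   \alpha_{\text{std}} = \frac{5 \times 0.3}{1 + 4 \times 0.3}   = \frac{1.5}{2.2} \approx 0.68
%   k = 20:  \alpha_{\text{std}} = \frac{20 \times 0.3}{1 + 19 \times 0.3} = \frac{6.0}{6.7} \approx 0.90
```

    With the mean inter-item correlation held fixed, lengthening the instrument alone raises the coefficient, which is why a single reliability value does not by itself show how much is due to consistency and how much to length.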

  20. Maximizing measurement efficiency of behavior rating scales using Item Response Theory: An example with the Social Skills Improvement System - Teacher Rating Scale.

    Science.gov (United States)

    Anthony, Christopher J; DiPerna, James C; Lei, Pui-Wa

    2016-04-01

    Measurement efficiency is an important consideration when developing behavior rating scales for use in research and practice. Although most published scales have been developed within a Classical Test Theory (CTT) framework, Item Response Theory (IRT) offers several advantages for developing scales that maximize measurement efficiency. The current study provides an example of using IRT to maximize rating scale efficiency with the Social Skills Improvement System - Teacher Rating Scale (SSIS - TRS), a measure of student social skills frequently used in practice and research. Based on IRT analyses, 27 items from the Social Skills subscales and 14 items from the Problem Behavior subscales of the SSIS - TRS were identified as maximally efficient. In addition to maintaining similar content coverage to the published version, these sets of maximally efficient items demonstrated similar psychometric properties to the published SSIS - TRS.

  1. Development of a cross-cultural item bank for measuring quality of life related to mental health in multiple sclerosis patients.

    Science.gov (United States)

    Michel, Pierre; Auquier, Pascal; Baumstarck, Karine; Pelletier, Jean; Loundou, Anderson; Ghattas, Badih; Boyer, Laurent

    2015-09-01

    Quality of life (QoL) measurements are considered important outcome measures both for research on multiple sclerosis (MS) and in clinical practice. Computerized adaptive testing (CAT) can improve the precision of measurements made using QoL instruments while reducing the burden of testing on patients. Moreover, a cross-cultural approach is also necessary to guarantee the wide applicability of CAT. The aim of this preliminary study was to develop a calibrated item bank that is available in multiple languages and measures QoL related to mental health by combining one generic (SF-36) and one disease-specific questionnaire (MusiQoL). Patients with MS were enrolled in this international, multicenter, cross-sectional study. The psychometric properties of the item bank were based on classical test and item response theories and approaches, including the evaluation of unidimensionality, item response theory model fitting, and analyses of differential item functioning (DIF). Convergent and discriminant validities of the item bank were examined according to socio-demographic, clinical, and QoL features. A total of 1992 patients with MS from 15 countries were enrolled to calibrate the 22-item bank. The strict monotonicity of the Cronbach's alpha curve, the high eigenvalue ratio estimator (5.50), and the adequate CFA model fit (RMSEA = 0.07 and CFI = 0.95) indicated that a strong assumption of unidimensionality was warranted. The infit mean square statistic ranged from 0.76 to 1.27, indicating a satisfactory item fit. DIF analyses revealed no item biases across geographical areas, confirming the cross-cultural equivalence of the item bank. External validity testing revealed that the item bank scores correlated significantly with QoL scores but also showed discriminant validity for socio-demographic and clinical characteristics. This work demonstrated satisfactory psychometric characteristics for a QoL item bank for MS in multiple

  2. Automatic identification of NDA measured items: Use of E-tags

    Energy Technology Data Exchange (ETDEWEB)

    Chitumbo, K.; Olsen, R. [International Atomic Energy Agency (United States)]; Hatcher, C.R. [Los Alamos National Lab., NM (United States)]; Kadner, S.P. [Aquila Technologies Group, Inc. (United States)]

    1995-07-01

    This paper describes how electronic identification devices or E-tags could reduce the time spent by IAEA inspectors making nondestructive assay (NDA) measurements. As one example, the use of E-tags with a high-level neutron coincidence counter (HLNC) is discussed in detail. Sections of the paper include inspection procedures, system description, software, and future plans. Mounting of E-tags, modifications to the HLNC, and the use of tamper indicating devices are also discussed. The technology appears to have wide application to different types of nuclear facilities and inspections and could significantly change NDA inspection procedures.

  3. Measuring the Accuracy of Survey Responses using Administrative Register Data

    DEFF Research Database (Denmark)

    Kreiner, Claus Thustrup; Lassen, David Dreyer; Leth-Petersen, Søren

    2015-01-01

    This paper shows how Danish administrative register data can be combined with survey data at the person level and be used to validate information collected in the survey. Register data are collected by automatic third party reporting and the potential errors associated with the two data sources are therefore plausibly orthogonal. Two examples are given to illustrate the potential of combining survey and register data. In the first example expenditure survey records with information about total expenditure are merged with income tax records holding information about income and wealth. Income and wealth data are used to impute total expenditure which is then compared to the survey measure. Results suggest that the two measures match each other well on average. In the second example we compare responses to a one-shot recall question about total gross personal income, collected in another survey...

  4. Measuring customer satisfaction using SERQUAL survey

    Directory of Open Access Journals (Sweden)

    Ardeshir Tajzadeh Namin

    2012-04-01

    Full Text Available This research assesses the quality of services at Tehran’s Saman bank and the gap between customers' expectations and perceptions. It also investigates the relationship between customer satisfaction and each dimension of service quality (i.e., reliability, tangibility, responsiveness, assurance and empathy) and ranks the dimensions accordingly. The statistical population of this research consists of Tehran’s Saman bank customers. The research methods are descriptive-survey and correlational. The statistical approaches are correlation, Student's t, and Friedman tests. The results from a sample of 276 show that the service quality dimensions affect customers' perception based on SERQUAL. In addition, there is a significant relationship between customers' perception and their satisfaction with the offered services. However, there are negative gaps between customers' perception and their level of expectation.

  5. Working with Missing Data: Imputation of Nonresponse Items in Categorical Survey Data with a Non-Monotone Missing Pattern

    OpenAIRE

    Wilson, Machelle D; Kerstin Lueck

    2014-01-01

    The imputation of missing data is often a crucial step in the analysis of survey data. This study reviews typical problems with missing data and discusses a method for the imputation of missing survey data with a large number of categorical variables which do not have a monotone missing pattern. We develop a method for constructing a monotone missing pattern that allows for imputation of categorical data in data sets with a large number of variables using a model-based MCMC approach. We repor...

  6. A composite score for a measuring instrument utilising re-scaled Likert values and item weights from matrices of pairwise ratios

    Directory of Open Access Journals (Sweden)

    Angie Hennessy

    2009-04-01

    Full Text Available

    A methodology is proposed to develop a measuring instrument (metric) for evaluating subjects from a population that cannot provide data to facilitate the development of such a metric (e.g. pre-term infants in the neonatal intensive care unit). Central to this methodology is the employment of an expert group that decides on the items to be included in the metric, the weights assigned to these items, and an index associated with the Likert scale points for each item. The experts supply pairwise ratios of importance between items, and the geometric mean method is applied to these to establish the item weights – a well-established procedure in multi-criteria decision analysis. The ratios are found by having a managed discussion before asking the members of the expert panel to mark a visual analogue scale for each item.

    How to cite this article:
    Becker, P.J., Wolvaardt, J.S., Hennessy, A. & Maree, C., 2009, 'A composite score for a measuring instrument utilising re-scaled Likert values and item weights from matrices of pair wise ratios
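
    The geometric mean method mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the experts' pairwise ratios are already collected in a square matrix whose entry [i][j] states how many times more important item i is than item j; the function name and the example matrix are hypothetical and are not taken from the article.

```python
import math


def geometric_mean_weights(ratios):
    """Return normalised item weights from a square matrix of pairwise ratios."""
    n = len(ratios)
    # Geometric mean of each row: the n-th root of the product of its entries.
    row_means = [math.prod(row) ** (1.0 / n) for row in ratios]
    total = sum(row_means)
    # Normalise so that the weights sum to 1.
    return [mean / total for mean in row_means]


# Hypothetical 3-item example (illustrative values only).
example_ratios = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = geometric_mean_weights(example_ratios)
```

    For this perfectly consistent matrix the weights come out in the ratio 4 : 2 : 1; with real expert judgements the matrix is usually not fully consistent, and the geometric mean acts as a compromise across the rows.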

  7. Photovoltaics characterization: A survey of diagnostic measurements

    Energy Technology Data Exchange (ETDEWEB)

    Kazmerski, L.L. [Center for Measurements and Characterization, National Renewable Energy Laboratory, Golden, Colorado 80401 (United States)

    1998-10-01

    The advancement of the photovoltaic technology is closely linked to the standard evaluation of the product, the diagnosis of problems, the validation of materials and cell properties, and the engineering and documentation of the ensemble of device properties from internal interfaces through power outputs. The focus of this paper is on some of the more common, visible, and important techniques dealing with physical-chemical through electro-optical parameters, which are linked intimately to the performance quality of materials and devices. Two areas, defined by their spatial-resolution qualities, are emphasized: macroscale and microscale measurement technologies. The importance, strengths, and limitations of these techniques are stressed, especially their significance to photovoltaics. Included are several techniques that have been developed specifically to address problems and requirements for photovoltaics. The regime of measurement literally covers arrays through atoms. © 1998 Materials Research Society.

  8. Feasibility and diagnostic accuracy of the Patient-Reported Outcomes Measurement Information System (PROMIS) item banks for routine surveillance of sleep and fatigue problems in ambulatory cancer care.

    Science.gov (United States)

    Leung, Yvonne W; Brown, Catherine; Cosio, Andrea Perez; Dobriyal, Aditi; Malik, Noor; Pat, Vivien; Irwin, Margaret; Tomasini, Pascale; Liu, Geoffrey; Howell, Doris

    2016-09-15

    Routine screening for problematic symptoms is emerging as a best practice in cancer systems globally. The objective of this observational study was to assess the feasibility and diagnostic accuracy of Patient-Reported Outcomes Measurement Information System (PROMIS) computerized adaptive testing (CAT) for fatigue and sleep-disturbance items compared with legacy measures in routine ambulatory cancer care. Patients who attended outpatient clinics at the Princess Margaret Cancer Center completed PROMIS CAT item banks and legacy measures (the Functional Assessment of Chronic Illness Therapy [FACIT]-Fatigue scale and the Insomnia Severity Index [ISI]) using tablet computers during clinic visits. The completion rates, patient acceptability, and diagnostic accuracy of PROMIS CAT were evaluated against legacy measures using receiver operating characteristic (ROC) curve analysis. Participants consisted of 336 patients (mean age ± standard deviation, 57.4 ± 15.7 years; 55% females; 75% Caucasian). Over 98% of patients did not find symptom screening burdensome, although only 65% were willing to complete the survey at every visit. PROMIS CAT scores were significantly correlated with both FACIT-Fatigue scores (r = -0.83) and ISI scores (r = -0.57; p < 0.0001 for all). Areas under the curve (AUC) by ROC analysis for fatigue were 0.946 using the FACIT-Fatigue cutoff ≤30, 0.910 for sleep disturbance, and 0.922 for sleep impairment using the ISI cutoff ≥15. The recommended T-score cut-off for PROMIS CAT Fatigue was 57, Sleep Disturbance was 57, and Sleep Impairment was 57. The current results support the feasibility and accuracy of PROMIS CAT and its potential for use in routine ambulatory cancer care. Future research will assess feedback of these data to clinicians and evaluate effects on earlier identification of and intervention for these problems. Cancer 2016;122:2906-2917. © 2016 American Cancer Society.
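
    As a rough illustration of the ROC analysis described above, the area under the curve can be computed as the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The sketch below is not the study's analysis: the function name, the scores, and the case labels are all invented for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                wins += 1.0
            elif case_score == other_score:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical screening T-scores and legacy-measure classifications (1 = case).
example_scores = [62, 58, 55, 59, 60, 45]
example_labels = [1, 1, 0, 0, 1, 0]
auc = roc_auc(example_scores, example_labels)
```

    In this made-up example eight of the nine case/non-case pairs are ordered correctly, so the AUC is 8/9.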

  9. Measuring children's self-reported sport participation, risk perception and injury history: development and validation of a survey instrument.

    Science.gov (United States)

    Siesmaa, Emma J; Blitvich, Jennifer D; White, Peta E; Finch, Caroline F

    2011-01-01

    Despite the health benefits associated with children's sport participation, the occurrence of injury in this context is common. The extent to which sport injuries impact children's ongoing involvement in sport is largely unknown. Surveys have been shown to be useful for collecting children's injury and sport participation data; however, there are currently no published instruments which investigate the impact of injury on children's sport participation. This study describes the processes undertaken to assess the validity of two survey instruments for collecting self-reported information about child cricket and netball related participation, injury history and injury risk perceptions, as well as the reliability of the cricket-specific version. Face and content validity were assessed through expert feedback from primary and secondary level teachers and from representatives of peak sporting bodies for cricket and netball. Test-retest reliability was measured using a sample of 59 child cricketers who completed the survey on two occasions, 3-4 weeks apart. Based on expert feedback relating to face and content validity, modification and/or deletion of some survey items was undertaken. Survey items with low test-retest reliability (κ≤0.40) were modified or deleted, items with moderate reliability (κ=0.41-0.60) were modified slightly and items with higher reliability (κ≥0.61) were retained, with some undergoing minor modifications. This is the first survey of its kind which has been successfully administered to cricketers aged 10-16 years to collect information about injury risk perceptions and intentions for continued sport participation. Implications for its generalisation to other child sport participants are discussed.
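
    The test-retest rule described above (delete or modify at κ≤0.40, modify slightly at κ=0.41-0.60, retain at κ≥0.61) can be sketched with Cohen's kappa for a single categorical item. This is a minimal illustration under stated assumptions: the responses are invented, the function names are hypothetical, and values falling between the published bands (for example 0.405) are assigned to the middle band here by assumption.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two administrations of one categorical item."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    first_counts = Counter(first)
    second_counts = Counter(second)
    # Chance agreement from the marginal category frequencies.
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    # Note: undefined (division by zero) when chance agreement equals 1.
    return (observed - expected) / (1.0 - expected)


def action_for(kappa):
    """Map kappa to the decision bands quoted in the abstract (gaps assumed)."""
    if kappa <= 0.40:
        return "modify or delete"
    if kappa <= 0.60:
        return "modify slightly"
    return "retain"


# Hypothetical yes/no answers from ten children on two occasions.
occasion_1 = ["y", "y", "n", "n", "y", "n", "y", "y", "n", "n"]
occasion_2 = ["y", "y", "n", "n", "y", "n", "y", "n", "n", "n"]
kappa = cohens_kappa(occasion_1, occasion_2)
decision = action_for(kappa)
```

    Here nine of ten answers agree and chance agreement is 0.5, so kappa is 0.8 and the item would be retained.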

  10. Identifying the ‘red flags’ for unhealthy weight control among adolescents: Findings from an item response theory analysis of a national survey

    Directory of Open Access Journals (Sweden)

    Utter Jennifer

    2012-08-01

    Full Text Available Abstract Background Weight control behaviors are common among young people and are associated with poor health outcomes. Yet clinicians rarely ask young people about their weight control; this may be due to uncertainty about which questions to ask, specifically around whether certain weight loss strategies are healthier or unhealthy or about what weight loss behaviors are more likely to lead to adverse outcomes. Thus, the aims of the current study are: to confirm, using item response theory analysis, that the underlying latent constructs of healthy and unhealthy weight control exist; to determine the ‘red flag’ weight loss behaviors that may discriminate unhealthy from healthy weight loss; to determine the relationships between healthy and unhealthy weight loss and mental health; and to examine how weight control may vary among demographic groups. Methods Data were collected as part of a national health and wellbeing survey of secondary school students in New Zealand (n = 9,107) in 2007. Item response theory analyses were conducted to determine the underlying constructs of weight control behaviors and the behaviors that discriminate unhealthy from healthy weight control. Results The current study confirms that there are two underlying constructs of weight loss behaviors which can be described as healthy and unhealthy weight control. Unhealthy weight control was positively correlated with depressive mood. Fasting and skipping meals for weight loss had the lowest item thresholds on the unhealthy weight control continuum, indicating that they act as ‘red flags’ and warrant further discussion in routine clinical assessments. Conclusions Routine assessments of weight control strategies by clinicians are warranted, particularly for screening for meal skipping and fasting for weight loss as these behaviors appear to ‘flag’ behaviors that are associated with poor mental wellbeing.

  11. Using the Self-Directed Search in Research: Selecting a Representative Pool of Items to Measure Vocational Interests

    Science.gov (United States)

    Poitras, Sarah-Caroline; Guay, Frederic; Ratelle, Catherine F.

    2012-01-01

    Using Item Response Theory (IRT) and Confirmatory Factor Analysis (CFA), the goal of this study was to select a reduced pool of items from the French Canadian version of the Self-Directed Search--Activities Section (Holland, Fritzsche, & Powell, 1994). Two studies were conducted. Results of Study 1, involving 727 French Canadian students,…

  13. Immediate List Recall as a Measure of Short-Term Episodic Memory: Insights from the Serial Position Effect and Item Response Theory

    Science.gov (United States)

    Gavett, Brandon E.; Horwitz, Julie E.

    2012-01-01

    The serial position effect shows that two interrelated cognitive processes underlie immediate recall of a supraspan word list. The current study used item response theory (IRT) methods to determine whether the serial position effect poses a threat to the construct validity of immediate list recall as a measure of verbal episodic memory. Archival data were obtained from a national sample of 4,212 volunteers aged 28–84 in the Midlife Development in the United States study. Telephone assessment yielded item-level data for a single immediate recall trial of the Rey Auditory Verbal Learning Test (RAVLT). Two parameter logistic IRT procedures were used to estimate item parameters and the Q1 statistic was used to evaluate item fit. A two-dimensional model better fit the data than a unidimensional model, supporting the notion that list recall is influenced by two underlying cognitive processes. IRT analyses revealed that 4 of the 15 RAVLT items (1, 12, 14, and 15) were misfit (p < .05). Item characteristic curves for items 14 and 15 decreased monotonically, implying an inverse relationship between the ability level and the probability of recall. Elimination of the four misfit items provided better fit to the data and met necessary IRT assumptions. Performance on a supraspan list learning test is influenced by multiple cognitive abilities; failure to account for the serial position of words decreases the construct validity of the test as a measure of episodic memory and may provide misleading results. IRT methods can ameliorate these problems and improve construct validity. PMID:22138320
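
    To make the remark about decreasing item characteristic curves concrete, here is a minimal sketch of the standard two-parameter logistic (2PL) response function, P(θ) = 1 / (1 + exp(-a(θ - b))), where a is the discrimination and b the difficulty. The parameter values below are invented for illustration and are not the estimates from the study; the point is only that a negative discrimination produces a curve that falls as ability rises.

```python
import math


def icc_2pl(theta, discrimination, difficulty):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


ability_levels = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical well-behaved item: recall becomes more likely as ability rises.
typical_item = [icc_2pl(t, 1.2, 0.0) for t in ability_levels]

# Hypothetical item with negative discrimination: recall becomes less likely
# as ability rises, the pattern the abstract reports for items 14 and 15.
reversed_item = [icc_2pl(t, -0.8, 0.0) for t in ability_levels]
```

    Comparing the two lists shows the first rising and the second falling across the same ability levels, which is why such items conflict with the usual IRT assumption of monotonically increasing curves.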

  14. Making sense of sexual orientation measures: findings from a cognitive processing study with adolescents on health survey questions.

    Science.gov (United States)

    Austin, S Bryn; Conron, Kerith; Patel, Aarti; Freedner, Naomi

    2007-01-01

    To carry out a study using cognitive processing interview methods to explore ways in which adolescents understand sexual orientation questions currently used on epidemiologic surveys. In-depth, individual interviews were conducted to probe cognitive processes involved in answering four self-report survey questions assessing sexual identity, sexual attraction, and sex of sexual partners. A semi-structured interview guide was used to explore variation in question interpretation, information retrieval patterns and problems, item clarity, valence of reactions to items (positive, negative, neutral), respondent burden, and perceived threat associated with the measures. Thirty adolescents aged 15 to 21 of diverse sexual orientations and race/ethnicities participated in the study, including female, male, and transgender youth. A question on sexual attraction was the most consistently understood and thus was easy for nearly all youth to answer. In contrast, a measure of sexual identity with options heterosexual, bisexual, gay/lesbian, and unsure was the most difficult to answer. Most preferred a sexual identity item that also provided the intermediate options mostly heterosexual and mostly homosexual, which many said reflected their experience of feeling between categories. Participants had varying and inconsistent interpretations of sexual behavior terms, such as sex and sexual intercourse, used in assessing the sex of sexual partners. Differences in understanding could affect interpretation of survey data in important ways. Development of valid measures of sexual orientation will be essential to better monitor health disparities.

  15. Evaluation of a survey tool to measure safety climate in Australian hospital pharmacy staff.

    Science.gov (United States)

    Walpola, Ramesh L; Chen, Timothy F; Fois, Romano A; Ashcroft, Darren M; Lalor, Daniel J

    Safety climate evaluation is increasingly used by hospitals as part of quality improvement initiatives. Consequently, it is necessary to have validated tools to measure changes. To evaluate the construct validity and internal consistency of a survey tool to measure Australian hospital pharmacy patient safety climate. A 42 item cross-sectional survey was used to evaluate the patient safety climate of 607 Australian hospital pharmacy staff. Survey responses were initially mapped to the factor structure previously identified in European community pharmacy. However, as the data did not adequately fit the community pharmacy model, participants were randomly split into two groups with exploratory factor analysis performed on the first group (n = 302) and confirmatory factor analyses performed on the second group (n = 305). Following exploratory factor analysis (59.3% variance explained) and confirmatory factor analysis, a 6-factor model containing 28 items was obtained with satisfactory model fit (χ²(335) = 664.61 p  0.643) and model nesting between the groups (Δχ²(22) = 30.87, p = 0.10). Three factors (blame culture, organisational learning and working conditions) were similar to those identified in European community pharmacy and labelled identically. Three additional factors (preoccupation with improvement; comfort to question authority; and safety issues being swept under the carpet) highlight hierarchical issues present in hospital settings. This study has demonstrated the validity of a survey to evaluate patient safety climate of Australian hospital pharmacy staff. Importantly, this validated factor structure may be used to evaluate changes in safety climate over time. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. MEASUREMENT OF FRICTIONAL PRESSURE DIFFERENTIALS DURING A VENTILATION SURVEY

    Energy Technology Data Exchange (ETDEWEB)

    B.S. Prosser, PE; I.M. Loomis, PE, PhD

    2003-11-03

    During the course of a ventilation survey, both airflow quantity and frictional pressure losses are measured and quantified. The measurement of airflow has been extensively studied as the vast majority of ventilation standards/regulations are tied to airflow quantity or velocity. However, during the conduct of a ventilation survey, measurement of airflow represents only half of the parameters required to directly calculate the airway resistance. The measurement of frictional pressure loss is an often misunderstood and misapplied part of the ventilation survey. This paper compares the two basic methods of frictional pressure drop measurement: the barometer and the gauge and tube. Personal experiences with each method will be detailed along with the authors' opinions regarding the applicability and conditions favoring each method.
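
    The abstract does not state which relation is used to turn the two measurements into a resistance. A minimal sketch, assuming the square law commonly used in mine ventilation (frictional pressure drop p = R·Q², with p in Pa and Q in m³/s), is given below; the function name and the example readings are hypothetical.

```python
def airway_resistance(pressure_drop_pa, airflow_m3_per_s):
    """Resistance R (N*s^2/m^8) from the assumed square law p = R * Q**2."""
    if airflow_m3_per_s <= 0:
        raise ValueError("airflow must be positive")
    return pressure_drop_pa / airflow_m3_per_s ** 2


# Hypothetical survey readings: 50 Pa frictional loss at 10 m^3/s.
resistance = airway_resistance(50.0, 10.0)
```

    Because the airflow enters squared, a given relative error in the airflow reading affects the computed resistance roughly twice as strongly as the same relative error in the pressure reading.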

  17. Working with Missing Data: Imputation of Nonresponse Items in Categorical Survey Data with a Non-Monotone Missing Pattern

    Directory of Open Access Journals (Sweden)

    Machelle D. Wilson

    2014-01-01

    Full Text Available The imputation of missing data is often a crucial step in the analysis of survey data. This study reviews typical problems with missing data and discusses a method for the imputation of missing survey data with a large number of categorical variables which do not have a monotone missing pattern. We develop a method for constructing a monotone missing pattern that allows for imputation of categorical data in data sets with a large number of variables using a model-based MCMC approach. We report the results of imputing the missing data from a case study, using educational, sociopsychological, and socioeconomic data from the National Latino and Asian American Study (NLAAS. We report the results of multiply imputed data on a substantive logistic regression analysis predicting socioeconomic success from several educational, sociopsychological, and familial variables. We compare the results of conducting inference using a single imputed data set to those using a combined test over several imputations. Findings indicate that, for all variables in the model, all of the single tests were consistent with the combined test.

  18. Development of a scale to measure the entrepreneurial potential using the Item Response Theory (IRT) [Desenvolvimento de uma escala para medir o potencial empreendedor utilizando a Teoria da Resposta ao Item (TRI)]

    Directory of Open Access Journals (Sweden)

    Luciano Ricardo Rath Alves

    2011-01-01

    Full Text Available Several variables are related to the development of entrepreneurial activities. An important one among them is the entrepreneurial agent. This study is one of many that contribute to the understanding of the entrepreneurial agent; it follows the line of thought holding that entrepreneurs have characteristics and personality traits that distinguish them from the general population and that favor entrepreneurial success. This study aims to develop a scale for measuring entrepreneurial potential using Item Response Theory (IRT). The two-parameter logistic IRT model was used. Parameter estimates were obtained from a sample of 764 people who responded to an instrument composed of 103 items. The test information and standard error curves, together with a qualitative interpretation of the scale levels, made it possible to determine the most appropriate range for using the instrument. The results showed that the scale is best suited to assessing individuals with low to moderately high entrepreneurial potential. It is therefore suggested that new items be added to the instrument to measure and interpret even higher levels. Item Response Theory allows new items to be calibrated to measure entrepreneurs with high entrepreneurial potential while making use of the data already obtained. The items were generated by Santos (2008 based on a theoretical model

  19. A Survey of Binary Similarity and Distance Measures

    Directory of Open Access Journals (Sweden)

    Seung-Seok Choi

    2010-02-01

    Full Text Available The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering and classification. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Notwithstanding, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and revealed their correlations through the hierarchical clustering technique.
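
    As a small illustration of the kind of measure surveyed above, the Jaccard similarity of two binary feature vectors is the number of positions where both are 1 divided by the number of positions where at least one is 1. The sketch below is illustrative only: the function names and vectors are made up, and returning 1.0 when neither vector has any 1s is a convention assumed here, since the ratio is undefined in that case.

```python
def jaccard_similarity(x, y):
    """Jaccard similarity of two equal-length 0/1 vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    if either == 0:
        return 1.0  # assumed convention: two all-zero vectors count as identical
    return both / either


def jaccard_distance(x, y):
    """Distance counterpart: one minus the similarity."""
    return 1.0 - jaccard_similarity(x, y)


# Hypothetical presence/absence vectors for two patterns.
pattern_a = [1, 1, 0, 1, 0]
pattern_b = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(pattern_a, pattern_b)
distance = jaccard_distance(pattern_a, pattern_b)
```

    The two example vectors share two features out of four present in either, giving a similarity of 0.5. Note that joint absences (positions where both are 0) are ignored, which is one of the design choices that distinguishes Jaccard from other binary measures.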

  20. Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations

    Science.gov (United States)

    Bauer, Greta R.; Braimoh, Jessica; Scheim, Ayden I.; Dharma, Christoffer

    2017-01-01

    Given that an estimated 0.6% of the U.S. population is transgender (trans) and that large health disparities for this population have been documented, government and research organizations are increasingly expanding measures of sex/gender to be trans inclusive. Options suggested for trans community surveys, such as expansive check-all-that-apply gender identity lists and write-in options that offer maximum flexibility, are generally not appropriate for broad population surveys. These require limited questions and a small number of categories for analysis. Limited evaluation has been undertaken of trans-inclusive population survey measures for sex/gender, including those currently in use. Using an internet survey and follow-up of 311 participants, and cognitive interviews from a maximum-diversity sub-sample (n = 79), we conducted a mixed-methods evaluation of two existing measures: a two-step question developed in the United States and a multidimensional measure developed in Canada. We found very low levels of item missingness, and no indicators of confusion on the part of cisgender (non-trans) participants for both measures. However, a majority of interview participants indicated problems with each question item set. Agreement between the two measures in assessment of gender identity was very high (κ = 0.9081), but gender identity was a poor proxy for other dimensions of sex or gender among trans participants. Issues to inform measure development or adaptation that emerged from analysis included dimensions of sex/gender measured, whether non-binary identities were trans, Indigenous and cultural identities, proxy reporting, temporality concerns, and the inability of a single item to provide a valid measure of sex/gender. Based on this evaluation, we recommend that population surveys meant for multi-purpose analysis consider a new Multidimensional Sex/Gender Measure for testing that includes three simple items (one asked only of a small sub-group) to assess gender

  2. Improving a measure of mobility-related fatigue (the mobility-tiredness scale) by establishing item intensity

    DEFF Research Database (Denmark)

    Fieo, Robert A; Mortensen, Erik L; Rantanen, Taina;

    2013-01-01

    To improve the construct validity of self-reported fatigue by establishing a formal hierarchy of scale items and to determine whether such a hierarchy could be maintained across time (aged 75-80), sex, and nationality....

  3. The Servant Leadership Survey: Development and Validation of a Multidimensional Measure.

    Science.gov (United States)

    van Dierendonck, Dirk; Nuijten, Inge

    2011-09-01

    PURPOSE: The purpose of this paper is to describe the development and validation of a multi-dimensional instrument to measure servant leadership. DESIGN/METHODOLOGY/APPROACH: Based on an extensive literature review and expert judgment, 99 items were formulated. In three steps, using eight samples totaling 1571 persons from The Netherlands and the UK with a diverse occupational background, a combined exploratory and confirmatory factor analysis approach was used. This was followed by an analysis of the criterion-related validity. FINDINGS: The final result is an eight-dimensional measure of 30 items, the eight dimensions being standing back, forgiveness, courage, empowerment, accountability, authenticity, humility, and stewardship. The internal consistency of the subscales is good. The results show that the Servant Leadership Survey (SLS) has convergent validity with other leadership measures, and also adds unique elements to the leadership field. Evidence for criterion-related validity came from studies relating the eight dimensions to well-being and performance. IMPLICATIONS: With this survey, a valid and reliable instrument to measure the essential elements of servant leadership has been introduced. ORIGINALITY/VALUE: The SLS is the first measure where the underlying factor structure was developed and confirmed across several field studies in two countries. It can be used in future studies to test the underlying premises of servant leadership theory. The SLS provides a clear picture of the key servant leadership qualities and shows where improvements can be made on the individual and organizational level; as such, it may also offer a valuable starting point for training and leadership development.

  4. Measuring socio-economic data in tuberculosis prevalence surveys.

    Science.gov (United States)

    van Leth, F; Guilatco, R S; Hossain, S; Van't Hoog, A H; Hoa, N B; van der Werf, M J; Lönnroth, K

    2011-06-01

    Addressing social determinants in the field of tuberculosis (TB) has received great attention in the past years, mainly due to the fact that worldwide TB incidence has not declined as much as expected, despite highly curative control strategies. One of the objectives of the World Health Organization Global Task Force on TB Impact Measurement is to assess the prevalence of TB disease in 22 high-burden countries by active screening of a random sample of the general population. These surveys provide a unique opportunity to assess socio-economic determinants in relation to prevalent TB and its risk factors. This article describes methods of measuring the socio-economic position in the context of a TB prevalence survey. An indirect measurement using an assets score is the most feasible way of doing this. Several examples are given from recently conducted prevalence surveys of the use of an assets score, its construction, and the analyses of the obtained data.

  5. Scoring Subjectivity and Item Performance on Measures Used to Assess Violence Risk: The PCL-R and HCR-20 as Exemplars

    Science.gov (United States)

    Rufino, Katrina A.; Boccaccini, Marcus T.; Guy, Laura S.

    2011-01-01

    Although reliability is essential to validity, most research on violence risk assessment tools has paid little attention to strategies for improving rater agreement. The authors evaluated the degree to which perceived subjectivity in scoring guidelines for items from two measures--the Psychopathy Checklist-Revised (PCL-R) and the Historical,…

  6. Measuring Nonresponse Bias in a Cross-Country Enterprise Survey

    Directory of Open Access Journals (Sweden)

    Katarzyna Bańkowska

    2015-04-01

    Full Text Available Nonresponse is a common issue affecting the vast majority of surveys. Efforts to convince those unwilling to participate in a survey might not necessarily result in a better picture of the target population and can lead to higher, not lower, nonresponse bias. We investigate the impact of nonresponse in the European Commission & European Central Bank Survey on the Access to Finance of Enterprises (SAFE), which collects evidence on the financing conditions faced by European SMEs compared with those of large firms. This survey, conducted by telephone bi-annually since 2009 by the ECB and the European Commission, provides a valuable means to search for this kind of bias, given the high heterogeneity of response propensities across countries. The study relies on so-called “Representativity Indicators” developed within the Representativity Indicators of Survey Quality (RISQ) project, which measure the distance to a fully representative response. On this basis, we examine the quality of the SAFE Survey at different stages of the fieldwork as well as across different survey waves and countries. The RISQ methodology relies on rich sampling frame information, which is, however, partly limited in the case of the SAFE. We also assess the representativeness of the particular SAFE subsample created by linking the survey responses with the companies’ financial information from a business register; this sub-sampling is another potential source of bias which we also attempt to quantify. Finally, we suggest possible ways to improve monitoring of nonresponse bias in future rounds of the survey.
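
    As a rough illustration of the "distance to a fully representative response" idea in the record above, the R-indicator is commonly written as R = 1 - 2 S(ρ), where S(ρ) is the standard deviation of the estimated response propensities. The sketch below is a simplified, unweighted version under stated assumptions: the propensities are taken as already estimated, the population standard deviation is used, and the function name and example values are hypothetical rather than taken from the SAFE study.

```python
from statistics import pstdev


def r_indicator(propensities):
    """Simplified R-indicator: 1 - 2 * SD of the response propensities.

    Assumption: unweighted propensities, population standard deviation.
    A value of 1.0 means every unit has the same response propensity.
    """
    if not propensities:
        raise ValueError("need at least one propensity")
    if any(p < 0.0 or p > 1.0 for p in propensities):
        raise ValueError("propensities must lie in [0, 1]")
    return 1.0 - 2.0 * pstdev(propensities)


# Hypothetical estimated propensities for six sampled firms.
example_propensities = [0.40, 0.45, 0.50, 0.55, 0.60, 0.50]
print(round(r_indicator(example_propensities), 3))
```

    In this simplified form, equal propensities give R = 1, and the more the propensities vary across sampled units, the lower the value falls.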

  7. Working meeting on blood pressure measurement: suggestions for measuring blood pressure to use in populations surveys.

    Science.gov (United States)

    2003-11-01

    As part of the Pan American Hypertension Initiative (PAHI), the Pan American Health Organization and the National Heart, Lung, and Blood Institute of the National Institutes of Health of the United States of America conducted a working meeting to discuss blood pressure (BP) measurement methods used in various hypertension prevalence surveys and clinical trials, with the objective of developing a BP measurement protocol for use in hypertension prevalence surveys in the Americas. No such common protocol has existed in the Americas, so it has been difficult to compare hypertension prevention and intervention strategies. This piece describes a proposed standard method for measuring blood pressure for use in population surveys in the Region of the Americas. The piece covers: considerations for developing a common blood pressure measurement protocol, critical issues in measuring blood pressure in national surveys, minimum procedures for blood pressure measurement during surveillance, and quality assessment of blood pressure.

  8. A Method for Individualizing the Prediction of Immunogenicity of Protein Vaccines and Biologic Therapeutics: Individualized T Cell Epitope Measure (iTEM)

    Directory of Open Access Journals (Sweden)

    Tobias Cohen

    2010-01-01

    Full Text Available The promise of pharmacogenomics depends on advancing predictive medicine. To address this need in the area of immunology, we developed the individualized T cell epitope measure (iTEM) tool to estimate an individual's T cell response to a protein antigen based on HLA binding predictions. In this study, we validated prospective iTEM predictions using data from in vitro and in vivo studies. We used a mathematical formula that converts DRB1∗ allele binding predictions generated by EpiMatrix, an epitope-mapping tool, into an allele-specific scoring system. We then demonstrated that iTEM can be used to define an HLA binding threshold above which immune response is likely and below which immune response is likely to be absent. iTEM's predictive power was strongest when the immune response is focused, such as in subunit vaccination and administration of protein therapeutics. iTEM may be a useful tool for clinical trial design and preclinical evaluation of vaccines and protein therapeutics.

  9. Invariance Testing of the SF-36 Health Survey in Women Breast Cancer Survivors: Do Personal and Cancer-Related Variables Influence the Meaning of Quality of Life Items?

    Science.gov (United States)

    Mosewich, Amber D.; Hadd, Valerie; Crocker, Peter R. E.; Zumbo, Bruno D.

    2013-01-01

    Quality of life (QoL) is affected by issues specific to illness trajectory and thus, may differ, and potentially take on different meanings, at different stages in the cancer process. A widely used measure of QoL is the SF-36 Health Survey (SF-36; Ware 1993); therefore, support for its appropriateness in a given population is imperative. The…

  11. [Portuguese-language cultural adaptation of the Items Banks of Anxiety and Depression of the Patient-Reported Outcomes Measurement Information System (PROMIS)].

    Science.gov (United States)

    Castro, Natália Fontes Caputo de; Rezende, Carlos Henrique Alves de; Mendonça, Tânia Maria da Silva; Silva, Carlos Henrique Martins da; Pinto, Rogério de Melo Costa

    2014-04-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS), structured in Item Banks, provides a new tool for evaluating outcomes that apply to various chronic diseases through advanced statistical techniques (item response theory, IRT) and computerized adaptive testing (CAT). The aim of this study was to culturally adapt the Item Banks of Anxiety and Depression of PROMIS to the Portuguese language. The process followed the PROMIS recommendations: forward translation, reconciliation, back-translation, FACIT review, independent review, finalization, pre-test, and incorporation of the results from the pre-test. The translated version was pre-tested in ten patients, and items 3, 46, and 53 of the Anxiety bank and item 46 of the Depression bank had to be changed. Changes affected equivalence of meaning, and the final version was consistent with the Brazilian population's linguistic and cultural skills. In conclusion, for the Brazilian population the translated version proved semantically and conceptually equivalent to the original.

  12. A new measure of patient satisfaction with ocular hypotensive medications: The Treatment Satisfaction Survey for Intraocular Pressure (TSS-IOP)

    Directory of Open Access Journals (Sweden)

    Stewart Jeanette A

    2003-11-01

    Full Text Available Abstract Purpose: To validate the treatment-specific Treatment Satisfaction Survey for Intraocular Pressure (TSS-IOP). Methods: Item content was developed by 4 heterogeneous patient focus groups (n = 32). Instrument validation involved 250 patients on ocular hypotensive medications recruited from ophthalmology practices in the Southern USA. Participants responded to demographic and test questions during a clinic visit. Standard psychometric analyses were performed on the resulting data. Sample: Of the 412 patients screened, 253 consented to participate, and 250 provided complete datasets. The sample included 44% male (n = 109), 44% Black (n = 109), and 57% brown-eyed (n = 142) participants, with a mean age of 64.6 years (SD 13.1) and a history of elevated IOP for an average of 8.4 years (SD 7.8). A majority was receiving monotherapy (60%, n = 151). Results: A principal components (PC) factor analysis (with varimax rotation) of the 31 items yielded 5 factors (eigenvalues > 1.0) explaining 70% of the total variance. Weaker and conceptually redundant items were removed and the remaining 15 items reanalyzed. The satisfaction factors were: Eye Irritation (EI; 4 items), Convenience of Use (CofU; 3 items), Ease of Use (EofU; 3 items), Hyperemia (HYP; 3 items), and Medication Effectiveness (EFF; 2 items). Cronbach's alphas ranged from .80 to .86. Greater distributional skew was found for less common experiences (i.e., HYP & EI, with 65% & 48.4% ceilings) than for more common experiences (i.e., EofU, CofU, EFF, with 10.8%, 20.8% & 15.9% ceilings). TSS-IOP scales converged with conceptually related scales on a previously validated measure of treatment satisfaction, the TSQM (r = .36 to .77). Evidence of concurrent criterion-related validity was found. Patients' symptomatic ratings of eye irritation, hyperemia and difficulties using the medication correlated with satisfaction on these dimensions (r = .30-.56, all p [...]). Conclusions: This study provides initial evidence that the TSS-IOP is a reliable and valid

  13. Cultural Resources Intensive Survey and Testing of Mississippi River Levee Berms, Crittenden and Desha Counties, Arkansas and Mississippi, Scott, Cape Girardeau and Pemiscot Counties, Missouri Item R-618 Knowlton; Desha County, Arkansas.

    Science.gov (United States)

    1983-11-01

    distribution of cultural resources within the project area. In addition, information obtained in the background and literature search should be of such scope...DAC0W66-83-C-0030, Item R-618, to conduct a background, archival and literature search, and an intensive resources survey of the project area of proposed...seepage through the levee during periods of flooding. The area surveyed included: 152.4 meters (500 feet) right-of-way perpendicular and landside from

  14. Validation of the depression item bank from the Patient-Reported Outcomes Measurement Information System (PROMIS) in a three-month observational study.

    Science.gov (United States)

    Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Maihoefer, Catherine C; Lawrence, Suzanne M

    2014-09-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is an NIH Roadmap initiative devoted to developing better measurement tools for assessing constructs relevant to the clinical investigation and treatment of all diseases: constructs such as pain, fatigue, emotional distress, sleep, physical functioning, and social participation. Following creation of item banks for these constructs, our priority has been to validate them, most often in short-term observational studies. We report here on a three-month prospective observational study with depressed outpatients in the early stages of a new treatment episode (with assessments at intake, one-month follow-up, and three-month follow-up). The protocol was designed to compare the psychometric properties of the PROMIS depression item bank (administered as a computerized adaptive test, CAT) with two legacy self-report instruments: the Center for Epidemiological Studies Depression scale (CESD; Radloff, 1977) and the Patient Health Questionnaire (PHQ-9; Spitzer et al., 1999). PROMIS depression demonstrated strong convergent validity with the CESD and the PHQ-9 (with correlations in a range from .72 to .84 across all time points), as well as responsiveness to change when characterizing symptom severity in a clinical outpatient sample. Identification of patients as "recovered" varied across the measures, with the PHQ-9 being the most conservative. The use of calibrations based on models from item response theory (IRT) provides advantages for PROMIS depression both psychometrically (creating the possibility of adaptive testing, providing a broader effective range of measurement, and generating greater precision) and practically (these psychometric advantages can be achieved with fewer items, a median of 4 items administered by CAT, resulting in less patient burden).

  15. Measuring Teaching Best Practice in the Induction Years: Development and Validation of an Item-Level Assessment

    Science.gov (United States)

    Kingsley, Laurie; Romine, William

    2014-01-01

    Schools and teacher induction programs around the world routinely assess teaching best practice to inform accreditation, tenure/promotion, and professional development decisions. Routine assessment is also necessary to ensure that teachers entering the profession get the assistance they need to develop and succeed. We introduce the Item-Level…

  16. Creating Vocabulary Item Types That Measure Students' Depth of Semantic Knowledge. Research Report. ETS RR-14-02

    Science.gov (United States)

    Deane, Paul; Lawless, René R.; Li, Chen; Sabatini, John; Bejar, Isaac I.; O'Reilly, Tenaha

    2014-01-01

    We expect that word knowledge accumulates gradually. This article draws on earlier approaches to assessing depth, but focuses on one dimension: richness of semantic knowledge. We present results from a study in which three distinct item types were developed at three levels of depth: knowledge of common usage patterns, knowledge of broad topical…

  17. Measuring social exclusion in routine public health surveys: construction of a multidimensional instrument.

    Directory of Open Access Journals (Sweden)

    Addi P L van Bergen

    Full Text Available INTRODUCTION: Social exclusion is considered a major factor in the causation and maintenance of health inequalities, but its measurement in health research is still in its infancy. In the Netherlands the Institute for Social Research (SCP) developed an instrument to measure the multidimensional concept of social exclusion in social and economic policy research. Here, we present a method to construct a similar measure of social exclusion using available data from public health surveys. METHODS: Analyses were performed on data from the health questionnaires that were completed by 20,877 adults in the four largest cities in the Netherlands. From each of the four questionnaires we selected the items that corresponded to those of the SCP-instrument. These were entered into a nonlinear canonical correlation analysis. The measurement properties of the resulting indices and dimension scales were assessed and compared to the SCP-instrument. RESULTS: The internal consistency of the indices and most of the dimension scales was adequate and the internal structure of the indices was as expected. Both generalisability and construct validity were good: in all datasets strong associations were found between the index and a number of known risk factors of social exclusion. A limitation of content validity was that the dimension "lack of normative integration" could not be measured, because no relevant items were available. CONCLUSIONS: Our findings indicate that a measure for social exclusion can be constructed with available health questionnaires. This provides opportunities for application in public health surveillance systems in the Netherlands and elsewhere in the world.

  18. Rasch Measurement Analysis of a 25-Item Version of the Mueller/McCloskey Nurse Job Satisfaction Scale in a Sample of Nurses in Lebanon and Qatar

    Directory of Open Access Journals (Sweden)

    Michael Clinton

    2015-06-01

    Full Text Available The Mueller/McCloskey Nurse Job Satisfaction Scale (MMSS) is widely used, but its psychometric characteristics have not been sufficiently validated for use in Middle Eastern countries. The objective of our methodological study was to determine the psychometric suitability of a 25-item version of the MMSS (MMSS-25) for use in middle-income and high-income Middle Eastern countries. A total of 1,322 registered nurses, 859 in Lebanon and 463 in Qatar, completed the MMSS-25 as part of a cross-sectional multinational investigation of nursing shortages in the region. We used the Rasch rating scale model to investigate the psychometric performance of the MMSS-25. We identified possible item bias among MMSS-25 items. We conducted confirmatory factor analyses (CFA) to compare the fit to our data of five factor structures reported in the literature. We concluded that irrespective of administration in English or Arabic, the MMSS-25 is not sufficiently productive of measurement for use in the region. A core set of 13 items (MMSS-13, Cronbach’s α = .82) loading on five dimensions eliminates redundant MMSS items and is suitable for initial screening of nurses’ satisfaction. Of the five factor structures we examined, the MMSS-13 was the only close fit to our data (comparative fit index = 0.951; Tucker–Lewis index = 0.931; root mean square error of approximation = 0.051; p value = .401). The MMSS-13 has psychometric characteristics superior to MMSS-25, but additional items are required to meet the research-specific objectives of future studies of nurses’ job satisfaction in Middle Eastern countries.

  19. Measuring determinants of career satisfaction of anesthesiologists: validation of a survey instrument.

    Science.gov (United States)

    Afonso, Anoushka M; Diaz, James H; Scher, Corey S; Beyl, Robbie A; Nair, Singh R; Kaye, Alan David

    2013-06-01

    To measure the parameter of job satisfaction among anesthesiologists. Survey instrument. Academic anesthesiology departments in the United States. 320 anesthesiologists who attended the annual meeting of the ASA in 2009 (95% response rate). The anonymous 50-item survey collected information on 26 independent demographic variables and 24 dependent ranked variables of career satisfaction among practicing anesthesiologists. Mean survey scores were calculated for each demographic variable and tested for statistically significant differences by analysis of variance. Questions within each domain that were internally consistent with each other within domains were identified by Cronbach's alpha ≥ 0.7. P-values ≤ 0.05 were considered statistically significant. Cronbach's alpha analysis showed strong internal consistency for 10 dependent outcome questions in the practice factor-related domain (α = 0.72), 6 dependent outcome questions in the peer factor-related domain (α = 0.71), and 8 dependent outcome questions in the personal factor-related domain (α = 0.81). Although age was not a variable, full-time status, early satisfaction within the first 5 years of practice, working with respected peers, and personal choice factors were all significantly associated with anesthesiologist job satisfaction. Improvements in factors related to job satisfaction among anesthesiologists may lead to higher early and current career satisfaction. Copyright © 2013 Elsevier Inc. All rights reserved.
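
    The internal-consistency criterion used in the record above (Cronbach's alpha ≥ 0.7) follows the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). A minimal Python sketch of that formula is given below; the function name and the toy scores are hypothetical and are not data from the survey.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    `items` is a list of k lists; each inner list holds one item's scores,
    in the same respondent order. Sample variances (n - 1) are used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("each item needs the same number (>= 2) of scores")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sums
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings: three items answered by five respondents.
toy_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
print(round(cronbach_alpha(toy_items), 2))
```

    Items that move together across respondents push alpha toward 1, while unrelated items push it toward 0.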

  20. Assessing the internal validity of a household survey-based food security measure adapted for use in Iran

    Directory of Open Access Journals (Sweden)

    Sadeghizadeh Atefeh

    2009-06-01

    Full Text Available Abstract Background: The prevalence of food insecurity is an indicator of material well-being in an area of basic need. The U.S. Food Security Module has been adapted for use in a wide variety of cultural and linguistic settings around the world. We assessed the internal validity of the adapted U.S. Household Food Security Survey Module to measure adult and child food insecurity in Isfahan, Iran, using statistical methods based on the Rasch measurement model. Methods: The U.S. Household Food Security Survey Module was translated into Farsi and, after adaptation, administered to a representative sample. Data were provided by 2,004 randomly selected households from all sectors of the population of Isfahan, Iran, during 2005. Results: 53.1 percent reported that their food had run out at some time during the previous 12 months and they did not have money to buy more, while 26.7 percent reported that an adult had cut the size of a meal or skipped a meal because there was not enough money for food, and 7.2 percent reported that an adult did not eat for a whole day because there was not enough money for food. The severity of the items in the adult scale, estimated under Rasch-model assumptions, covered a range of 6.65 logistic units, and those in the child scale 11.68 logistic units. Most item-infit statistics were near unity, and none exceeded 1.20. Conclusion: The range of severity of items provides measurement coverage across a wide range of severity of food insecurity for both adults and children. Both scales demonstrated acceptable levels of internal validity, although several items should be improved. The similarity of the response patterns in Isfahan and the U.S. suggests that food insecurity is experienced, managed, and described similarly in the two countries.
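
    To make the "logistic units" in the record above concrete: under the dichotomous Rasch model, the probability that a household affirms an item depends only on the difference between the household's position on the latent scale and the item's severity, both in logits. The sketch below assumes that standard form; the function name and the numeric values are hypothetical and are not estimates from the Isfahan data.

```python
import math


def rasch_probability(theta, severity):
    """Probability of affirming an item under the dichotomous Rasch model.

    theta: household position on the latent scale, in logits.
    severity: item severity (difficulty), in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - severity)))


# Hypothetical household at 0.0 logits facing a mild and a severe item.
household = 0.0
for label, severity in [("mild item", -2.0), ("severe item", 3.0)]:
    p = rasch_probability(household, severity)
    print(f"{label}: P(affirm) = {p:.3f}")
```

    When the household's position equals the item's severity the probability is 0.5, which is why a wider spread of item severities covers a wider range of food insecurity.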

  1. The Children's Behavior Questionnaire very short scale: psychometric properties and development of a one-item temperament scale.

    Science.gov (United States)

    Sleddens, Ester F C; Hughes, Sheryl O; O'Connor, Teresia M; Beltran, Alicia; Baranowski, Janice C; Nicklas, Theresa A; Baranowski, Tom

    2012-02-01

    Little research has been conducted on the psychometrics of the very short scale (36 items) of the Children's Behavior Questionnaire, and no one-item temperament scale has been tested for use in applied work. In this study, 237 United States caregivers completed a survey to define their child's behavioral patterns (i.e., Surgency, Negative Affectivity, and Effortful Control) using both scales. Psychometrics of the 36-item Children's Behavior Questionnaire were examined using classical test theory, principal factor analysis, and item response modeling. Classical test theory analysis demonstrated adequate internal consistency and factor analysis confirmed a three-factor structure. Potential improvements to the measure were identified using item response modeling. A one-item (three response categories) temperament scale was validated against the three temperament factors of the 36-item scale. The temperament response categories correlated with the temperament factors of the 36-item scale, as expected. The one-item temperament scale may be applicable for clinical use.

  2. Danish translation of a physical function item bank from the Patient-Reported Outcome Measurement Information System (PROMIS)

    DEFF Research Database (Denmark)

    Schnohr, Christina W.; Rasmussen, Charlotte L.; Langberg, Henning

    2017-01-01

    of the Physical Function item bank into Danish. METHODS: We followed the PROMIS standard procedure, including: 1) two independent translations, 2) back translation, 3) independent reviews of translation quality, and 4) cognitive interviews with a representative sample of the adult population from the municipality....... Cognitive testing revealed problem of a general issue: annoyance in case of mismatch between respondents' functional level and question difficulty, problems imagining performance on activities that the respondents did not usually do, and uncertainty whether mobility aids (e.g., canes and walkers) should...... be considered when performing an activity. Solutions to the more general issues would require revisions to the original items. CONCLUSIONS: The standard translation methodology was successful in eliminating problems in translation, and pointed to problems of a general issue in some of the original questions...

  3. "Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests": Correction to Fox, Berry, and Freeman (2014).

    Science.gov (United States)

    2016-08-01

    Reports an error in "Are vocabulary tests measurement invariant between age groups? An item response analysis of three popular tests" by Mark C. Fox, Jane M. Berry and Sara P. Freeman (Psychology and Aging, 2014[Dec], Vol 29[4], 925-938). In the article, unneeded zeros were inadvertently included at the beginnings of some numbers in Tables 1–4. In addition, the right column in Table 4 includes three unnecessary zeros after asterisks. (The following abstract of the original article appeared in record 2014-49140-001.) Relatively high vocabulary scores of older adults are generally interpreted as evidence that older adults possess more of a common ability than younger adults. Yet, this interpretation rests on empirical assumptions about the uniformity of item-response functions between groups. In this article, we test item response models of differential responding against datasets containing younger-, middle-aged-, and older-adult responses to three popular vocabulary tests (the Shipley, Ekstrom, and WAIS–R) to determine whether members of different age groups who achieve the same scores have the same probability of responding in the same categories (e.g., correct vs. incorrect) under the same conditions. Contrary to the null hypothesis of measurement invariance, datasets for all three tests exhibit substantial differential responding. Members of different age groups who achieve the same overall scores exhibit differing response probabilities in relation to the same items (differential item functioning) and appear to approach the tests in qualitatively different ways that generalize across items. Specifically, younger adults are more likely than older adults to leave items unanswered for partial credit on the Ekstrom, and to produce 2-point definitions on the WAIS–R. Yet, older adults score higher than younger adults, consistent with most reports of vocabulary outcomes in the cognitive aging literature. In light of these findings, the most generalizable

  4. Measuring Substance Use and Misuse via Survey Research: Unfinished Business.

    Science.gov (United States)

    Johnson, Timothy P

    2015-01-01

    This article reviews unfinished business regarding the assessment of substance use behaviors by using survey research methodologies, a practice that dates back to the earliest years of this journal's publication. Six classes of unfinished business are considered including errors of sampling, coverage, non-response, measurement, processing, and ethics. It may be that there is more now that we do not know than when this work began some 50 years ago.

  5. Survey of spectral response measurements for photovoltaic devices

    Energy Technology Data Exchange (ETDEWEB)

    Hartman, J.S.; Lind, M.A.

    1981-11-01

    A survey of the photovoltaic community was conducted to ascertain the present state-of-the-art for PV spectral response measurements. Specific topics explored included measurement system designs, good and bad features of the systems, and problems encountered in the evaluation of specific cell structures and materials. The survey showed that most spectral response data are used in diagnostic analysis for the optimization of developmental solar cells. Measurement systems commonly utilize a chopped narrowband source in conjunction with a constant bias illumination which simulates the ambient end use environment. Researchers emphasized the importance of bias illumination for all types of cells in order to minimize the effects of nonlinearities in cell response. Not surprisingly, single crystal silicon cells present the fewest measurement problems to the researcher and have been studied more thoroughly than any other type of solar cell. However, the accurate characterization of silicon cells is still difficult, and laboratory intercomparison studies have yielded data scatter ranging from ±5% to ±15%. The measurement experience with other types of cells is less extensive. The development of reliable data bases for some solar cells is complicated by problems of cell nonuniformity, environmental instability, nonlinearity, etc. Cascade cells present new problems associated with their structure (multiple cells in series) which are just beginning to be understood. In addition, the importance of many measurement parameters (spectral content of bias light, bias light intensity, bias voltage, chopping frequency, etc.) is not fully understood for most types of solar cells.

  6. Applying OGC Standards to Develop a Land Surveying Measurement Model

    Directory of Open Access Journals (Sweden)

    Ioannis Sofos

    2017-02-01

    Full Text Available The Open Geospatial Consortium (OGC) is committed to developing quality open standards for the global geospatial community, thus enhancing the interoperability of geographic information. In the domain of sensor networks, the Sensor Web Enablement (SWE) initiative has been developed to define the necessary context by introducing modeling standards, like ‘Observation & Measurement’ (O&M), and services to provide interaction, like ‘Sensor Observation Service’ (SOS). Land surveying measurements, on the other hand, comprise a domain where observation information structures and services have not been aligned to the OGC observation model. In this paper, an OGC-compatible model for land surveying observations, aligned to the ‘Observation and Measurements’ standard, has been developed and discussed. Furthermore, a case study instantiates the above model, and an SOS implementation has been developed based on the 52° North SOS platform. Finally, a visualization schema is used to produce ‘Web Map Service’ (WMS) observation maps. Even though there are elements that differentiate this work from classic ‘O&M’ modeling cases, the proposed model and flows are developed in order to provide the benefits of standardizing land surveying measurement data (cost reduction through reusability, higher precision level, data fusion of multiple sources, raw observation spatiotemporal repository access, development of Measurement-Based GIS (MBGIS)) to the geoinformation community.

  7. Quality of life and discriminating power of two questionnaires in fibromyalgia patients: Fibromyalgia Impact Questionnaire and Medical Outcomes Study 36-Item Short-Form Health Survey

    Directory of Open Access Journals (Sweden)

    Ana Assumpção

    2010-08-01

    Full Text Available BACKGROUND: Fibromyalgia is a painful syndrome characterized by widespread chronic pain and associated symptoms with a negative impact on quality of life. OBJECTIVES: Considering the subjectivity of quality of life measurements, the aim of this study was to verify the discriminating power of two quality of life questionnaires in patients with fibromyalgia: the generic Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) and the specific Fibromyalgia Impact Questionnaire (FIQ). METHODS: A cross-sectional study was conducted on 150 participants divided into a Fibromyalgia Group (FG) and a Control Group (CG) (n=75 in each group). The participants were evaluated using the SF-36 and the FIQ. The data were analyzed by the Student t-test (α=0.05) and inferential analysis using the Receiver Operating Characteristics (ROC) Curve: sensitivity, specificity and area under the curve (AUC). The significance level was 0.05. RESULTS: The sample was similar for age (CG: 47.8±8.1; FG: 47.0±7.7 years). A significant difference was observed in quality of life assessment in all aspects of both questionnaires (p ...

  8. Can A Galaxy Redshift Survey Measure Dark Energy Clustering?

    CERN Document Server

    Takada, M

    2006-01-01

    (abridged) A wide-field galaxy redshift survey allows one to probe galaxy clustering at the largest spatial scales, which carries invaluable information on horizon-scale physics, complementary to the cosmic microwave background (CMB). Assuming a planned survey consisting of z~1 and z~3 surveys with areas of 2000 and 300 square degrees, respectively, we study the prospects for probing dark energy clustering from the measured galaxy power spectrum, assuming the dynamical properties of dark energy are specified in terms of the equation of state and the effective sound speed c_e in the context of an adiabatic cold dark matter (CDM) model. Dark energy clustering adds power to the galaxy power spectrum amplitude at spatial scales greater than the sound horizon, and the enhancement is sensitive to the redshift evolution of the net dark energy density, i.e. the equation of state. We find that the galaxy survey, when combined with Planck, can distinguish dark energy clustering from a smooth dark energy model such ...

  9. Validation of Portuguese version of Quality of Erection Questionnaire (QEQ) and comparison to International Index of Erectile Function (IIEF) and RAND 36-Item Health Survey.

    Science.gov (United States)

    Reis, Ana Luiza; Reis, Leonardo Oliveira; Saade, Ricardo Destro; Santos, Carlos Alberto; Lima, Marcelo Lopes de; Fregonesi, Adriano

    2015-01-01

    To validate the Quality of Erection Questionnaire (QEQ) considering Brazilian social-cultural aspects. To determine equivalence between the Portuguese and the English QEQ versions, the Portuguese version was back-translated by two professors who are native English speakers. After language equivalence had been determined, urologists considered the QEQ Portuguese version suitable. Men with self-reported erectile dysfunction (ED) and infertile men who had a stable sexual relationship for at least 6 months were invited to answer the QEQ, the International Index of Erectile Function (IIEF) and the RAND 36-Item Health Survey (RAND-36). The questionnaires were presented together and answered without help in a private room. Internal consistency (Cronbach's α), test-retest reliability (Spearman), convergent validity (Spearman correlation) coefficients and known-groups validity (the ability of the QEQ Portuguese version to differentiate erectile dysfunction severity groups) were assessed. We recruited 197 men (167 ED patients and 30 non-ED patients), with a mean age of 53.3 and a median of 55.5 years (23-82 years). The Portuguese version of the QEQ had high internal consistency (Cronbach α=0.93) and high stability between test and retest (ICC 0.83, 95% CI: 0.76-0.88, p ... Portuguese version presented good psychometric properties and high convergent validity in relation to the IIEF. The low correlations between the QEQ and the RAND-36, as well as between the IIEF and the RAND-36, indicated IIEF and QEQ specificity, which may have resulted from the patients' psychological adaptations that minimized the impact of ED on Quality of Life (QoL) and reestablished the feeling of well-being.

  10. Development of Survey Scales for Measuring Exposure and Behavioral Responses to Disruptive Intraoperative Behavior.

    Science.gov (United States)

    Villafranca, Alexander; Hamlin, Colin; Rodebaugh, Thomas L; Robinson, Sandra; Jacobsohn, Eric

    2017-09-10

    Disruptive intraoperative behavior has detrimental effects on clinicians, institutions, and patients. How clinicians respond to this behavior can either exacerbate or attenuate its effects. Previous investigations of disruptive behavior have used survey scales with significant limitations. The study objective was to develop appropriate scales to measure exposure and responses to disruptive behavior. We obtained ethics approval. The scales were developed in a sequence of steps. They were pretested using expert reviews, computational linguistic analysis, and cognitive interviews. The scales were then piloted on Canadian operating room clinicians. Factor analysis was applied to half of the data set for question reduction and grouping. Item response analysis and theoretical reviews ensured that important questions were not eliminated. Internal consistency was evaluated using Cronbach α. Model fit was examined on the second half of the data set using confirmatory factor analysis. Content validity of the final scales was re-evaluated. Consistency between observed relationships and theoretical predictions was assessed. Temporal stability was evaluated on a subsample of 38 respondents. A total of 1433 and 746 clinicians completed the exposure and response scales, respectively. Content validity indices were excellent (exposure = 0.96, responses = 1.0). Internal consistency was good (exposure = 0.93, responses = 0.87). Correlations between the exposure scale and secondary measures were consistent with expectations based on theory. Temporal stability was acceptable (exposure = 0.77, responses = 0.73). We have developed scales measuring exposure and responses to disruptive behavior. They generate valid and reliable scores when surveying operating room clinicians, and they overcome the limitations of previous tools. These survey scales are freely available.
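
    Several records in this listing report internal consistency as Cronbach's α. As a minimal sketch of how that coefficient is usually computed, the following uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the toy data are illustrative assumptions; none of the studies' data or code is reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative helper (not from any cited study); uses sample variances.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same k items")

    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, two items.
toy_scores = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(toy_scores)
```

    Values near 1 indicate that the items vary together; the thresholds used to call a value "good" or "high" are a matter of convention in each study.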

  11. Measuring the Impact of Active Learning in a Redesigned Large-enrollment Introductory Geoscience Survey Course

    Science.gov (United States)

    Slater, T. F.; Slater, S. J.; Lyons, D. J.; Manhart, K.; Wehunt, M.; Kapp, J.; Richardson, R. M.

    2008-12-01

    Over the past two years, faculty in the UA Geosciences engaged in a major course redesign effort with the goal of improving student learning and attitudes while, at the same time, dramatically reducing the costs of offering such a course. The course serves as an undergraduate general education science requirement. Using a reformed teaching framework of learner-centered education, the large-enrollment, introductory geosciences survey course was overhauled to include interactive lectures, just-in-time online quizzes, a next-generation textbook, and weekly discussions led by graduate students and undergraduate peer mentors. The Geosciences Concept Inventory (GCI) was given as a pre-test/post-test to students in the pre-modified course and the reformed course to compare student learning in terms of gain scores. Although all 78 items were administered, divided into three forms, only the 36 items that most directly related to course content were used for analysis. Students in the unmodified course had a pre-test GCI percentage correct of 28.67 (SD=11.45, n=96), which increased to 45.26 (SD=11.76, n=84) on the post-test. After course redesign, students had a pre-test GCI percentage correct of 38.62 (SD=8.42, n=144), which increased to a post-test GCI score of 48.73 (SD=7.49, n=132). Although the gains from pre-test to post-test are statistically significant, the difference in post-test GCI scores between the two groups is not. This is interpreted as meaning that students' knowledge levels, insofar as the GCI can measure them, were equivalent in both courses. The Likert-style Attitudes Toward Science Survey was given as an end-of-class post-test to students in the pre-modified course and as a pre-test and post-test in the reformed course to compare student attitudes between the courses. The average ranking on a 5-point scale as a pre-test was 3.454 (SD=1.259, n=508), whereas the post-test averages were 3.49 (SD=1.05, n=101) for the unmodified course and 3.50 (SD=1.127, n=369) for the reformed course.
As is
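
    The abstract above compares the two courses in terms of gain scores. The sketch below works through that arithmetic with the reported class averages. The raw gain is simply post-test minus pre-test. The normalized gain, (post − pre) / (100 − pre), is a common convention in education research and is an assumption here, not something the abstract says the authors computed.

```python
def raw_gain(pre_percent, post_percent):
    """Raw gain in percentage points (post-test minus pre-test)."""
    return post_percent - pre_percent


def normalized_gain(pre_percent, post_percent, max_percent=100.0):
    """Fraction of the available improvement that was realized.

    Assumed convention: (post - pre) / (max - pre), on class averages.
    """
    if pre_percent >= max_percent:
        raise ValueError("pre-test score leaves no room for improvement")
    return (post_percent - pre_percent) / (max_percent - pre_percent)


# Class averages quoted in the abstract (GCI percentage correct).
unmodified = {"pre": 28.67, "post": 45.26}
reformed = {"pre": 38.62, "post": 48.73}

for label, course in (("unmodified", unmodified), ("reformed", reformed)):
    gain = raw_gain(course["pre"], course["post"])
    g = normalized_gain(course["pre"], course["post"])
    print(f"{label}: raw gain = {gain:.2f} points, normalized gain = {g:.3f}")
```

    Because the two cohorts started from different pre-test averages, the raw and normalized gains are not directly comparable across the courses, which is one reason the abstract focuses on post-test scores.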

  12. Performance of the Patient-Reported Outcomes Measurement Information System 29-Item Profile in Rheumatoid Arthritis, Osteoarthritis, Fibromyalgia, and Systemic Lupus Erythematosus.

    Science.gov (United States)

    Katz, Patricia; Pedro, Sofia; Michaud, Kaleb

    2017-09-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed to improve measurement of patient-reported outcomes. We examined performance of the 29-item PROMIS Profile (PROMIS-29) in persons with rheumatoid arthritis (RA), osteoarthritis (OA), fibromyalgia (FM), and systemic lupus erythematosus (SLE). Participants in the National Data Bank for Rheumatic Diseases completed the PROMIS-29, which includes 4-item forms for 7 PROMIS domains. Scales were scored and converted to T scores. Distributions of scale scores were examined, convergent and known-groups validity was tested, and differences in scores from online versus paper questionnaires were examined. Sample sizes were 4,346 for RA, 727 for OA, 241 for FM, and 240 for SLE. Participants were predominantly female, with a mean disease duration ≥20 years, and were ages ∼60 years. Large ceiling effects occurred for some PROMIS-29 scales. Correlations of PROMIS-29 scores with scales measuring similar constructs ranged from high to moderate for RA, OA, and SLE; correlations for FM were markedly lower for some scales. Consistent patterns of worsening PROMIS-29 scores with increasing disease severity or declining health status were observed. Differences in scores obtained by online versus paper questionnaires ranged from 0.3 to 2.2 points. Results provide guarded support for using the PROMIS-29 in these conditions. The PROMIS-29 4-item static forms appear to identify differences among levels of health and to measure constructs similar to those measured by legacy questionnaires. However, large ceiling effects suggest that measurement may be more precise at the "bad" ends of the scales, which may limit responsiveness, and differences by mode of administration appear to exist. © 2016, American College of Rheumatology.

  13. Development of six PROMIS pediatrics proxy-report item banks

    Directory of Open Access Journals (Sweden)

    Irwin Debra E

    2012-02-01

    Full Text Available Abstract Background Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks. Methods The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy report survey (n = 432). In addition, caregivers completed other proxy instruments, PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, PedsQL™ Asthma Module Parent Proxy-Report version, and KIDSCREEN Parent-Proxy-52. Results Item content was well understood by proxies and did not require item revisions but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily

  14. A Survey of Advanced Microwave Frequency Measurement Techniques

    Directory of Open Access Journals (Sweden)

    Anand Swaroop Khare

    2012-06-01

    Full Text Available Microwaves are radio waves with wavelengths ranging from as long as one meter to as short as one millimeter, or equivalently, with frequencies between 300 MHz and 300 GHz. The science of photonics includes the generation, emission, modulation, signal processing, switching, transmission, amplification, detection and sensing of light. Microwave photonics has been introduced for achieving ultra-broadband signal processing. Instantaneous Frequency Measurement (IFM) receivers play an important role in electronic warfare. Technologies used for signal processing include conventional direct Radio Frequency (RF) techniques, digital techniques, intermediate frequency (IF) techniques and photonic techniques. Direct RF techniques suffer from increased loss, high dispersion, and unwanted radiation problems at high frequencies. Systems that use traditional RF techniques can be bulky and often lack the agility required to perform advanced signal processing in rapidly changing environments. In this paper we present a survey of microwave frequency measurement techniques. The techniques are categorized according to their underlying approaches. The paper reviews the major advances in microwave frequency measurement research and, using these approaches, describes the features and categories of the existing work surveyed.
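
    The equivalence between the wavelength range and the frequency range quoted above follows from f = c / λ. As a quick check, the sketch below converts both ends of the range; the function names are illustrative, and the speed of light is the exact SI value for vacuum.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact SI definition, vacuum


def frequency_hz(wavelength_m):
    """Frequency in hertz of a wave with the given free-space wavelength."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m


def wavelength_m(frequency_in_hz):
    """Free-space wavelength in metres of a wave with the given frequency."""
    if frequency_in_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_PER_S / frequency_in_hz


# Ends of the microwave range named in the abstract.
for metres in (1.0, 1e-3):
    print(f"{metres} m -> {frequency_hz(metres) / 1e9:.3f} GHz")
```

    The results land just under 0.3 GHz and 300 GHz, so the abstract's "300 MHz to 300 GHz" is the usual rounding of c to 3 × 10^8 m/s.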

  15. The Laboratory Course Assessment Survey: A Tool to Measure Three Dimensions of Research-Course Design.

    Science.gov (United States)

    Corwin, Lisa A; Runyon, Christopher; Robinson, Aspen; Dolan, Erin L

    2015-01-01

    Course-based undergraduate research experiences (CUREs) are increasingly being offered as scalable ways to involve undergraduates in research. Yet few if any design features that make CUREs effective have been identified. We developed a 17-item survey instrument, the Laboratory Course Assessment Survey (LCAS), that measures students' perceptions of three design features of biology lab courses: 1) collaboration, 2) discovery and relevance, and 3) iteration. We assessed the psychometric properties of the LCAS using established methods for instrument design and validation. We also assessed the ability of the LCAS to differentiate between CUREs and traditional laboratory courses, and found that the discovery and relevance and iteration scales differentiated between these groups. Our results indicate that the LCAS is suited for characterizing and comparing undergraduate biology lab courses and should be useful for determining the relative importance of the three design features for achieving student outcomes. © 2015 L. A. Corwin et al. CBE—Life Sciences Education © 2015 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  16. Aerial Measuring System (AMS) Baseline Surveys for Emergency Planning

    Energy Technology Data Exchange (ETDEWEB)

    Lyons, C

    2012-06-04

    Originally established in the 1960s to support the Nuclear Test Program, the AMS mission is to provide a rapid and comprehensive worldwide aerial measurement, analysis, and interpretation capability in response to a nuclear/radiological emergency. AMS provides a responsive team of individuals whose processes allow for a mission to be conducted and completed with results available within hours. This presentation slide-show reviews some of the history of the AMS, summarizes present capabilities and methods, and addresses the value of the surveys.

  17. The Advantage of the Second Military Survey in Fluvial Measures

    Science.gov (United States)

    Kovács, G.

    2009-04-01

    The Second Military Survey of the Habsburg Empire, completed in the 19th century, can be very useful in different scientific investigations owing to its accuracy and data content. The mapmakers' use of a geodetic projection and the high accuracy of the survey guarantee that scientists can use these maps and that the represented objects can be evaluated in retrospective studies. Among others, the hydrological information of the map sheets is valuable. The streams were drawn with very thin lines, which also ensures the high accuracy of their location, provided that the geodetic position of the sheet can be constructed with high accuracy. After geocoding these maps we observed the high accuracy of the line elements. Not only the location of these lines but also the form of the creeks is usually almost the same as their present-day shape. The goal of our study was the neotectonic evaluation of the western part of the Pannonian Basin, bordered by the Pinka, Rába and Répce Rivers. Watercourses, especially alluvial ones, react very sensitively to tectonic forcing. However, the present-day courses of the creeks and rivers are mostly regulated, and therefore they are unsuitable for such studies. Consequently, the watercourses should be reconstructed from maps surveyed prior to the main water control measures. The Second Military Survey is a perfect source for such studies because it is the first survey drawn in a geodetic projection and it predates the regulation of the creeks. The maps show intensive agricultural cultivation and silviculture in the study area. Grazing land use in the vicinity of the streams is especially important for us. That phenomenon and data from other sources prove that the streams had not been regulated at that time. The streams were able to meander and flood their banks, and only natural levees are present. The general morphology south of the Kőszegi Mountains shows typical SSE slopes with low relief cut off by 30-60 meter high scarps followed by streams. That

  18. Fault Location Based on Synchronized Measurements: A Comprehensive Survey

    Directory of Open Access Journals (Sweden)

    A. H. Al-Mohammed

    2014-01-01

    Full Text Available This paper presents a comprehensive survey on transmission and distribution fault location algorithms that utilize synchronized measurements. Algorithms based on two-end synchronized measurements and fault location algorithms on three-terminal and multiterminal lines are reviewed. Series capacitors equipped with metal oxide varistors (MOVs), when set on a transmission line, create certain problems for line fault locators and, therefore, fault location on series-compensated lines is discussed. The paper reports the work carried out on adaptive fault location algorithms aiming at achieving better fault location accuracy. Work associated with fault location on power system networks, although limited, is also summarized. Additionally, the nonstandard high-frequency-related fault location techniques based on wavelet transform are discussed. Finally, the paper highlights the area for future research.
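
    To make the idea of two-end synchronized fault location concrete, here is a minimal sketch of the textbook impedance-based estimate. It assumes a short line modelled as a single lumped series impedance with shunt capacitance neglected, and synchronized phasors at the sending (S) and receiving (R) ends. Equating the fault-point voltage seen from both ends, V_S − d·Z·I_S = V_R − (1 − d)·Z·I_R, gives d = (V_S − V_R + Z·I_R) / (Z·(I_S + I_R)). This is an illustration under those assumptions, not an algorithm taken from the surveyed paper; all names and numbers are hypothetical.

```python
def two_end_fault_distance(v_s, i_s, v_r, i_r, z_line):
    """Per-unit distance of a fault from the sending end of a short line.

    v_s, i_s: synchronized voltage and current phasors at the sending end.
    v_r, i_r: synchronized voltage and current phasors at the receiving end
              (currents taken as flowing from each terminal into the line).
    z_line:   total series impedance of the line (complex).
    """
    denominator = z_line * (i_s + i_r)
    if denominator == 0:
        raise ValueError("no fault current: distance is undefined")
    distance = (v_s - v_r + z_line * i_r) / denominator
    # With ideal, noise-free phasors the imaginary part is zero.
    return distance.real


# Hypothetical check: synthesize terminal voltages for a fault at 30 % of the line.
z = complex(2.0, 20.0)
true_distance = 0.3
v_fault = complex(10.0, 5.0)
current_s, current_r = complex(3.0, -4.0), complex(1.0, -2.0)
voltage_s = v_fault + true_distance * z * current_s
voltage_r = v_fault + (1 - true_distance) * z * current_r
estimate = two_end_fault_distance(voltage_s, current_s, voltage_r, current_r, z)
```

    Note that the fault resistance does not appear in the estimate, which is one commonly cited advantage of two-end methods over single-end ones; long lines would need a distributed-parameter model instead.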

  19. Measurement model equivalence in web- and paper-based surveys

    African Journals Online (AJOL)

    participated in the survey; of these, 899 used paper questionnaires ... 9Confirmatory factor analysis (CFA) in a Structural Equation Modelling ... settings and, of course, the general survey and survey research industry. ...... Survey design features .... used to evaluate the tenability of a series of increasingly restrictive models.

  20. Measuring Knowledge with Items in Ch’ol and Spanish in Elementary School Children with a Bilingual Intercultural Educational Model

    Directory of Open Access Journals (Sweden)

    José Bastiani Gómez

    2013-04-01

    Full Text Available It is assumed that the preferential use of Spanish as language mediator in education, as opposed to an indigenous language, negatively impacts children’s learning. In this study we explore the learning problems that are engendered in children through the use of a language other than their mother tongue in school. A test was conducted in Spanish and Ch’ol, with ten items that focused on linguistic and cultural identity, logic, mathematics, Spanish, history, geography and geometry. Three possible answers were offered, only one of which was correct. The test was administered to 53 fifth-grade children and the same number of sixth-grade students in the Indigenous Education Schools of the Ch’ol region. Between 50 and 70% of the students in both grades obtained six or seven correct answers in both languages. The results suggest that there is a deficiency in the level of knowledge and while we conclude that language does not appear to be a major limitation to learning, nevertheless we cannot rule out that the use of the mother tongue as a means of communication during teaching processes could facilitate meaningful learning.

  1. Measuring vaccine hesitancy: The development of a survey tool.

    Science.gov (United States)

    Larson, Heidi J; Jarrett, Caitlin; Schulz, William S; Chaudhuri, Mohuya; Zhou, Yuqing; Dube, Eve; Schuster, Melanie; MacDonald, Noni E; Wilson, Rose

    2015-08-14

    In March 2012, the SAGE Working Group on Vaccine Hesitancy was convened to define the term "vaccine hesitancy", as well as to map the determinants of vaccine hesitancy and develop tools to measure and address the nature and scale of hesitancy in settings where it is becoming more evident. The definition of vaccine hesitancy and a matrix of determinants guided the development of a survey tool to assess the nature and scale of hesitancy issues. Additionally, vaccine hesitancy questions were piloted in the annual WHO-UNICEF joint reporting form, completed by National Immunization Managers globally. The objective of characterizing the nature and scale of vaccine hesitancy issues is to better inform the development of appropriate strategies and policies to address the concerns expressed, and to sustain confidence in vaccination. The Working Group developed a matrix of the determinants of vaccine hesitancy informed by a systematic review of peer reviewed and grey literature, and by the expertise of the working group. The matrix mapped the key factors influencing the decision to accept, delay or reject some or all vaccines under three categories: contextual, individual and group, and vaccine-specific. These categories framed the menu of survey questions presented in this paper to help diagnose and address vaccine hesitancy.

  2. Questionnaire survey, Indoor climate measurements and Energy consumption

    DEFF Research Database (Denmark)

    Knudsen, Henrik Nellemose; Thomsen, Kirsten Engelund; Mørck, Ove

    2012-01-01

    The municipality of Egedal decided in 2006 to make use of the possibility in the Danish Planning Law for a municipality to tighten the energy requirements in the local plan for a new settlement to be erected in the municipality. During the years 2007-2011 a total of 442 dwellings were to be designed and constructed with a heating demand corresponding to the Danish low-energy standard referred to as "low-energy class 1" in a new settlement called Stenløse Syd. This means that the energy consumption is to be 50% lower than the requirement in BR08 (Danish Building Regulations 2008). 66 flats... This report presents part of the results of an evaluation of the project that was performed in the settlement. The evaluation consisted of a questionnaire survey of occupant experiences and satisfaction in 35 single-family houses, measurements of energy consumption in 22 selected single-family houses and 58...

  3. SURVEY ON CLUSTERING ALGORITHM AND SIMILARITY MEASURE FOR CATEGORICAL DATA

    Directory of Open Access Journals (Sweden)

    S. Anitha Elavarasi

    2014-01-01

    Full Text Available Learning is the process of generating useful information from a huge volume of data. Learning can be either supervised (e.g. classification) or unsupervised (e.g. clustering). Clustering is the process of grouping a set of physical objects into classes of similar objects. Objects in the real world consist of both numerical and categorical data. Categorical data cannot be analyzed in the same way as numerical data because they lack an inherent ordering. This paper describes ten different clustering algorithms, their methodology and the factors influencing their performance. Each algorithm is evaluated using real-world datasets, and its pros and cons are specified. The various similarity/dissimilarity measures applied to categorical data and their performance are also discussed. The time complexity defines the amount of time taken by an algorithm to perform the elementary operation. The time complexity of the various algorithms is discussed, and their performance on real-world data such as mushroom, zoo, soya bean, cancer, vote, car and iris is measured. In this survey, cluster accuracy and error rate are evaluated for four different clustering algorithms (K-modes, fuzzy K-modes, ROCK and Squeezer), two different similarity measures (DISC and Overlap), and DILCA applied to hierarchical and partitioning algorithms.
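
    Of the similarity measures named above, Overlap is the simplest to show. In its usual form it is the fraction of attributes on which two categorical records take the same value. The sketch below assumes that definition; the function name and the records are illustrative and are not taken from the surveyed paper.

```python
def overlap_similarity(record_a, record_b):
    """Fraction of attribute positions where two categorical records match."""
    if len(record_a) != len(record_b):
        raise ValueError("records must have the same number of attributes")
    if not record_a:
        raise ValueError("records must have at least one attribute")
    matches = sum(1 for a, b in zip(record_a, record_b) if a == b)
    return matches / len(record_a)


# Hypothetical records with three categorical attributes each.
first = ("red", "small", "round")
second = ("red", "large", "round")
similarity = overlap_similarity(first, second)
dissimilarity = 1 - similarity  # the matching-based distance used by k-modes-style methods
```

    Overlap treats every mismatch as equally severe and ignores how frequent each value is, which is the limitation that data-driven measures for categorical attributes try to address.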

  4. PROBABILISTIC MEASURES FOR INTERESTINGNESS OF DEVIATIONS – A SURVEY

    Directory of Open Access Journals (Sweden)

    Adnan Masood

    2013-03-01

    Full Text Available Association rule mining has long been plagued by the problem of finding meaningful, actionable knowledge in a large set of rules. In this age of data deluge with modern computing capabilities, we gather, distribute, and store information in vast amounts from diverse data sources. With such data profusion, the core knowledge discovery problem becomes efficient data retrieval rather than simply finding heaps of information. The most common approach is to employ measures of rule interestingness to filter the results of the association rule generation process. However, a study of the literature suggests that interestingness is difficult to define quantitatively and can best be summarized as follows: a record or pattern is interesting if it suggests a change in an established model. Almost twenty years ago, Gregory Piatetsky-Shapiro and Christopher J. Matheus, in their paper “The Interestingness of Deviations,” argued that deviations should be grouped together in a finding and that the interestingness of a finding is the estimated benefit from a possible action connected to it. Since then, this field has progressed and new data mining techniques have been introduced to address the subjective, objective, and semantic interestingness measures. In this brief survey, we review the current state of the literature around interestingness of deviations, i.e. outliers, with specific interest in probabilistic measures using Bayesian belief networks.
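
    The abstract refers to interestingness measures used to filter association rules without naming specific ones. As an illustration only, the sketch below computes four widely used objective measures for a rule A → B from transaction counts: support, confidence, lift, and leverage. The choice of measures, the function name, and the counts are assumptions for the example, not content from the survey.

```python
def rule_measures(n_total, n_a, n_b, n_ab):
    """Objective interestingness measures for a rule A -> B.

    n_total: number of transactions; n_a, n_b: transactions containing
    A and B respectively; n_ab: transactions containing both.
    """
    if not (0 < n_a <= n_total and 0 < n_b <= n_total):
        raise ValueError("A and B must each occur at least once")
    if not 0 <= n_ab <= min(n_a, n_b):
        raise ValueError("joint count is inconsistent with the marginals")

    p_a, p_b, p_ab = n_a / n_total, n_b / n_total, n_ab / n_total
    return {
        "support": p_ab,                 # P(A and B)
        "confidence": p_ab / p_a,        # P(B | A)
        "lift": p_ab / (p_a * p_b),      # 1.0 means A and B are independent
        "leverage": p_ab - p_a * p_b,    # 0.0 means A and B are independent
    }


# Hypothetical counts: 100 transactions, A in 40, B in 50, both in 30.
measures = rule_measures(n_total=100, n_a=40, n_b=50, n_ab=30)
```

    Lift and leverage both quantify how far the observed co-occurrence deviates from what independence would predict, which is the sense of "deviation" that makes a rule worth a closer look.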

  5. Outcome measure for stress urinary incontinence treatment (OMIT): results of two society of urodynamics and female urology (SUFU) surveys.

    Science.gov (United States)

    Zimmern, Philippe; Kobashi, Kathleen; Lemack, Gary

    2010-06-01

    To reach some agreement on a minimum set of outcome measures (OMs) for the post-operative evaluation of stress incontinent women, we applied the concept of "lower common denominator" to study which OM instruments are used amongst SUFU members. With SUFU approval, a short, online, 11-item, email-based survey was prepared to assess what OMs current SUFU members are using in daily practice. The first survey, administered after the annual SUFU meeting, targeted recent SUFU meeting attendees. The same survey was redistributed later on to include all SUFU members. Each survey ran for a 10-day period. The first survey drew 50 responses (approximately 30%) and the second 106 (approximately 25%). Responders were geographically well distributed, had been in practice for 1-15 years (approximately 75%), performed 5-15 cases/month, and practiced in a university (56%) or group (30%) setting. Great consistency was noted between surveys for preferred questionnaires [UDI-6 (40-52%), UDI-6 and IIQ-7 (30-34%)], office tests [urinalysis and post-void residual (30-35%)], exam [Baden-Walker and/or POP-Q (38-55%), cough stress test (54-51%)], imaging (none), and urodynamics (none, unless complications). The most common "dislikes" in descending order were: 24 hr pad test, Q-tip test, bladder diary, long questionnaires, POP-Q. These two SUFU surveys did not explore what each physician thinks is the best OM but what members use regularly in their practices. Similar findings were noted in both surveys, supporting the concept that a minimum set of OMs could be developed for reporting surgical outcomes of incontinence procedures in the future. (c) 2010 Wiley-Liss, Inc.

  6. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
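
    The third stage described above (software combining content from an item model) can be illustrated with a minimal sketch. The stem wording, slot names, and slot values below are hypothetical placeholders, not the authors' cognitive model or item model; the sketch only shows how a single template with a few variable slots expands combinatorially.

```python
from itertools import product


def generate_items(stem_template, slots):
    """Fill every combination of slot values into the stem template."""
    names = list(slots)
    generated = []
    for combo in product(*(slots[name] for name in names)):
        generated.append(stem_template.format(**dict(zip(names, combo))))
    return generated


# Hypothetical item model: one stem with three variable elements.
stem = (
    "A {age}-year-old patient presents with {symptom} after {procedure}. "
    "What is the best next step?"
)
slots = {
    "age": [25, 45, 65],
    "symptom": ["fever", "abdominal pain"],
    "procedure": ["an appendectomy", "a hernia repair"],
}

items = generate_items(stem, slots)
print(len(items))  # 3 * 2 * 2 combinations
```

    A real item model would also carry constraints that rule out implausible combinations and would generate matching answer options; both are omitted here.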

  7. Correlation between bioaerosol microbial community characteristics and real-time measurable environmental items: A case study from KORUS-AQ pre-campaign in Seoul, Korea

    Science.gov (United States)

    Yoo, H.

    2015-12-01

    Due to global climate change, bioaerosols are mixed more globally and in a more random manner. During a long-distance traveling dust event, the number of microbes in bioaerosol increases significantly, and the chance for bioaerosol to contain human pathogenic microorganisms may also increase. Recently, we have found that bioaerosol microbial community characteristics (copy number of total bacterial 16S rRNA genes, and population diversity and composition) are correlated with the quantitative detection of potential human pathogens. However, bioaerosol microbial community characteristics cannot be directly used in real-time monitoring because the DNA-based detection method requires at least a couple of days or a week to get reliable data. To circumvent this problem, a correlation of microbial community characteristics with real-time measurable environmental items (PM10, PM2.5, temperature, humidity, NOx, O3 etc.), if any, will be useful in frequent assessment of microbial risk from available real-time measured environmental data. In this work, we monitored bioaerosol microbial communities using a high-throughput DNA sequencing method (MiSeq) during the KORUS-AQ (Korea-US Air Quality) pre-campaign (May to June, 2015) in Seoul, and investigated whether any correlation exists between the bioaerosol microbial community characteristics and the real-time measurable environmental items simultaneously attained during the pre-campaign period. At the pre-campaign site (Korea Institute of Science and Technology, Seoul), bioaerosol samples were collected using a high-volume air sampler, and their 16S rRNA gene-based bacterial communities were analyzed by MiSeq sequencing and bioinformatics. Simultaneously, atmospheric environmental items were monitored at the same site. Using Decision Tree, a non-linear multi-variant correlation was observed between the bioaerosol microbial community characteristics and the real-time measured atmosphere chemistry data, and a rule induction was developed

  8. Summarizing activity limitations in children with chronic illnesses living in the community: a measurement study of scales using supplemented interRAI items

    Directory of Open Access Journals (Sweden)

    Phillips Charles D

    2012-01-01

    Full Text Available Abstract Background To test the validity and reliability of scales intended to measure activity limitations faced by children with chronic illnesses living in the community. The scales were based on information provided by caregivers to service program personnel almost exclusively trained as social workers. The items used to measure activity limitations were interRAI items supplemented so that they were more applicable to activity limitations in children with chronic illnesses. In addition, these analyses may shed light on the possibility of gathering functional information that can span the life course as well as spanning different care settings. Methods Analyses included testing the internal consistency, predictive, concurrent, discriminant and construct validity of two activity limitation scales. The scales were developed using assessment data gathered in the United States of America (USA) from over 2,700 assessments of children aged 4 to 20 receiving Medicaid Early and Periodic Screening, Diagnostic and Treatment (EPSDT) services, specifically Personal Care Services to assist children in overcoming activity limitations. The Medicaid program in the USA pays for health care services provided to children in low-income households. Data were collected in a single, large state in the southwestern USA in late 2008 and early 2009. A similar sample of children was assessed in 2010, and the analyses were replicated using this sample. Results The two scales exhibited excellent internal consistency. Evidence on the concurrent, predictive, discriminant, and construct validity of the proposed scales was strong. Quite importantly, scale scores were not correlated with (confounded with) a child's developmental stage or age. The results for these scales and items were consistent across the two independent samples. Conclusions Unpaid caregivers, usually parents, can provide assessors lacking either medical or nursing training with reliable and valid information

  9. The 12-item medical outcomes study short form health survey version 2.0 (SF-12v2): a population-based validation study from Tehran, Iran

    Directory of Open Access Journals (Sweden)

    Omidvari Sepideh

    2011-03-01

    Full Text Available Abstract Background The SF-12v2 is the improved version of the SF-12v1. This study aimed to validate the SF-12v2 in Iran. Methods A random sample of the general population aged 18 years and over living in Tehran, Iran completed the instrument. Reliability was estimated using internal consistency and validity was assessed using known-groups comparison and convergent validity. In addition the factor structure of the questionnaire was extracted by performing both exploratory and confirmatory factor analyses (EFA and CFA). Results In all, 3685 individuals were studied (1887 male and 1798 female). Internal consistency for both summary measures was satisfactory. Cronbach's α for the Physical Component Summary (PCS-12) was 0.87 and for the Mental Component Summary (MCS-12) it was 0.82. Known-groups comparison showed that the SF-12v2 discriminated well between men and women and those who differed in age and educational status (P Conclusion Although the findings could not be generalized to the Iranian population, overall the findings suggest that the SF-12v2 is a reliable and valid measure of health related quality of life among Iranians and now could be used in future health outcome studies. However, further studies are recommended to establish its stability, responsiveness to change, and concurrent validity for this health survey in Iran.
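
    For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k−1) · (1 − Σ item variances / variance of total score). The response matrix is invented for illustration and has nothing to do with the Tehran sample.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses: 5 respondents, 4 items scored 1-5.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

    When every item carries identical scores the formula returns exactly 1, which is a convenient sanity check on an implementation.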

  10. Optimal item pool design for computerized adaptive tests with polytomous items using GPCM

    Directory of Open Access Journals (Sweden)

    Xuechun Zhou

    2014-09-01

    Full Text Available Computerized adaptive testing (CAT) is a testing procedure with advantages in improving measurement precision and increasing test efficiency. An item pool with optimal characteristics is the foundation for a CAT program to achieve those desirable psychometric features. This study proposed a method to design an optimal item pool for tests with polytomous items using the generalized partial credit model (G-PCM). It extended a method for approximating optimality with polytomous items being described succinctly for the purpose of pool design. Optimal item pools were generated using CAT simulations with and without practical constraints of content balancing and item exposure control. The performances of the item pools were evaluated against an operational item pool. The results indicated that the item pools designed with stratification based on discrimination parameters performed well with an efficient use of the less discriminative items within the target accuracy levels. The implications for developing item pools are also discussed.

  11. Performance of the Family Satisfaction with the End-of-Life Care (FAMCARE) measure in an ethnically diverse cohort: psychometric analyses using item response theory.

    Science.gov (United States)

    Teresi, Jeanne A; Ornstein, Katherine; Ocepek-Welikson, Katja; Ramirez, Mildred; Siu, Albert

    2014-02-01

    The Family Satisfaction with End-of-Life Care (FAMCARE) has been used widely among caregivers to individuals with cancer. The aim of this study was to evaluate the psychometric properties of this measure using item response theory (IRT). The analytic sample was comprised of caregivers to 1,983 patients with advanced cancer. Among the patients, 56 % were females, with mean age 59.9 years (s.d. = 11.8), 20 % were non-Hispanic Black. The majority were family members either living with (44 %) or not living with (35 %) the patient. Factor analyses and IRT were used to examine the dimensionality, information, and reliability of the FAMCARE. Although a bi-factor model fit the data slightly better than did a unidimensional model, the loadings on the group factors were very low. Thus, a unidimensional model appears to provide adequate representation for the item set. The reliability estimates, calculated along the satisfaction (theta) continuum, were adequate (>0.80) for all levels of theta for which subjects had scores. Examination of the category response functions from IRT showed overlap in the lower categories with little unique information provided; moreover, the categories were not observed to be interval. Based on these analyses, a three-response category format was recommended: very satisfied, satisfied, and not satisfied. Most information was provided in the range indicative of either dissatisfaction or high satisfaction. These analyses support the use of fewer response categories and provide item parameters that form a basis for developing shorter-form scales. Such a revision has the potential to reduce respondent burden.

  12. Fair and equitable measurement of student learning in MOOCs: an introduction to item response theory, scale linking, and score equating

    National Research Council Canada - National Science Library

    Meyer, J. Patrick; Zhu, Shi

    2013-01-01

      Massive open online courses (MOOCs) are playing an increasingly important role in higher education around the world, but despite their popularity, the measurement of student learning in these courses is hampered by cheating and other...

  13. Can patient safety be measured by surveys of patient experiences?

    Science.gov (United States)

    Solberg, Leif I; Asche, Stephen E; Averbeck, Beth M; Hayek, Anita M; Schmitt, Kay G; Lindquist, Tim C; Carlson, Richard R

    2008-05-01

    A study was conducted to test whether patient reports of medical errors via surveys could produce sufficiently accurate information to be used as a measure of patient safety. A survey mailed regularly by a large multispecialty medical group to recent patients to assess their satisfaction and error experiences was expanded to collect more details about the patient-perceived errors. Following an initial mailing to 3,109 patients and parents of child patients soon after they had office visits in June 2005, usable mailed or phone follow-up responses were obtained from 1,998 respondents (65.1% adjusted). Responses were reviewed through a two-stage process that included chart audits and implicit physician reviewer judgments. The analysis categorized the review results and compared patient-reported errors with satisfaction. Of the 1,998 respondents, 219 (11.0%) reported 247 separate incidents, for a rate of 12.4 errors per 100 patients. After complete review, only 5 (2.0%) of these incidents were judged to be real clinician errors. Most appeared to represent misunderstandings or behavior/communication problems, but 15.4% lacked sufficient information to categorize. Women, Hispanics, and those aged 41-60 years were most likely to report errors. Those respondents making error reports were much more likely to report visit dissatisfaction than those not reporting them (odds ratio [OR] = 13.8, p technical medical errors and patient safety reliably without added evaluation. This study's findings need to be replicated elsewhere before generalizing from one metropolitan region and a patient population that is about two-thirds members of one health plan.

  14. Sharing the cost of redundant items

    DEFF Research Database (Denmark)

    Hougaard, Jens Leth; Moulin, Hervé

    2014-01-01

    We ask how to share the cost of finitely many public goods (items) among users with different needs: some smaller subsets of items are enough to serve the needs of each user, yet the cost of all items must be covered, even if this entails inefficiently paying for redundant items. Typical examples are network connectivity problems when an existing (possibly inefficient) network must be maintained. We axiomatize a family of cost ratios based on simple liability indices, one for each agent and for each item, measuring the relative worth of this item across agents, and generating cost allocation rules additive in costs.

  15. DL-sQUAL: A Multiple-Item Scale for Measuring Service Quality of Online Distance Learning Programs

    Science.gov (United States)

    Shaik, Naj; Lowe, Sue; Pinegar, Kem

    2006-01-01

    Education is a service with multiplicity of student interactions over time and across multiple touch points. Quality teaching needs to be supplemented by consistent quality supporting services for programs to succeed under the competitive distance learning landscape. ServQual and e-SQ scales have been proposed for measuring quality of traditional…

  16. Item Randomized-Response Models for Measuring Noncompliance: Risk-Return Perceptions, Social Influences, and Self-Protective Responses

    Science.gov (United States)

    Bockenholt, Ulf; Van Der Heijden, Peter G. M.

    2007-01-01

    Randomized response (RR) is a well-known method for measuring sensitive behavior. Yet this method is not often applied because: (i) of its lower efficiency and the resulting need for larger sample sizes which make applications of RR costly; (ii) despite its privacy-protection mechanism the RR design may not be followed by every respondent; and…
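
    The efficiency cost mentioned in point (i) can be made concrete with a small sketch. It assumes a forced-response design, one common RR variant and not necessarily the design used by the authors: with probability `p_truth` the respondent answers truthfully, otherwise a randomizing device forces a "yes" with probability `p_forced_yes`. All parameter names and numbers are illustrative.

```python
from math import sqrt


def forced_response_estimate(yes_share, p_truth, p_forced_yes):
    """Back out the prevalence of the sensitive behavior from the observed share of 'yes'."""
    return (yes_share - (1 - p_truth) * p_forced_yes) / p_truth


def forced_response_se(yes_share, n, p_truth):
    """Standard error of the estimate; dividing by p_truth inflates it relative to direct questioning."""
    return sqrt(yes_share * (1 - yes_share) / n) / p_truth


# Illustrative numbers: 27.5% observed 'yes', 75% truthful branch, fair forced answer.
estimate = forced_response_estimate(0.275, p_truth=0.75, p_forced_yes=0.5)
se = forced_response_se(0.275, n=1000, p_truth=0.75)
print(round(estimate, 3), round(se, 4))
```

    The estimator assumes every respondent follows the design; the noncompliance in point (ii) is exactly what it does not model.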

  17. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Directory of Open Access Journals (Sweden)

    James A Wiley

    Full Text Available We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.

  18. A New Extension of the Binomial Error Model for Responses to Items of Varying Difficulty in Educational Testing and Attitude Surveys.

    Science.gov (United States)

    Wiley, James A; Martin, John Levi; Herschkorn, Stephen J; Bond, Jason

    2015-01-01

    We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.
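
    As background, the classical binomial error model that this work extends can be sketched as follows: if each respondent's success probability is Beta(a, b) distributed and the n items are equally difficult, the number of positive responses follows a beta-binomial distribution. The sketch below covers only that classical baseline, not the authors' extension to items of varying difficulty, and the parameter values are arbitrary.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(X = x) for n items when the latent success probability is Beta(a, b)."""
    return comb(n, x) * exp(log_beta(x + a, n - x + b) - log_beta(a, b))


# Arbitrary illustration: 10 items, latent probability skewed toward success.
n_items = 10
score_distribution = [beta_binomial_pmf(x, n_items, a=4.0, b=2.0) for x in range(n_items + 1)]
print([round(p, 3) for p in score_distribution])
```

    With a = b = 1 the latent probability is uniform and every score from 0 to n is equally likely, which makes a quick correctness check.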

  19. Identification of developmentally appropriate screening items for disruptive behavior problems in preschoolers.

    Science.gov (United States)

    Studts, Christina R; van Zyl, Michiel A

    2013-08-01

    Screening preschool-aged children for disruptive behavior disorders is a key step in early intervention. The study goal was to identify screening items with excellent measurement properties at sub-clinical to clinical levels of disruptive behavior problems within the developmental context of preschool-aged children. Parents/caregivers of preschool-aged children (N = 900) were recruited from four pediatric primary care settings. Participants (mean age = 31, SD = 8) were predominantly female (87 %), either white (55 %) or African-American (42 %), and biological parents (88 %) of the target children. In this cross-sectional survey, participants completed a sociodemographic questionnaire and two parent-report behavioral rating scales: the PSC-17 and the BPI. Item response theory analyses provided item parameter estimates and information functions for 18 externalizing subscale items, revealing their quality of measurement along the continuum of disruptive behaviors in preschool-aged children. Of 18 investigated items, 5 items measured only low levels of disruptive behaviors among preschool-aged children. The remaining 13 items measured sub-clinical to clinical levels of disruptive behavior problems (i.e., >1.5 SD); however, 5 of these items offered less information, suggesting unreliable measurement. The remaining 8 items had high discrimination and difficulty parameters, offering considerable measurement information at sub-clinical to clinical levels of disruptive behavior problems. Behaviors measured by the 8 selected parent-report items were consistent with those identified in recent efforts to distinguish developmentally typical misbehaviors from clinically concerning behaviors among preschool-aged children. These items may have clinical utility in screening young children for disruptive behavior disorders.
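
    The link between item parameters and "measurement information at sub-clinical to clinical levels" can be illustrated with the two-parameter logistic (2PL) model, where information is I(θ) = a² · P(θ) · (1 − P(θ)) and peaks at θ = b. This is a simplified sketch for dichotomous items; the study's rating-scale items may well have been fitted with a polytomous model, and the parameter values below are invented.

```python
from math import exp


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Invented items: (discrimination a, difficulty b).
easy_item = (1.0, -1.0)      # informative mainly at low trait levels
clinical_item = (2.0, 1.5)   # high discrimination, difficulty near the 1.5 SD threshold

for theta in (-1.0, 0.0, 1.5):
    print(theta,
          round(item_information(theta, *easy_item), 3),
          round(item_information(theta, *clinical_item), 3))
```

    An item like `clinical_item` contributes almost nothing for children with typical behavior but is the most useful kind for screening near the clinical range.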

  20. Generalized Full-Information Item Bifactor Analysis

    Science.gov (United States)

    Cai, Li; Yang, Ji Seung; Hansen, Mark

    2011-01-01

    Full-information item bifactor analysis is an important statistical method in psychological and educational measurement. Current methods are limited to single-group analysis and inflexible in the types of item response models supported. We propose a flexible multiple-group item bifactor analysis framework that supports a variety of…

  1. Measuring coverage in MNCH: total survey error and the interpretation of intervention coverage estimates from household surveys.

    Directory of Open Access Journals (Sweden)

    Thomas P Eisele

    Full Text Available Nationally representative household surveys are increasingly relied upon to measure maternal, newborn, and child health (MNCH) intervention coverage at the population level in low- and middle-income countries. Surveys are the best tool we have for this purpose and are central to national and global decision making. However, all survey point estimates have a certain level of error (total survey error) comprising sampling and non-sampling error, both of which must be considered when interpreting survey results for decision making. In this review, we discuss the importance of considering these errors when interpreting MNCH intervention coverage estimates derived from household surveys, using relevant examples from national surveys to provide context. Sampling error is usually thought of as the precision of a point estimate and is represented by 95% confidence intervals, which are measurable. Confidence intervals can inform judgments about whether estimated parameters are likely to be different from the real value of a parameter. We recommend, therefore, that confidence intervals for key coverage indicators should always be provided in survey reports. By contrast, the direction and magnitude of non-sampling error is almost always unmeasurable, and therefore unknown. Information error and bias are the most common sources of non-sampling error in household survey estimates and we recommend that they should always be carefully considered when interpreting MNCH intervention coverage based on survey data. Overall, we recommend that future research on measuring MNCH intervention coverage should focus on refining and improving survey-based coverage estimates to develop a better understanding of how results should be interpreted and used.
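
    The sampling-error component discussed above can be sketched with a simple normal-approximation (Wald) interval for a coverage proportion, widened by a design effect to reflect cluster sampling. This is a textbook simplification under stated assumptions: the `deff` value is an assumed input, and real household surveys derive variances from the actual sampling design. The counts are invented.

```python
from math import sqrt


def coverage_ci(successes, n, deff=1.0, z=1.96):
    """Point estimate and approximate 95% CI for a coverage proportion."""
    p = successes / n
    # The design effect scales the simple-random-sampling variance.
    se = sqrt(deff * p * (1.0 - p) / n)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Invented example: 200 of 400 children covered.
print(coverage_ci(200, 400))            # simple random sampling
print(coverage_ci(200, 400, deff=4.0))  # clustered design, assumed deff of 4
```

    The interval says nothing about non-sampling error such as recall bias, which is the review's central caution.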

  2. Investigating Item Exposure Control Methods in Computerized Adaptive Testing

    Science.gov (United States)

    Ozturk, Nagihan Boztunc; Dogan, Nuri

    2015-01-01

    This study aims to investigate the effects of item exposure control methods on measurement precision and on test security under various item selection methods and item pool characteristics. In this study, the Randomesque (with item group sizes of 5 and 10), Sympson-Hetter, and Fade-Away methods were used as item exposure control methods. Moreover,…
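
    Of the exposure control methods named above, the Randomesque idea is the simplest to sketch: rather than always administering the single most informative item, the test picks at random from a group of the most informative unused items. The sketch below assumes maximum-information selection under a 2PL model and an invented item pool; it is not the simulation setup of the study.

```python
import random
from math import exp


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def randomesque_select(theta, pool, administered, group_size, rng):
    """Pick one item at random from the group_size most informative unused items."""
    candidates = [item for item in pool if item not in administered]
    candidates.sort(key=lambda item: item_information(theta, *pool[item]), reverse=True)
    return rng.choice(candidates[:group_size])


# Invented pool: item id -> (discrimination a, difficulty b).
pool = {
    "i1": (1.8, 0.0), "i2": (1.5, 0.2), "i3": (0.6, 0.1),
    "i4": (1.2, -1.5), "i5": (2.0, 2.5), "i6": (1.0, 0.0),
}

rng = random.Random(0)
next_item = randomesque_select(0.0, pool, administered={"i1"}, group_size=2, rng=rng)
print(next_item)
```

    A larger group size (the abstract mentions 5 and 10) spreads exposure across more items at some cost in measurement precision per item administered.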

  3. Exploring differential item functioning in the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)

    Directory of Open Access Journals (Sweden)

    Pollard Beth

    2012-12-01

    Full Text Available Abstract Background The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a widely used patient reported outcome in osteoarthritis. An important, but frequently overlooked, aspect of validating health outcome measures is to establish if items exhibit differential item functioning (DIF). That is, if respondents have the same underlying level of an attribute, does the item give the same score in different subgroups or is it biased towards one subgroup or another. The aim of the study was to explore DIF in the Likert format WOMAC for the first time in a UK osteoarthritis population with respect to demographic, social, clinical and psychological factors. Methods The sample comprised a community sample of 763 people with osteoarthritis who participated in the Somerset and Avon Survey of Health. The WOMAC was explored for DIF by gender, age, social deprivation, social class, employment status, distress, body mass index and clinical factors. Ordinal regression models were used to identify DIF items. Results After adjusting for age, two items were identified for the physical functioning subscale as having DIF with age identified as the DIF factor for 2 items, gender for 1 item and body mass index for 1 item. For the WOMAC pain subscale, for people with hip osteoarthritis one item was identified with age-related DIF. The impact of the DIF items rarely had a significant effect on the conclusions of group comparisons. Conclusions Overall, the WOMAC performed well with only a small number of DIF items identified. However, as DIF items were identified for the WOMAC physical functioning subscale it would be advisable to analyse data taking into account the possible impact of the DIF items when weight, gender or especially age effects, are the focus of interest in UK-based osteoarthritis studies. Similarly for the WOMAC pain subscale in people with hip osteoarthritis it would be worthwhile to analyse data taking into account the

  4. Grouping of Items in Mobile Web Questionnaires

    Science.gov (United States)

    Mavletova, Aigul; Couper, Mick P.

    2016-01-01

    There is some evidence that a scrolling design may reduce breakoffs in mobile web surveys compared to a paging design, but there is little empirical evidence to guide the choice of the optimal number of items per page. We investigate the effect of the number of items presented on a page on data quality in two types of questionnaires: with or…

  5. Measurement of the severity of disability in community-dwelling adults and older adults: interval-level measures for accurate comparisons in large survey data sets

    Science.gov (United States)

    Buz, José; Cortés-Rodríguez, María

    2016-01-01

    Objectives To (1) create a single metric of disability using Rasch modelling to be used for comparing disability severity levels across groups and countries, (2) test whether the interval-level measures were invariant across countries, sociodemographic and health variables and (3) examine the gains in precision using interval-level measures relative to ordinal scores when discriminating between groups known to differ in disability. Design Cross-sectional, population-based study. Setting/participants Data were drawn from the Survey of Health, Ageing and Retirement in Europe (SHARE), including comparable data across 16 countries and involving 58 489 community-dwelling adults aged 50+. Main outcome measures A single metric of disability composed of self-care and instrumental activities of daily living (IADLs) and functional limitations. We examined the construct validity through the fit to the Rasch model and the known-groups method. Reliability was examined using person separation reliability. Results The single metric fulfilled the requirements of a strong hierarchical scale; was able to separate persons with different levels of disability; demonstrated invariance of the item hierarchy across countries; and was unbiased by age, gender and different health conditions. However, we found a blurred hierarchy of ADL and IADL tasks. Rasch-based measures yielded gains in relative precision (11–116%) in discriminating between groups with different medical conditions. Conclusions Equal-interval measures, with person-invariance and item-invariance properties, provide epidemiologists and researchers with the opportunity to gain better insight into the hierarchical structure of functional disability, and yield more reliable and accurate estimates of disability across groups and countries. Interval-level measures of disability allow parametric statistical analysis to confidently examine the relationship between disability and continuous measures so frequent in health sciences

  6. The 2-degree Field Lensing Survey: design and clustering measurements

    Science.gov (United States)

    Blake, Chris; Amon, Alexandra; Childress, Michael; Erben, Thomas; Glazebrook, Karl; Harnois-Deraps, Joachim; Heymans, Catherine; Hildebrandt, Hendrik; Hinton, Samuel R.; Janssens, Steven; Johnson, Andrew; Joudaki, Shahab; Klaes, Dominik; Kuijken, Konrad; Lidman, Chris; Marin, Felipe A.; Parkinson, David; Poole, Gregory B.; Wolf, Christian

    2016-11-01

    We present the 2-degree Field Lensing Survey (2dFLenS), a new galaxy redshift survey performed at the Anglo-Australian Telescope. 2dFLenS is the first wide-area spectroscopic survey specifically targeting the area mapped by deep-imaging gravitational lensing fields, in this case the Kilo-Degree Survey. 2dFLenS obtained 70 079 redshifts in the range z relevant algorithms for joint imaging and spectroscopic analysis. The redshift sample consists first of 40 531 Luminous Red Galaxies (LRGs), which enable analyses of galaxy-galaxy lensing, redshift-space distortion, and the overlapping source redshift distribution by cross-correlation. An additional 28 269 redshifts form a magnitude-limited (r values are consistent with those obtained from LRGs in the Baryon Oscillation Spectroscopic Survey. 2dFLenS data products will be released via our website http://2dflens.swin.edu.au.

  7. Screening Test Items for Differential Item Functioning

    Science.gov (United States)

    Longford, Nicholas T.

    2014-01-01

    A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

  8. Measuring genetic knowledge: a brief survey instrument for adolescents and adults.

    Science.gov (United States)

    Fitzgerald-Butt, S M; Bodine, A; Fry, K M; Ash, J; Zaidi, A N; Garg, V; Gerhardt, C A; McBride, K L

    2016-02-01

    Basic knowledge of genetics is essential for understanding genetic testing and counseling. The lack of a written, English language, validated, published measure has limited our ability to evaluate genetic knowledge of patients and families. Here, we begin the psychometric analysis of a true/false genetic knowledge measure. The 18-item measure was completed by parents of children with congenital heart defects (CHD) (n = 465) and adolescents and young adults with CHD (age: 15-25, n = 196) with a mean total correct score of 12.6 [standard deviation (SD) = 3.5, range: 0-18]. Utilizing exploratory factor analysis, we determined that one to three correlated factors, or abilities, were captured by our measure. Through confirmatory factor analysis, we determined that the two factor model was the best fit. Although it was necessary to remove two items, the remaining items exhibited adequate psychometric properties in a multidimensional item response theory analysis. Scores for each factor were computed, and a sum-score conversion table was derived. We conclude that this genetic knowledge measure discriminates best at low knowledge levels and is therefore well suited to determine a minimum adequate amount of genetic knowledge. However, further reliability testing and validation in diverse research and clinical settings is needed.

  9. What questionnaires to use when measuring quality of life in sacral tumor patients: the updated sacral tumor survey.

    Science.gov (United States)

    van Wulfften Palthe, Olivier D R; Janssen, Stein J; Wunder, Jay S; Ferguson, Peter C; Wei, Guo; Rose, Peter S; Yaszemski, Micheal J; Sim, Franklin H; Boland, Patrick J; Healey, John H; Hornicek, Francis J; Schwab, Joseph H

    2017-05-01

    Patient-reported outcomes are becoming increasingly important when investigating results of patient and disease management. In sacral tumor, the symptoms of patients can vary substantially; therefore, no single questionnaire can adequately account for the full spectrum of symptoms and disability. The purpose of this study is to analyze redundancy within the current sacral tumor survey and make a recommendation for an updated version based on the results and patient and expert opinions. A survey study from a tertiary care orthopedic oncology referral center was used. The patient sample included 70 patients with sacral tumors (78% chordoma). The following 10 questionnaires included in the current sacral tumor survey were evaluated: the Patient-Reported Outcomes Measurement Information System (PROMIS) Global Item short form, PROMIS Pain Intensity short form, PROMIS Pain Interference short form, PROMIS Neuro-QOL v1.0 Lower Extremity Function short form, PROMIS v1.0 Anxiety short form, the PROMIS v1.0 Depression short form, the International Continence Society Male short form, the Modified Obstruction-Defecation Syndrome questionnaire, the PROMIS Sexual Function Profile v1.0, and the Stoma Quality of Life tool. We performed an exploratory factor analysis to calculate the possible underlying latent traits. Spearman rank correlation coefficients were used to measure to what extent the questionnaires converged. We hypothesized the existence of six domains based on current literature: mental health, physical health, pain, gastrointestinal symptoms, sexual function, and urinary incontinence. To assess content validity, we surveyed 32 patients, 9 orthopedic oncologists, 1 medical oncologist, 1 radiation oncologist, and 1 orthopedic oncology nurse practitioner with experience in treating sacral tumor patients on the relevance of the domains. Reliability as measured by Cronbach alpha ranged from 0.65 to 0.96. Coverage measured by floor and ceiling effects ranged from 0% to 52

  10. Measurement of thermal conductivity and diffusivity in situ: Literature survey and theoretical modelling of measurements

    Energy Technology Data Exchange (ETDEWEB)

    Kukkonen, I.; Suppala, I. [Geological Survey of Finland, Espoo (Finland)]

    1999-01-01

    In situ measurements of thermal conductivity and diffusivity of bedrock were investigated with the aid of a literature survey and theoretical simulations of a measurement system. According to the surveyed literature, in situ methods can be divided into 'active' drill hole methods, and 'passive' indirect methods utilizing other drill hole measurements together with cutting samples and petrophysical relationships. The most common active drill hole method is a cylindrical heat producing probe whose temperature is registered as a function of time. The temperature response can be calculated and interpreted with the aid of analytical solutions of the cylindrical heat conduction equation, particularly the solution for an infinite perfectly conducting cylindrical probe in a homogeneous medium, and the solution for a line source of heat in a medium. Using both forward and inverse modelling, a theoretical measurement system was analysed with the aim of finding the basic parameters for construction of a practical measurement system. The results indicate that thermal conductivity can be relatively well estimated with borehole measurements, whereas thermal diffusivity is much more sensitive to various disturbing factors, such as thermal contact resistance and variations in probe parameters. In addition, the three-dimensional conduction effects were investigated to find out the magnitude of axial 'leak' of heat in long-duration experiments. The radius of influence of a drill hole measurement is mainly dependent on the duration of the experiment. Assuming typical conductivity and diffusivity values of crystalline rocks, the measurement yields information within less than a metre from the drill hole, when the experiment lasts about 24 hours. We propose the following factors to be taken as basic parameters in the construction of a practical measurement system: the probe length 1.5-2 m, heating power 5-20 W m⁻¹, temperature recording with 5-7 sensors placed along the probe, and
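
    The abstract's statement that a 24-hour experiment probes less than a metre of rock can be checked with a rough calculation. The sketch below is not taken from the report: it uses the common order-of-magnitude diffusion length sqrt(4*kappa*t) and the standard late-time line-source relation lambda = q / (4*pi*dT/d ln t). The diffusivity value and both function names are assumptions chosen for illustration.

```python
import math


def radius_of_influence(diffusivity_m2_s, duration_s):
    """Order-of-magnitude thermal diffusion length sqrt(4 * kappa * t), in metres."""
    return math.sqrt(4.0 * diffusivity_m2_s * duration_s)


def conductivity_from_slope(power_w_per_m, slope_k_per_ln_s):
    """Late-time line-source estimate: lambda = q / (4 * pi * dT/d(ln t))."""
    return power_w_per_m / (4.0 * math.pi * slope_k_per_ln_s)


# Assumed diffusivity for crystalline rock (illustrative, not from the abstract).
KAPPA = 1.0e-6  # m^2/s

# Heating for about 24 hours, as in the abstract.
r_24h = radius_of_influence(KAPPA, 24 * 3600)
print(f"radius of influence after 24 h: {r_24h:.2f} m")
```

    With the assumed diffusivity the estimate stays below one metre, which is consistent with the abstract; a different diffusivity or a different definition of the radius would shift the number.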

  11. Measuring baryon acoustic oscillations with future SKA surveys

    CERN Document Server

    Bull, Philip; Raccanelli, Alvise; Blake, Chris; Ferreira, Pedro G; Santos, Mario G; Schwarz, Dominik J

    2015-01-01

    The imprint of baryon acoustic oscillations (BAO) in large-scale structure can be used as a standard ruler for mapping out the cosmic expansion history, and hence for testing cosmological models. In this article we briefly describe the scientific background to the BAO technique, and forecast the potential of the Phase 1 and 2 SKA telescopes to perform BAO surveys using both galaxy catalogues and intensity mapping, assessing their competitiveness with current and future optical galaxy surveys. We find that a 25,000 sq. deg. intensity mapping survey on a Phase 1 array will preferentially constrain the radial BAO, providing a highly competitive 2% constraint on the expansion rate at z ~ 2. A 30,000 sq. deg. galaxy redshift survey on SKA2 will outperform all other planned experiments for z < 1.4.

  12. Bayesian tests of measurement invariance.

    Science.gov (United States)

    Verhagen, A J; Fox, J P

    2013-11-01

    Random item effects models provide a natural framework for the exploration of violations of measurement invariance without the need for anchor items. Within the random item effects modelling framework, Bayesian tests (Bayes factor, deviance information criterion) are proposed which enable multiple marginal invariance hypotheses to be tested simultaneously. The performance of the tests is evaluated with a simulation study which shows that the tests have high power and low Type I error rate. Data from the European Social Survey are used to test for measurement invariance of attitude towards immigrant items and to show that background information can be used to explain cross-national variation in item functioning.

  13. Measuring Nurses' Value, Implementation, and Knowledge of Evidence-Based Practice: Further Psychometric Testing of the Quick-EBP-VIK Survey.

    Science.gov (United States)

    Connor, Linda; Paul, Fiona; McCabe, Margaret; Ziniel, Sonja

    2017-02-01

    The Quick-EBP-VIK is a new instrument for measuring nurses' value, implementation, and knowledge of EBP. Psychometric testing was conducted in two parts. Part 1 describes the tool development and validity testing which resulted in the development of a 25-item survey after receiving ≥0.80 Item-Level Content Validity Index for both clarity and relevance. Part 2 describes the psychometric testing that was necessary to assess additional types of validity and reliability. The purpose of this paper is to further describe the psychometric testing of the Quick-EBP-VIK survey instrument. This descriptive study was designed to assess test-retest reliability, internal consistency and construct validity via a web-based survey. The survey instrument was e-mailed to all nurses at the study hospital. Nurses who responded to the first survey (Wave 1) received another e-mail invitation to complete the survey instrument again (Wave 2) for the purpose of assessing the test-retest reliability of the instrument. A total of 1,177 deliverable e-mails were sent to all nursing staff at one freestanding pediatric hospital with Magnet® designation in the northeast. A total of 382 nurses returned completed surveys, indicating a 32.5% response rate for Wave 1. A total of 131 nurses responded to Wave 2, indicating a response rate of 34.3%. The intraclass correlation coefficients for the items included in the final instrument ranged from 0.43 to 0.80 and were deemed sufficient. The Cronbach's alpha values for each of the three domains are all higher than 0.7, indicating that the items of each measurement dimension are internally consistent. However, the composite reliability of the third domain was slightly lower than 0.7 when using Raykov's Rho. The Quick-EBP-VIK instrument has gone through rigorous comprehensive testing and has demonstrated good psychometric properties. © 2016 Sigma Theta Tau International.
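
    The response rates and the internal-consistency statistic quoted in this abstract follow from simple formulas. The sketch below is illustrative only: the function names are hypothetical, the toy response matrix is made up, and Cronbach's alpha is computed from its textbook definition rather than from the study's data.

```python
from statistics import variance


def response_rate(completed, invited):
    """Percentage of invitations that produced a completed survey, to one decimal."""
    return round(100.0 * completed / invited, 1)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    item_variances = sum(variance(column) for column in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - item_variances / total_variance)


# Counts reported in the abstract.
wave1 = response_rate(382, 1177)  # completed Wave 1 / deliverable e-mails
wave2 = response_rate(131, 382)   # completed Wave 2 / Wave 1 responders

# Made-up toy data: three respondents answering three perfectly consistent items.
toy_alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
print(wave1, wave2, toy_alpha)
```

    Note that the Wave 2 rate uses the 382 Wave 1 responders as its denominator, not the 1,177 original invitations.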

  14. The social and community opportunities profile social inclusion measure: Structural equivalence and differential item functioning in community mental health residents in Hong Kong and the United Kingdom.

    Science.gov (United States)

    Huxley, Peter John; Chan, Kara; Chiu, Marcus; Ma, Yanni; Gaze, Sarah; Evans, Sherrill

    2016-03-01

    China's future major health problem will be the management of chronic diseases - of which mental health is a major one. An instrument is needed to measure mental health inclusion outcomes for mental health services in Hong Kong and mainland China as they strive to promote a more inclusive society for their citizens and particular disadvantaged groups. To report on the analysis of structural equivalence and item differentiation in two mentally unhealthy and one healthy sample in the United Kingdom and Hong Kong. The mental health sample in Hong Kong was made up of non-governmental organisation (NGO) referrals meeting the selection/exclusion criteria (being well enough to be interviewed, having a formal psychiatric diagnosis and living in the community). A similar sample in the United Kingdom meeting the same selection criteria was obtained from a community mental health organisation, equivalent to the NGOs in Hong Kong. Exploratory factor analysis and logistic regression were conducted. The single-variable, self-rated 'overall social inclusion' differs significantly between all of the samples, in the way we would expect from previous research, with the healthy population feeling more included than the serious mental illness (SMI) groups. In the exploratory factor analysis, the first two factors explain between a third and half of the variance, and the single variable which enters into all the analyses in the first factor is having friends to visit the home. All the regression models were significant; however, in the Hong Kong sample, only one-fifth of the total variance is explained. The structural findings imply that the social and community opportunities profile-Chinese version (SCOPE-C) gives similar results when applied to another culture. As only one-fifth of the variance of 'overall inclusion' was explained in the Hong Kong sample, it may be that the instrument needs to be refined using different or additional items within the structural domains of inclusion.

  15. The Servant Leadership Survey: Development and Validation of a Multidimensional Measure

    NARCIS (Netherlands)

    D. van Dierendonck (Dirk); I.A.P.M. Nuijten (Inge)

    2011-01-01

    Purpose: The purpose of this paper is to describe the development and validation of a multi-dimensional instrument to measure servant leadership. Design/Methodology/Approach: Based on an extensive literature review and expert judgment, 99 items were formulated. In three steps, using ei

  16. The School Science Attitude Survey: A New Instrument for Measuring Attitudes towards School Science

    Science.gov (United States)

    Kennedy, JohnPaul; Quinn, Frances; Taylor, Neil

    2016-01-01

    There have been many attempts over the last five decades to measure students' attitudes towards school science. Many of these studies investigated attitudes towards limited aspects of science and utilized large numbers of items to draw snapshot summaries of the educational landscape. An understanding of attitudes towards science, and how these…

  17. Measuring outcomes in allergic rhinitis: psychometric characteristics of a Spanish version of the congestion quantifier seven-item test (CQ7)

    Directory of Open Access Journals (Sweden)

    Mullol Joaquim

    2011-03-01

    Abstract Background No control tools for nasal congestion (NC) are currently available in Spanish. This study aimed to adapt and validate the Congestion Quantifier Seven Item Test (CQ7) for Spain. Methods CQ7 was adapted from English following international guidelines. The instrument was validated in an observational, prospective study in allergic rhinitis patients with NC (N = 166) and a control group without NC (N = 35). Participants completed the CQ7, MOS sleep questionnaire, and a measure of psychological well-being (PGWBI). Clinical data included NC severity rating, acoustic rhinometry, and total symptom score (TSS). Internal consistency was assessed using Cronbach's alpha and test-retest reliability using the intraclass correlation coefficient (ICC). Construct validity was tested by examining correlations with other outcome measures and ability to discriminate between groups classified by NC severity. Sensitivity and specificity were assessed using Area under the Receiver Operating Curve (AUC) and responsiveness over time using effect sizes (ES). Results Cronbach's alpha for the CQ7 was 0.92, and the ICC was 0.81, indicating good reliability. CQ7 correlated most strongly with the TSS (r = 0.60, p Conclusions The Spanish version of the CQ7 is appropriate for detecting, measuring, and monitoring NC in allergic rhinitis patients.

  18. Faculty development on item writing substantially improves item quality.

    Science.gov (United States)

    Naeem, Naghma; van der Vleuten, Cees; Alfaris, Eiad Abdelmohsen

    2012-08-01

    The quality of items written for in-house examinations in medical schools remains a cause of concern. Several faculty development programs are aimed at improving faculty's item writing skills. The purpose of this study was to evaluate the effectiveness of a faculty development program in item development. An objective method was developed and used to assess improvement in faculty's competence to develop high quality test items. This was a quasi experimental study with a pretest-midtest-posttest design. A convenience sample of 51 faculty members participated. Structured checklists were used to assess the quality of test items at each phase of the study. Group scores were analyzed using repeated measures analysis of variance. The results showed a significant increase in participants' mean scores on Multiple Choice Questions, Short Answer Questions and Objective Structured Clinical Examination checklists from pretest to posttest (p development are generally lacking in quality. It also provides evidence of the value of faculty development in improving the quality of items generated by faculty.

  19. Measuring self-rated social health of Iranians: a population based survey in three cities

    Directory of Open Access Journals (Sweden)

    Kambiz Abachizadeh

    2014-08-01

    Abstract: Background and objectives: Social health, as the third dimension of health along with physical and mental health, has drawn more attention in recent years among policy makers and health system managers. No other study, to our knowledge, has documented measuring individual-level social health in Iran. In response to this need, our study aims to assess Iranians' self-rated social health through a survey in 3 cities of Iran. Methods: We conducted a cross-sectional survey in three cities of Iran that included people older than 18 years. We used a random sample of 800 people. The scale provides a total score of social health and three sub-scores. The total score was calculated by summing all 33 items, so the range was 33 to 165, with a higher score indicating better social health. Psychometric parameters of the scale were acceptable. To interpret scores, respondents were categorized into five ordered groups as quintiles for amount of social health. To compare social health scores in different demographic groups, multiple linear regression was employed to interpret the association between demographic variables and social health score. Results: From a pool of 800 persons, 794 (99%) agreed to participate and filled out the questionnaire completely. The mean self-rated social health score was 105.0 (95% confidence interval, 103.8 to 106.2). Fifty percent of participants had a medium level of social health. The social health score was higher for those who live in Urmia, a small city, than for those in the big cities of Tehran and Isfahan (P < 0.001) and was lower for unemployed people (P = 0.029). There was no association between social health score and other factors such as sex, age and educational level (P > 0.05). Conclusion: This study may be considered as the first step in evidence-based policy-making in the field of social health in Iran. Certainly, it is necessary to conduct more studies to measure social health and its
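
    The scoring rule described in this abstract (a sum over 33 items with a range of 33 to 165, then grouping into quintiles) can be sketched as follows. The 1-to-5 item scale is an assumption inferred from the stated range, and the function names and sample totals are hypothetical; none of this is taken from the study itself.

```python
from bisect import bisect_right
from statistics import quantiles

N_ITEMS = 33
ITEM_MIN, ITEM_MAX = 1, 5  # assumed 5-point items, inferred from the 33-165 range


def total_score(responses):
    """Sum the 33 item responses after checking count and allowed range."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(not ITEM_MIN <= r <= ITEM_MAX for r in responses):
        raise ValueError("item response outside the assumed 1-5 scale")
    return sum(responses)


def quintile_group(score, cut_points):
    """Return the ordered group (1 = lowest, 5 = highest) for a total score."""
    return bisect_right(cut_points, score) + 1


# Made-up total scores, used only to derive illustrative quintile cut points.
sample_totals = [60, 75, 88, 96, 101, 105, 110, 118, 127, 140]
cuts = quantiles(sample_totals, n=5)  # four cut points separating five groups

lowest, highest = total_score([1] * N_ITEMS), total_score([5] * N_ITEMS)
print(lowest, highest, quintile_group(105, cuts))
```

    Because quintile cut points are derived from the sample, group membership depends on which respondents are included, not on fixed score thresholds.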

  20. Priorities for Standards and Measurements to Accelerate Innovations in Nano-Electrotechnologies: Analysis of the NIST-Energetics-IEC TC 113 Survey.

    Science.gov (United States)

    Bennett, Herbert S; Andres, Howard; Pellegrino, Joan; Kwok, Winnie; Fabricius, Norbert; Chapin, J Thomas

    2009-01-01

    In 2008, the National Institute of Standards and Technology and Energetics Incorporated collaborated with the International Electrotechnical Commission Technical Committee 113 (IEC TC 113) on nano-electrotechnologies to survey members of the international nanotechnologies community about priorities for standards and measurements to accelerate innovations in nano-electrotechnologies. In this paper, we analyze the 459 survey responses from 45 countries as one means to begin building a consensus on a framework leading to nano-electrotechnologies standards development by standards organizations and national measurement institutes. The distributions of priority rankings from all 459 respondents are such that there are perceived distinctions with statistical confidence between the relative international priorities for the several items ranked in each of the following five Survey category types: 1) Nano-electrotechnology Properties, 2) Nano-electrotechnology Taxonomy: Products, 3) Nano-electrotechnology Taxonomy: Cross-Cutting Technologies, 4) IEC General Discipline Areas, and 5) Stages of the Linear Economic Model. The global consensus prioritizations for ranked items in the above five category types suggest that the IEC TC 113 should focus initially on standards and measurements for electronic and electrical properties of sensors and fabrication tools that support performance assessments of nano-technology enabled sub-assemblies used in energy, medical, and computer products.

  1. Development and measurement properties of the Orthotics and Prosthetics Users' Survey (OPUS): a comprehensive set of clinical outcome instruments.

    Science.gov (United States)

    Heinemann, A W; Bode, R K; O'Reilly, C

    2003-12-01

    The need to measure and evaluate orthotics and prosthetics (O&P) practice has received growing recognition in the past several years. Reliable and valid self-report instruments are needed that can help facilities evaluate patient outcomes. The objective of this project was to develop a set of self-report instruments that assess functional status, quality of life, and satisfaction with devices and services that can be used in an orthotics and prosthetics clinic. Selecting items from a variety of existing instruments, the authors developed and revised four instruments that differentiate patients with varying levels of lower limb function, quality of life, and satisfaction with devices and services. Evidence of construct validity is provided by hierarchies of item difficulty that are consistent with clinical experience. For example, with the lower limb function instrument, running one block was much more difficult than walking indoors. The instruments demonstrate adequate internal consistency (0.88 for lower limb function, 0.88 for quality of life, 0.74 for service satisfaction, 0.78 for device satisfaction). The next steps in their research programme are to evaluate sensitivity and construct validity. The Orthotics and Prosthetics Users' Survey (OPUS) is a promising self-report instrument which may, with further development, allow orthotic and prosthetic practitioners to evaluate the quality and effectiveness of their services as required by accreditation standards such as those of the American Board for Certification in Orthotics and Prosthetics that mandate quality assessment.

  2. Measuring personal beliefs and perceived norms about intimate partner violence: Population-based survey experiment in rural Uganda.

    Science.gov (United States)

    Tsai, Alexander C; Kakuhikire, Bernard; Perkins, Jessica M; Vořechovská, Dagmar; McDonough, Amy Q; Ogburn, Elizabeth L; Downey, Jordan M; Bangsberg, David R

    2017-05-01

    Demographic and Health Surveys (DHS) conducted throughout sub-Saharan Africa indicate there is widespread acceptance of intimate partner violence, contributing to an adverse health risk environment for women. While qualitative studies suggest important limitations in the accuracy of the DHS methods used to elicit attitudes toward intimate partner violence, to date there has been little experimental evidence from sub-Saharan Africa that can be brought to bear on this issue. We embedded a randomized survey experiment in a population-based survey of 1,334 adult men and women living in Nyakabare Parish, Mbarara, Uganda. The primary outcomes were participants' personal beliefs about the acceptability of intimate partner violence and perceived norms about intimate partner violence in the community. To elicit participants' personal beliefs and perceived norms, we asked about the acceptability of intimate partner violence in five different vignettes. Study participants were randomly assigned to one of three survey instruments, each of which contained varying levels of detail about the extent to which the wife depicted in the vignette intentionally or unintentionally violated gendered standards of behavior. For the questions about personal beliefs, the mean (standard deviation) number of items where intimate partner violence was endorsed as acceptable was 1.26 (1.58) among participants assigned to the DHS-style survey variant (which contained little contextual detail about the wife's intentions), 2.74 (1.81) among participants assigned to the survey variant depicting the wife as intentionally violating gendered standards of behavior, and 0.77 (1.19) among participants assigned to the survey variant depicting the wife as unintentionally violating these standards. 
In a partial proportional odds regression model adjusting for sex and village of residence, with participants assigned to the DHS-style survey variant as the referent group, participants assigned the survey variant

  3. An investigation of the generalizability and dependability of direct behavior rating single item scales (DBR-SIS) to measure academic engagement and disruptive behavior of middle school students.

    Science.gov (United States)

    Chafouleas, Sandra M; Briesch, Amy M; Riley-Tillman, T Chris; Christ, Theodore J; Black, Anne C; Kilgus, Stephen P

    2010-06-01

    A total of 4 raters, including 2 teachers and 2 research assistants, used Direct Behavior Rating Single Item Scales (DBR-SIS) to measure the academic engagement and disruptive behavior of 7 middle school students across multiple occasions. Generalizability study results for the full model revealed modest to large magnitudes of variance associated with persons (students), occasions of measurement (day), and associated interactions. However, an unexpectedly low proportion of the variance in DBR data was attributable to the facet of rater, as well as a negligible variance component for the facet of rating occasion nested within day (10-min interval within a class period). Results of a reduced model and subsequent decision studies specific to individual rater and rater type (research assistant and teacher) suggested degree of reliability-like estimates differed substantially depending on rater. Overall, findings supported previous recommendations that in the absence of estimates of rater reliability and firm recommendations regarding rater training, ratings obtained from DBR-SIS, and subsequent analyses, be conducted within rater. Additionally, results suggested that when selecting a teacher rater, the person most likely to substantially interact with target students during the specified observation period may be the best choice.

  4. Examining item difficulty and response time on perceptual ability test items.

    Science.gov (United States)

    Yang, Chien-Lin; O'Neill, Thomas R; Kramer, Gene A

    2002-01-01

    This study examined item calibration stability in relation to response time and the levels of item difficulty between different response time groups on a sample of 389 examinees responding to six different subtest items of the Perceptual Ability Test (PAT). The results indicated that no Differential Item Functioning (DIF) was found and a significant correlation coefficient of item difficulty was formed between slow and fast responders. Three distinct levels of difficulty emerged among the six subtests across groups. Slow responders spent significantly more time than fast responders on the four most difficult subtests. A positive significant relationship was found between item difficulty and response time across groups on the overall perceptual ability test items. Overall, this study found that: 1) the same underlying construct is being measured across groups, 2) the PAT scores were equally useful across groups, 3) different sources of item difficulty may exist among the six subtests, and 4) more difficult test items may require more time to answer.

  5. No galaxy left behind: accurate measurements with the faintest objects in the Dark Energy Survey

    CERN Document Server

    Suchyta, E; Aleksić, J; Melchior, P; Jouvel, S; MacCrann, N; Crocce, M; Gaztanaga, E; Honscheid, K; Leistedt, B; Peiris, H V; Ross, A J; Rykoff, E S; Sheldon, E; Abbott, T; Abdalla, F B; Allam, S; Banerji, M; Benoit-Lévy, A; Bertin, E; Brooks, D; Burke, D L; Rosell, A Carnero; Kind, M Carrasco; Carretero, J; Cunha, C E; D'Andrea, C B; da Costa, L N; DePoy, D L; Desai, S; Diehl, H T; Dietrich, J P; Doel, P; Eifler, T F; Estrada, J; Evrard, A E; Flaugher, B; Fosalba, P; Frieman, J; Gerdes, D W; Gruen, D; Gruendl, R A; James, D J; Jarvis, M; Kuehn, K; Kuropatkin, N; Lahav, O; Lima, M; Maia, M A G; March, M; Marshall, J L; Miller, C J; Miquel, R; Neilsen, E; Nichol, R C; Nord, B; Ogando, R; Percival, W J; Reil, K; Roodman, A; Sako, M; Sanchez, E; Scarpine, V; Sevilla-Noarbe, I; Smith, R C; Soares-Santos, M; Sobreira, F; Swanson, M E C; Tarle, G; Thaler, J; Thomas, D; Vikram, V; Walker, A R; Wechsler, R H; Zhang, Y

    2015-01-01

    Accurate statistical measurement with large imaging surveys has traditionally required throwing away a sizable fraction of the data. This is because most measurements have relied on selecting nearly complete samples, where variations in the composition of the galaxy population with seeing, depth, or other survey characteristics are small. We introduce a new measurement method that aims to minimize this wastage, allowing precision measurement for any class of stars or galaxies detectable in an imaging survey. We have implemented our proposal in Balrog, a software package which embeds fake objects in real imaging in order to accurately characterize measurement biases. We demonstrate this technique with an angular clustering measurement using Dark Energy Survey (DES) data. We first show that recovery of our injected galaxies depends on a wide variety of survey characteristics in the same way as the real data. We then construct a flux-limited sample of the faintest galaxies in DES, chosen specifically for th...

  6. 1999 Customer Satisfaction Survey Report: How Do We Measure Up?

    Science.gov (United States)

    Salvucci, Sameena; Parker, Albert C. E.; Cash, R. William; Thurgood, Lori

    2001-01-01

    Summarizes results of a 1999 survey regarding the satisfaction of various groups with publications, databases, and services of the National Center for Education Statistics. Groups studied were federal, state, and local policymakers; academic researchers; and journalists. Compared 1999 results with 1997 results. (Author/SLD)

  7. The 2-degree Field Lensing Survey: design and clustering measurements

    CERN Document Server

    Blake, Chris; Childress, Michael; Erben, Thomas; Glazebrook, Karl; Harnois-Deraps, Joachim; Heymans, Catherine; Hildebrandt, Hendrik; Hinton, Samuel R; Janssens, Steven; Johnson, Andrew; Joudaki, Shahab; Klaes, Dominik; Kuijken, Konrad; Lidman, Chris; Marin, Felipe A; Parkinson, David; Poole, Gregory B; Wolf, Christian

    2016-01-01

    We present the 2-degree Field Lensing Survey (2dFLenS), a new galaxy redshift survey performed at the Anglo-Australian Telescope. 2dFLenS is the first wide-area spectroscopic survey specifically targeting the area mapped by deep-imaging gravitational lensing fields, in this case the Kilo-Degree Survey. 2dFLenS obtained 70,079 redshifts in the range z < 0.9 over an area of 731 sq deg, and is designed to extend the datasets available for testing gravitational physics and promote the development of relevant algorithms for joint imaging and spectroscopic analysis. The redshift sample consists first of 40,531 Luminous Red Galaxies (LRGs), which enable analyses of galaxy-galaxy lensing, redshift-space distortion, and the overlapping source redshift distribution by cross-correlation. An additional 28,269 redshifts form a magnitude-limited (r < 19.5) nearly-complete sub-sample, allowing direct source classification and photometric-redshift calibration. In this paper, we describe the motivation, target selection...

  8. The case for survey-based comparative measures of crime

    NARCIS (Netherlands)

    van Dijk, Jan

    2015-01-01

    The author argues that statistics of police-recorded crimes have limited utility for cross-country analyses of crime, due to varying legal definitions, reporting patterns and recording practices. In his view stand alone national victimisation surveys, with their varying methodologies and questionnai

  9. Conceptualization and measurement of homosexuality in sex surveys: a critical review

    OpenAIRE

    Michaels Stuart; Lhomond Brigitte

    2006-01-01

    This article reviews major national population sex surveys that have asked questions about homosexuality focusing on conceptual and methodological issues, including the definitions of sex, the measured aspects of homosexuality, sampling and interviewing technique, and questionnaire design. Reported rates of major measures of same-sex attraction, behavior, partners, and sexual identity from surveys are also presented and compared. The study of homosexuality in surveys has been shaped by the re...

  10. The Long-Term Conditions Questionnaire: conceptual framework and item development

    Science.gov (United States)

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A’Court, Christine; Fitzpatrick, Ray

    2016-01-01

    Purpose To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Materials and methods Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Results Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. Conclusion The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey. PMID:27621678

  11. Measuring temporary employment. Do survey or register data tell the truth?

    NARCIS (Netherlands)

    Pavlopoulos, D.; Vermunt, J.K.

    2015-01-01

    One of the main variables in the Dutch Labour Force Survey is the variable measuring whether a respondent has a permanent or a temporary job. The aim of our study is to determine the measurement error in this variable by matching the information obtained by the longitudinal part of this survey with

  12. The Effects of Survey Administration on Disclosure Rates to Sensitive Items Among Men: A Comparison of an Internet Panel Sample with a RDD Telephone Sample.

    Science.gov (United States)

    Hines, Denise A; Douglas, Emily M; Mahmood, Sehar

    2010-11-01

    Research using Internet surveys is an emerging field, yet research on the legitimacy of using Internet studies, particularly those targeting sensitive topics, remains under-investigated. The current study builds on the existing literature by exploring the demographic differences between Internet panel and RDD telephone survey samples, as well as differences in responses with regard to experiences of intimate partner violence perpetration and victimization, alcohol and substance use/abuse, PTSD symptomatology, and social support. Analyses indicated that after controlling for demographic differences, there were few differences between the samples in their disclosure of sensitive information, and that the online sample was more socially isolated than the phone sample. Results are discussed in terms of their implications for using Internet samples in research on sensitive topics.

  13. Measuring impairments of functioning and health in patients with axial spondyloarthritis by using the ASAS Health Index and the Environmental Item Set: translation and cross-cultural adaptation into 15 languages

    Science.gov (United States)

    Kiltz, U; van der Heijde, D; Boonen, A; Bautista-Molano, W; Burgos-Vargas, R; Chiowchanwisawakit, P; Duruoz, T; El-Zorkany, B; Essers, I; Gaydukova, I; Géher, P; Gossec, L; Grazio, S; Gu, J; Khan, M A; Kim, T J; Maksymowych, W P; Marzo-Ortega, H; Navarro-Compán, V; Olivieri, I; Patrikos, D; Pimentel-Santos, F M; Schirmer, M; van den Bosch, F; Weber, U; Zochling, J; Braun, J

    2016-01-01

    Introduction The Assessments of SpondyloArthritis international society Health Index (ASAS HI) measures functioning and health in patients with spondyloarthritis (SpA) across 17 aspects of health and 9 environmental factors (EF). The objective was to translate and adapt the original English version of the ASAS HI, including the EF Item Set, cross-culturally into 15 languages. Methods Translation and cross-cultural adaptation has been carried out following the forward–backward procedure. In the cognitive debriefing, 10 patients/country across a broad spectrum of sociodemographic background, were included. Results The ASAS HI and the EF Item Set were translated into Arabic, Chinese, Croatian, Dutch, French, German, Greek, Hungarian, Italian, Korean, Portuguese, Russian, Spanish, Thai and Turkish. Some difficulties were experienced with translation of the contextual factors indicating that these concepts may be more culturally-dependent. A total of 215 patients with axial SpA across 23 countries (62.3% men, mean (SD) age 42.4 (13.9) years) participated in the field test. Cognitive debriefing showed that items of the ASAS HI and EF Item Set are clear, relevant and comprehensive. All versions were accepted with minor modifications with respect to item wording and response option. The wording of three items had to be adapted to improve clarity. As a result of cognitive debriefing, a new response option ‘not applicable’ was added to two items of the ASAS HI to improve appropriateness. Discussion This study showed that the items of the ASAS HI including the EFs were readily adaptable throughout all countries, indicating that the concepts covered were comprehensive, clear and meaningful in different cultures. PMID:27752358

  14. AN ITEM RESPONSE MODEL WITH SINGLE PEAKED ITEM CHARACTERISTIC CURVES - THE PARELLA MODEL

    NARCIS (Netherlands)

    HOIJTINK, H; MOLENAAR

    In this paper an item response model (the PARELLA model) designed specifically for the measurement of attitudes and preferences will be introduced. In contrast with the item response models currently used (e.g. the Rasch model and the two- and three-parameter logistic models) the item characteristic

  15. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    Science.gov (United States)

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…

  16. 1997 Customer Satisfaction Survey Report: How Do We Measure Up? Technical Report. Survey Report, 1997.

    Science.gov (United States)

    Thurgood, Lori; Fink, Steven; Bureika, Rita; Scott, Julie; Salvucci, Sameena

    The 1997 National Center for Education Statistics (NCES) Customer Satisfaction survey was conducted to find out whether the NCES as an agency was responding to the needs of customers and to identify areas for improvement. Federal, state, and local education officials and academic researchers were asked about their satisfaction with NCES products…

  17. Air Force Materiel Command: A Survey of Performance Measures

    National Research Council Canada - National Science Library

    Leonard, Marcia

    2004-01-01

    ... and Transformation all represent efforts to find and implement effective answers (RAND, 2003:ix). And, while there appears to be a consensus that better performance measures are needed, there is little agreement on exactly what should be measured, and how...

  18. Creation of New Items and Forms for the Project A Assembling Objects Test

    Science.gov (United States)

    1994-08-01

    correct; D, a measure of item discrimination - the rpbis between the item score (correct or incorrect) and the total score on the 36 original items...D2 another measure of item discrimination - the rpbis between the item score and the total score on the 18 original items of the same type (marked
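
    The discrimination index described in this snippet is a point-biserial correlation between a dichotomous item score and a total score. A minimal sketch of that calculation follows; the function name and the toy data are illustrative assumptions and do not come from the report.

```python
import math


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and a total test score.

    Hypothetical helper for illustration; not taken from the report.
    """
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_scores)
    syy = sum((y - mean_y) ** 2 for y in total_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy data: examinees who answered the item correctly have higher totals.
    item = [1, 1, 0, 0]
    total = [4, 3, 2, 1]
    print(round(point_biserial(item, total), 3))
```

    Values near 1 indicate that examinees who answer the item correctly also tend to score high overall. Whether the item itself is included in the total is a design choice that this sketch leaves to the caller.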

  19. Using the Consumer Expenditure Survey to Teach Poverty Measurement

    Science.gov (United States)

    Diduch, Amy McCormick

    2012-01-01

    Poverty measurement is often controversial, but good public policy relies crucially on a broadly supported and understood poverty measure. In 2010, the U.S. Census Bureau announced it would begin regular reporting of a new supplemental poverty measure in October 2011. The present article provides background information for a student exercise…

  1. Cultural Resources Survey of Gretna Phase 2 Levee Enlargement Item M-99.4 to 95.5-R, Jefferson Parish, Louisiana

    Science.gov (United States)

    1990-01-01

    [Map legend: LaTour, 1815; scale in miles]...Pierre Ste. Pe in 1815 (Bezou 1986:vi). The 1815 LaTour map shows this canal on the Derbigny property (Figure 8). The 1834 Zimpel plan (Figure 2) shows...New Orleans area, such as J.J. Krebs and Sons, and Gandolpho, Kuhn, Luecke and Associates, also have historic surveys of the Gretna project area. These

  2. The Use of PCs, Smartphones, and Tablets in a Probability-Based Panel Survey : Effects on Survey Measurement Error

    NARCIS (Netherlands)

    Lugtig, Peter; Toepoel, Vera

    2016-01-01

    Respondents in an Internet panel survey can often choose which device they use to complete questionnaires: a traditional PC, laptop, tablet computer, or a smartphone. Because all these devices have different screen sizes and modes of data entry, measurement errors may differ between devices. Using

  3. The Use of PCs, Smartphones, and Tablets in a Probability-Based Panel Survey : Effects on Survey Measurement Error

    OpenAIRE

    Lugtig, Peter; Toepoel, Vera

    2016-01-01

    Respondents in an Internet panel survey can often choose which device they use to complete questionnaires: a traditional PC, laptop, tablet computer, or a smartphone. Because all these devices have different screen sizes and modes of data entry, measurement errors may differ between devices. Using data from the Dutch Longitudinal Internet Study for the Social sciences panel, we evaluate which devices respondents use over time. We study the measurement error associated with each device and sho...

  4. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool

    Directory of Open Access Journals (Sweden)

    Kelly L

    2015-05-01

    Full Text Available Laura Kelly, Crispin Jenkinson, Sarah Dummett, Jill Dawson, Ray Fitzpatrick, David Morley Health Services Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK Purpose: The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Methods: Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson's disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. Results: ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Conclusion: Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and

  5. Measuring primordial non-Gaussianity with weak-lensing surveys

    CERN Document Server

    Hilbert, Stefan; Smith, Robert E; Desjacques, Vincent

    2012-01-01

    We study the ability of future weak lensing (WL) surveys to constrain primordial non-Gaussianity of the local type. We use a large ensemble of simulated WL maps with survey specifications relevant to Euclid and LSST. The simulations assume Cold Dark Matter cosmologies that vary certain parameters around fiducial values: the non-Gaussianity parameter f_NL, the matter density parameter Omega_m, the amplitude of the matter power spectrum sigma_8, the spectral index of the primordial power spectrum n_s, and the dark-energy equation-of-state parameter w_0. We assess the sensitivity of the cosmic shear correlation functions, the third-order aperture mass statistics, and the abundance of shear peaks to these parameters. We find that each of the considered probes provides unmarginalized constraints of Delta f_NL ~ 20 on f_NL. Marginalized constraints from any individual WL probe are much weaker due to strong correlations between parameters. However, the parameter errors can be substantially reduced by combining infor...

  6. Measuring Impact of Stabilization Initiatives Survey Data (MISTI)

    Data.gov (United States)

    US Agency for International Development: The raw data from the Measuring Impact of Stabilization Initiatives (MISTI) project, the largest and most comprehensive evaluation of stabilization interventions...

  7. Writing better test items.

    Science.gov (United States)

    Aucoin, Julia W

    2005-01-01

    Professional development specialists have had little opportunity to learn how to write test items to meet the expectations of today's graduate nurse. Schools of nursing have moved away from knowledge-level test items and have had to develop more application and analysis items to prepare graduates for the National Council Licensure Examination (NCLEX). This same type of question can be used effectively to support a competence assessment system and document critical thinking skills.

  8. Reversed item bias: an integrative model.

    Science.gov (United States)

    Weijters, Bert; Baumgartner, Hans; Schillewaert, Niels

    2013-09-01

    In the recent methodological literature, various models have been proposed to account for the phenomenon that reversed items (defined as items for which respondents' scores have to be recoded in order to make the direction of keying consistent across all items) tend to lead to problematic responses. In this article we propose an integrative conceptualization of three important sources of reversed item method bias (acquiescence, careless responding, and confirmation bias) and specify a multisample confirmatory factor analysis model with 2 method factors to empirically test the hypothesized mechanisms, using explicit measures of acquiescence and carelessness and experimentally manipulated versions of a questionnaire that varies 3 item arrangements and the keying direction of the first item measuring the focal construct. We explain the mechanisms, review prior attempts to model reversed item bias, present our new model, and apply it to responses to a 4-item self-esteem scale (N = 306) and the 6-item Revised Life Orientation Test (N = 595). Based on the literature review and the empirical results, we formulate recommendations on how to use reversed items in questionnaires.
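
    The recoding step that defines a reversed item in this abstract (recoding scores so that the keying direction is consistent across items) can be sketched as below. The function names, item labels, and the 1 to 5 response scale are assumptions made for illustration only; the article's own scales may differ.

```python
def reverse_code(score, scale_min, scale_max):
    """Mirror a response around the midpoint of the scale (e.g. 1 <-> 5)."""
    if not scale_min <= score <= scale_max:
        raise ValueError(f"score {score} is outside {scale_min}..{scale_max}")
    return scale_min + scale_max - score


def recode_responses(responses, reversed_items, scale_min=1, scale_max=5):
    """Return a copy of `responses` with reversed items recoded.

    `responses` maps item labels to raw scores; `reversed_items` is the set
    of labels whose keying direction runs against the rest of the scale.
    Both structures are hypothetical and chosen only for this sketch.
    """
    return {
        item: reverse_code(score, scale_min, scale_max)
        if item in reversed_items else score
        for item, score in responses.items()
    }


if __name__ == "__main__":
    raw = {"q1": 4, "q2": 2, "q3": 5}
    # Assume q2 is negatively worded, so a raw 2 is recoded to 4.
    print(recode_responses(raw, reversed_items={"q2"}))
```

    Recoding only aligns the direction of the scores. It does not remove the method effects (acquiescence, careless responding, confirmation bias) that the article sets out to model.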

  9. Development of a Survey Instrument to Measure TEFL Academics' Perceptions about, Individual and Workplace Characteristics for Conducting Research

    Science.gov (United States)

    Bai, Li; Hudson, Peter; Millwater, Jan; Tones, Megan

    2013-01-01

    A 30-item survey was devised to determine Chinese TEFL (Teaching English as a Foreign Language) academics' potential for conducting research. A five-part Likert scale was used to gather data from 182 academics on four factors: (1) perceptions on teaching-research nexus, (2) personal perspectives for conducting research, (3) predispositions for…

  10. 2004 Workplace and Gender Relations Survey of Reserve Component Members: Report on Scales and Measures

    Science.gov (United States)

    2005-03-01

    scale (FS; Swan, 1997). Originally a 15-item scale, the FS was adapted from an emotions scale by Folkman and Lazarus (1985) and measures the extent...Members: Statistical methodology report (Report No. 2004-019). Arlington, VA: DMDC. Folkman, S., & Lazarus, R.S. (1985). If it changes it must be a process...sexual harassment research: Historical perspectives and new initiatives. Military Psychology, 11, 219-231. Lazarus, R. S., & Folkman, S. (1984

  11. Aplicação da TRI em uma medida de avaliação da compreensão de leitura Use of the item response theory on a measure for reading comprehension assessment

    Directory of Open Access Journals (Sweden)

    Lucas de Francisco Carvalho

    2013-01-01

    Full Text Available The objective of the present study was to verify the parameters of items and people by using the Item Response Theory (IRT) in a reading comprehension measurement, including quantitative and qualitative analyses of the items map, as well as to investigate the presence of Differential Item Functioning (DIF). The sample consisted of 518 children from the 3rd, 4th and 5th grades, aged from 6 to 16, from private and public schools in the city of Belo Horizonte-MG. The instrument was a text prepared according to the Cloze technique. The data confirmed the unidimensionality of the instrument; showed an average theta higher than the average item difficulty; and the presence of DIF was observed in some items in relation to the school grades. The results demonstrated validity evidence for the instrument and are discussed in this paper.

  12. Mixture randomized item-response modeling: a smoking behavior validation study.

    Science.gov (United States)

    Fox, J-P; Avetisyan, M; van der Palen, J

    2013-11-30

    Misleading response behavior is expected in medical settings where incriminating behavior is negatively related to the recovery from a disease. In the present study, lung patients feel social and professional pressure concerning smoking and experience questions about smoking behavior as sensitive and tend to conceal embarrassing or threatening information. The randomized item-response survey method is expected to improve the accuracy of self-reports as individual item responses are masked and only randomized item responses are observed. We explored the validation of the randomized item-response technique in a unique experimental study. Therefore, we administered a new multi-item measure assessing smoking behavior by using a treatment-control design (randomized response (RR) or direct questioning). After the questionnaire, we administered a breath test by using a carbon monoxide (CO) monitor to determine the smoking status of the patient. We used the response data to measure the individual smoking behavior by using a mixture item-response model. It is shown that the detected smokers scored significantly higher in the RR condition compared with the directly questioned condition. We proposed a Bayesian latent variable framework to evaluate the diagnostic test accuracy of the questionnaire using the randomized-response technique, which is based on the posterior densities of the subject's smoking behavior scores together with the breath test measurements. For different diagnostic test thresholds, we obtained moderate posterior mean estimates of sensitivity and specificity by observing a limited number of discrete randomized item responses. Copyright © 2013 John Wiley & Sons, Ltd.

  13. Patient experience and satisfaction with inpatient service: development of short form survey instrument measuring the core aspect of inpatient experience.

    Directory of Open Access Journals (Sweden)

    Eliza L Y Wong

    Full Text Available Patient experience reflects quality of care from the patients' perspective; therefore, patients' experiences are important data in the evaluation of the quality of health services. The development of an abbreviated, reliable and valid instrument for measuring inpatients' experience would reflect the key aspect of inpatient care from patients' perspective as well as facilitate quality improvement by cultivating patient engagement and allow the trends in patient satisfaction and experience to be measured regularly. The study developed a short-form inpatient instrument and tested its ability to capture a core set of inpatients' experiences. The Hong Kong Inpatient Experience Questionnaire (HKIEQ) was established in 2010; it is an adaptation of the General Inpatient Questionnaire of the Care Quality Commission created by the Picker Institute in United Kingdom. This study used a consensus conference and a cross-sectional validation survey to create and validate a short-form of the Hong Kong Inpatient Experience Questionnaire (SF-HKIEQ). The short-form, the SF-HKIEQ, consisted of 18 items derived from the HKIEQ. The 18 items mainly covered relational aspects of care under four dimensions of the patient's journey: hospital staff, patient care and treatment, information on leaving the hospital, and overall impression. The SF-HKIEQ had a high degree of face validity, construct validity and internal reliability. The validated SF-HKIEQ reflects the relevant core aspects of inpatients' experience in a hospital setting. It provides a quick reference tool for quality improvement purposes and a platform that allows both healthcare staff and patients to monitor the quality of hospital care over time.

  14. LITERATURE SURVEY ON ISOTOPIC ABUNDANCE RATIO MEASUREMENTS - 2001-2005

    Energy Technology Data Exchange (ETDEWEB)

    HOLDEN, N.E.

    2005-08-13

    Along with my usual weekly review of the published literature for new nuclear data, I also search for new candidates for best measurements of isotopic abundances from a single source. Most of the published articles, that I previously had found in the Research Library at the Brookhaven Lab, have already been sent to the members of the Atomic Weights Commission, by either Michael Berglund or Thomas Walczyk. In the last few days, I checked the published literature for any other articles in the areas of natural variations in isotopic abundance ratios, measurements of isotopic abundance ratios on samples of extra-terrestrial material and isotopic abundance ratio measurements performed using ICPMS instruments. Hopefully this information will be of interest to members of the Commission, the sub-committee on isotopic abundance measurements (SIAM), members of the former sub-committee on natural isotopic fractionation (SNIF), the sub-committee on extra-terrestrial isotope ratios (SETIR), the RTCE Task Group and the Guidelines Task Group, who are dealing with ICPMS and TIMS comparisons. In the following report, I categorize the publications in one of four areas. Measurements performed using either positive or negative ions with Thermal Ionization Mass Spectrometer, TIMS, instruments; measurements performed on Inductively Coupled Plasma Mass Spectrometer, ICPMS, instruments; measurements of natural variations of the isotopic abundance ratios; and finally measurements on extra-terrestrial samples with instrumentation of either type. There is overlap in these areas. I selected out variations and ET results first and then categorized the rest of the papers by TIMS and ICPMS.

  15. Understanding CMMI Measurement Capabilities & Impact on Performance: Results from the 2007 SEI State of the Measurement Practice Survey

    Science.gov (United States)

    2016-06-30

    Practice Survey. New this year: a screening question to identify respondents whose organizations develop software but rarely if ever do measurement...Questions about: resources & infrastructure devoted to measurement; practices to ensure data quality & integrity; value added by doing measurement; the...University Project & Organizational Measurement Results Reported [chart residue: Business Growth & Profitability; groups ML1&DK, ML2, ML3, ML4&5; N = 70, N = 55, N = 45, N = 51]

  16. Implementation of the forced answering option within online surveys: Do higher item response rates come at the expense of participation and answer quality?

    Directory of Open Access Journals (Sweden)

    Décieux Jean Philippe

    2015-01-01

    Full Text Available Online surveys have become a popular method for data gathering for many reasons, including low costs and the ability to collect data rapidly. However, online data collection is often conducted without adequate attention to implementation details. One example is the frequent use of the forced answering option, which forces the respondent to answer each question in order to proceed through the questionnaire. The avoidance of missing data is often the idea behind the use of the forced answering option. However, we suggest that the costs of a reactance effect in terms of quality reduction and unit nonresponse may be high because respondents typically have plausible reasons for not answering questions. The objective of the study reported in this paper was to test the influence of forced answering on dropout rates and data quality. The results show that requiring participants to answer every question increases dropout rates and decreases quality of answers. Our findings suggest that the desire for a complete data set has to be balanced against the consequences of reduced data quality.

  17. Psychometric analysis of the Ten-Item Perceived Stress Scale.

    Science.gov (United States)

    Taylor, John M

    2015-03-01

    Although the 10-item Perceived Stress Scale (PSS-10) is a popular measure, a review of the literature reveals 3 significant gaps: (a) There is some debate as to whether a 1- or a 2-factor model best describes the relationships among the PSS-10 items, (b) little information is available on the performance of the items on the scale, and (c) it is unclear whether PSS-10 scores are subject to gender bias. These gaps were addressed in this study using a sample of 1,236 adults from the National Survey of Midlife Development in the United States II. Based on self-identification, participants were 56.31% female, 77% White, 17.31% Black and/or African American, and the average age was 54.48 years (SD = 11.69). Findings from an ordinal confirmatory factor analysis suggested the relationships among the items are best described by an oblique 2-factor model. Item analysis using the graded response model provided no evidence of item misfit and indicated both subscales have a wide estimation range. Although t tests revealed a significant difference between the means of males and females on the Perceived Helplessness Subscale (t = 4.001, df = 1234, p < .001), measurement invariance tests suggest that PSS-10 scores may not be substantially affected by gender bias. Overall, the findings suggest that inferences made using PSS-10 scores are valid. However, this study calls into question inferences where the multidimensionality of the PSS-10 is ignored. 2015 APA, all rights reserved

  18. Measuring redshift-space distortions with future SKA surveys

    CERN Document Server

    Raccanelli, Alvise; Camera, Stefano; Bacon, David; Blake, Chris; Dore, Olivier; Ferreira, Pedro; Maartens, Roy; Santos, Mario; Viel, Matteo; Zhao, Gong-bo

    2015-01-01

    The peculiar motion of galaxies can be a particularly sensitive probe of gravitational collapse. As such, it can be used to measure the dynamics of dark matter and dark energy as well as the nature of the gravitational laws at play on cosmological scales. Peculiar motions manifest themselves as an overall anisotropy in the measured clustering signal as a function of the angle to the line-of-sight, known as redshift-space distortion (RSD). Limiting factors in this measurement include our ability to model non-linear galaxy motions on small scales and the complexities of galaxy bias. The anisotropy in the measured clustering pattern in redshift-space is also driven by the unknown distance factors at the redshift in question, the Alcock-Paczynski distortion. This weakens growth rate measurements, but permits an extra geometric probe of the Hubble expansion rate. In this chapter we will briefly describe the scientific background to the RSD technique, and forecast the potential of the SKA phase 1 and the SKA2 to measu...

  19. GPS survey in long baseline neutrino-oscillation measurement

    CERN Document Server

    Noumi, H; Inagaki, T; Hasegawa, T; Katoh, Y; Kohama, M; Kurodai, M; Kusano, E; Maruyama, T; Minakawa, M; Nakamura, K; Nishikawa, K; Sakuda, M; Suzuki, Y; Takasaki, M; Tanaka, K H; Yamanoi, Y; 10.1109/TNS.2004.836042

    2004-01-01

    We made a series of surveys to obtain the neutrino beam line direction toward SuperKamiokande (SK) at a distance of 250 km for the long-baseline neutrino oscillation experiment at KEK. We found that the beam line is directed to SK within 0.03 mr and 0.09 mr (in sigma) in the horizontal and vertical directions, respectively. During beam operation, we monitored the muon distribution from secondary pions produced at the target and collected by a magnetic horn system. We found that the horn system functions like a lens of a point-to-parallel optics with magnification of approximately -100 and the focal length of 2.3 m. Namely, a small displacement of the primary beam position at the target is magnified by about a factor of -100 at the muon centroid, while the centroid position is almost stable against a change of the incident angle of the primary beam. Therefore, the muon centroid can be a useful monitor of the neutrino beam direction. We could determine the muon centroid within 6 mm and 12 mm in horizontal and vertical ...

  20. Measurement properties of a novel survey to assess stages of organizational readiness for evidence-based interventions in community chronic disease prevention settings

    Directory of Open Access Journals (Sweden)

    Stamatakis Katherine A

    2012-07-01

    Full Text Available Abstract Background There is a great deal of variation in the existing capacity of primary prevention programs and policies addressing chronic disease to deliver evidence-based interventions (EBIs). In order to develop and evaluate implementation strategies that are tailored to the appropriate level of capacity, there is a need for an easy-to-administer tool to stage organizational readiness for EBIs. Methods Based on theoretical frameworks, including Rogers’ Diffusion of Innovations, we developed a survey instrument to measure four domains representing stages of readiness for EBI: awareness, adoption, implementation, and maintenance. A separate scale representing organizational climate as a potential mediator of readiness for EBIs was also included in the survey. Twenty-three questions comprised the four domains, with four to nine items each, using a seven-point response scale. Representatives from obesity, asthma, diabetes, and tobacco prevention programs serving diverse populations in the United States were surveyed (N = 243); test-retest reliability was assessed with 92 respondents. Results Confirmatory factor analysis (CFA) was used to test and refine readiness scales. Test-retest reliability of the readiness scales, as measured by intraclass correlation, ranged from 0.47–0.71. CFA found good fit for the five-item adoption and implementation scales and resulted in revisions of the awareness and maintenance scales. The awareness scale was split into two two-item scales, representing community and agency awareness. The maintenance scale was split into five- and four-item scales, representing infrastructural maintenance and evaluation maintenance, respectively. Internal reliability of scales (Cronbach’s α) ranged from 0.66–0.78. The model for the final revised scales approached good fit, with most factor loadings >0.6 and all >0.4. Conclusions The lack of adequate measurement tools hinders progress in dissemination and implementation
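
    The internal reliability statistic reported in this abstract, Cronbach's α, can be computed from a respondents-by-items score matrix with the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The sketch below is a generic illustration with made-up data; the function name is hypothetical and this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same >= 2 items")
    # Variance of each item across respondents (column-wise).
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy two-item scale answered by three respondents (invented numbers).
    scores = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(scores), 3))
```

    With only two items per scale, as in the split awareness scales described above, α depends entirely on the correlation between the two items, so values in the 0.66–0.78 range are not unusual for short scales.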

  1. The development of a measure of correlates of child sexual abuse: the Traumatic Sexualization Survey.

    Science.gov (United States)

    Matorin, A I; Lynn, S J

    1998-04-01

    The present research developed an instrument which assesses cognitive and behavioral factors purportedly associated with child sexual abuse histories. Finkelhor and Browne's construct of traumatic sexualization served as a guide for item selection. The study resulted in a 38-item reliable measure consisting of four subscales: Avoidance and Fear of Sexual and Physical Intimacy, Thoughts About Sex, Role of Sex in Relationships, and Attraction/Interest and Sexuality. Construct validity was established using a variety of self-report instruments associated with the dimensions of traumatic sexualization. Sexually abused women scored higher than nonabused women on three TSS factors. Physically abused women differed from nonabused women on only one factor. Sexually abused women did not score significantly higher than physically abused women on any factors.

  2. Measuring neutrino masses with a future galaxy survey

    DEFF Research Database (Denmark)

    Hamann, Jan; Hannestad, Steen; Wong, Yvonne Y. Y.

    2012-01-01

    that the minimum mass sum of sum m_nu ~ 0.06 eV in the normal hierarchy can be detected at 1.5 sigma to 2.5 sigma significance, depending on the model complexity, using a combination of galaxy and cosmic shear power spectrum measurements in conjunction with CMB temperature and polarisation observations from Planck....... With better knowledge of the galaxy bias, the significance of the detection could potentially reach 5.4 sigma. Interestingly, neither Planck+shear nor Planck+galaxy alone can achieve this level of sensitivity; it is the combined effect of galaxy and cosmic shear power spectrum measurements that breaks...... the persistent degeneracies between the neutrino mass, the physical matter density, and the Hubble parameter. Notwithstanding this remarkable sensitivity to sum m_nu, Euclid-like shear and galaxy data will not be sensitive to the exact mass spectrum of the neutrino sector; no significant bias (sigma...

  3. A Survey of Advanced Microwave Frequency Measurement Techniques

    OpenAIRE

    Anand Swaroop Khare

    2012-01-01

    Microwaves are radio waves with wavelengths ranging from as long as one meter to as short as one millimeter, or equivalently, with frequencies between 300 MHz and 300 GHz. The science of photonics includes the generation, emission, modulation, signal processing, switching, transmission, amplification, detection and sensing of light. Microwave photonics has been introduced for achieving ultra broadband signal processing. Instantaneous Frequency Measurement (IFM) receivers play an important ro...
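    The wavelength and frequency limits quoted in this abstract are tied together by f = c / λ. The following is a minimal sketch of that conversion, not code from the surveyed paper; the function name is a hypothetical choice for illustration.

```python
# Illustrative sketch: convert a free-space wavelength to a frequency
# using f = c / wavelength. The function name is hypothetical.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact SI value of c


def wavelength_to_frequency_hz(wavelength_m: float) -> float:
    """Return the frequency in hertz for a wavelength given in metres."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m


if __name__ == "__main__":
    # One metre and one millimetre, the two limits named in the abstract.
    for wavelength in (1.0, 1e-3):
        print(wavelength, "m ->", wavelength_to_frequency_hz(wavelength), "Hz")
```

    With these inputs the results land very close to the rounded 300 MHz and 300 GHz band edges given in the abstract.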

  4. Teacher Collective Bargaining in Washington State: Assessing the Internal Validity of Partial Independence Item Response Measures of Contract Restrictiveness. CEDR Working Paper No. 2012 3.0

    Science.gov (United States)

    Goldhaber, Dan; Lavery, Lesley; Theobald, Roddy; D'Entremont, Dylan; Fang, Yangru

    2012-01-01

    Recent research (Strunk and Reardon forthcoming) applies Partial Independence Item Response (PIIR) models to teacher bargaining agreements in California to calculate the latent restrictiveness of these contracts. Further research (Strunk and Grissom 2010; Strunk forthcoming) tests the external validity of these estimates. Given that much research…

  5. Measuring impairments of functioning and health in patients with axial spondyloarthritis by using the ASAS Health Index and the Environmental Item Set

    DEFF Research Database (Denmark)

    Kiltz, U; van der Heijde, D; Boonen, A

    2016-01-01

    , were included. RESULTS: The ASAS HI and the EF Item Set were translated into Arabic, Chinese, Croatian, Dutch, French, German, Greek, Hungarian, Italian, Korean, Portuguese, Russian, Spanish, Thai and Turkish. Some difficulties were experienced with translation of the contextual factors indicating...

  6. The Effect of the Probability of Correct Response on the Variability of Measures of Differential Item Functioning. Program Statistics Research Technical Report No. 94-4.

    Science.gov (United States)

    Zwick, Rebecca

    The Mantel Haenszel (MH; 1959) approach of Holland and Thayer (1988) is a well-established method for assessing differential item functioning (DIF). The formula for the variance of the MH DIF statistic is based on work by Phillips and Holland (1987) and Robins, Breslow, and Greenland (1986). Recent simulation studies showed that the MH variances…
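    For readers unfamiliar with the statistic discussed here, the sketch below computes the Mantel-Haenszel common odds ratio across matched score levels and rescales it to the MH D-DIF metric (-2.35 times the natural log of the odds ratio). It is an illustration only: it does not reproduce the variance formula studied in the report, and the table layout and function name are assumptions made for this example.

```python
# Illustrative sketch of the Mantel-Haenszel DIF statistic.
# Each stratum is one matched ability (total score) level with counts
# (a, b, c, d): a = reference group correct, b = reference group incorrect,
# c = focal group correct, d = focal group incorrect.
# The layout and names are hypothetical, chosen for this example.
import math


def mantel_haenszel_dif(strata):
    """Return (common odds ratio, MH D-DIF) for a list of 2x2 tables."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if numerator <= 0 or denominator <= 0:
        raise ValueError("odds ratio is undefined for these tables")
    alpha_mh = numerator / denominator
    # Negative D-DIF values indicate an item that favours the reference group.
    return alpha_mh, -2.35 * math.log(alpha_mh)


if __name__ == "__main__":
    tables = [(30, 10, 30, 10), (20, 20, 20, 20)]  # made-up counts, no DIF
    print(mantel_haenszel_dif(tables))
```

    An odds ratio of 1 (D-DIF of 0) means the two groups perform equally on the item once matched on ability.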

  7. Improving the Reliability of Student Scores from Speeded Assessments: An Illustration of Conditional Item Response Theory Using a Computer-Administered Measure of Vocabulary

    Science.gov (United States)

    Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R.

    2015-01-01

    A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is…

  8. The validity of self-rated health as a measure of health status among young military personnel: evidence from a cross-sectional survey

    Directory of Open Access Journals (Sweden)

    Vander Weg Mark W

    2006-08-01

    Full Text Available Abstract Background Single item questions about self ratings of overall health status are widely used in both military and civilian surveys. Limited information is available to date that examines what relationships exist between self-rated health, health status and health related behaviors among relatively young, healthy individuals. Methods The current study uses the population of active duty United States Air Force recruits (N = 31,108). Participants completed surveys that asked about health behaviors and health states and rated their health on a continuum from poor to excellent. Results Ratings of health were consistently lower for those who used tobacco (F = 241.7, p Conclusion Given the consistent relationship between self-rated overall health and factors important to military health and fitness, self-rated health appears to be a valid measure of health status among young military troops.

  9. The ACS Virgo Cluster Survey IV: Data Reduction Procedures for Surface Brightness Fluctuation Measurements with the Advanced Camera for Surveys

    CERN Document Server

    Mei, Simona; Blakeslee, John P.; Tonry, John L.; Jordan, Andres; Peng, Eric W.; Côté, Patrick; Ferrarese, Laura; Merritt, David; Milosavljevic, Milos; West, Michael J.

    2005-01-01

    The Advanced Camera for Surveys (ACS) Virgo Cluster Survey is a large program to image 100 early-type Virgo galaxies using the F475W and F850LP bandpasses of the Wide Field Channel of the ACS instrument on the Hubble Space Telescope (HST). The scientific goals of this survey include an exploration of the three-dimensional structure of the Virgo Cluster and a critical examination of the usefulness of the globular cluster luminosity function as a distance indicator. Both of these issues require accurate distances for the full sample of 100 program galaxies. In this paper, we describe our data reduction procedures and examine the feasibility of accurate distance measurements using the method of surface brightness fluctuations (SBF) applied to the ACS Virgo Cluster Survey F850LP imaging. The ACS exhibits significant geometrical distortions due to its off-axis location in the HST focal plane; correcting for these distortions by resampling the pixel values onto an undistorted frame results in pixel correlations tha...

  10. Development and Validation of the Poverty Attributions Survey

    Science.gov (United States)

    Bennett, Robert M.; Raiz, Lisa; Davis, Tamara S.

    2016-01-01

    This article describes the process of developing and testing the Poverty Attribution Survey (PAS), a measure of poverty attributions. The PAS is theory based and includes original items as well as items from previously tested poverty attribution instruments. The PAS was electronically administered to a sample of state-licensed professional social…

  11. The Effects of Survey Timing on Student Evaluation of Teaching Measures Obtained Using Online Surveys

    Science.gov (United States)

    Estelami, Hooman

    2015-01-01

    Teaching evaluations are an important measurement tool used by business schools in gauging the level of student satisfaction with the educational services delivered by faculty. The growing use of online teaching evaluations has enabled educational administrators to expand the time period during which student evaluation of teaching (SET) surveys…

  12. RCPAQAP First Combined Measurement and Reference Interval Survey.

    Science.gov (United States)

    Jones, Graham Rd; Koetsier, Sabrina DA

    2014-11-01

    Reference intervals are commonly considered to allow for between-laboratory bias. The RCPAQAP Liquid Serum Chemistry Program has collected data on laboratory measurements as well as reference intervals. This allows assessment of the between-laboratory variation in results, reference intervals and the information transmitted by the combination of these factors. For the majority of common chemistry analytes, the between-laboratory variation in reference intervals is greater than the variation in results. Additionally the reference interval variation is generally not related to bias between the results. Use of common reference intervals, either as an average of the current intervals in use, or the intervals proposed by the AACB Harmonisation Group, improved the variation seen in the information produced by different laboratories.

  13. Determining the Measurement Quality of a Montessori High School Teacher Evaluation Survey

    Directory of Open Access Journals (Sweden)

    Anthony Philip Setari

    2017-05-01

    Full Text Available The purpose of this study was to conduct a psychometric validation of a course evaluation instrument, known as a student evaluation of teaching (SET, implemented in a Montessori high school. The authors demonstrate to the Montessori community how to rigorously examine the measurement and assessment quality of instruments used within Montessori schools. The Montessori high school community needs an SET that has been rigorously examined for measurement issues. The examined SET was developed by a Montessori high school, and the sample data were collected from Montessori high school students. Using a Rasch partial credit model, the results of the analysis identified several measurement issues, including multidimensionality, misfit items, and inappropriate item difficulty levels. A revised version of the SET underwent the same analysis procedure, and the results indicated that measurement issues persisted. The authors suggest several ways to improve the overall measurement quality of the instrument while keeping the Montessori foundation. Additional validation studies with a revised version of the SET will be needed before the instrument can be endorsed for full implementation in a Montessori setting.

  14. Measuring stigma among abortion providers: assessing the Abortion Provider Stigma Survey instrument.

    Science.gov (United States)

    Martin, Lisa A; Debbink, Michelle; Hassinger, Jane; Youatt, Emily; Eagen-Torkko, Meghan; Harris, Lisa H

    2014-01-01

    We explored the psychometric properties of 15 survey questions, referred to as the Abortion Provider Stigma Survey (APSS), that assessed abortion providers' perceptions of stigma and its impact on their professional and personal lives. We administered the survey to a sample of abortion providers recruited for the Providers' Share Workshop (N = 55). We then completed analyses using Stata SE/12.0. Exploratory factor analysis resulted in 13 retained items and identified three subscales: disclosure management, resistance and resilience, and discrimination. Stigma was salient in abortion providers' lives: they identified difficulties surrounding disclosure (66%) and felt unappreciated by society (89%). Simultaneously, workers felt they made a positive contribution to society (92%) and took pride in their work (98%). Paired t-test analyses of the pre- and post-Workshop APSS scores showed no changes in the total score. However, the Disclosure Management subscale scores were significantly lower (indicating decreased stigma) for two subgroups of participants: those over the age of 30 and those with children. This analysis is a promising first step in the development of a quantitative tool for capturing abortion providers' experiences of and responses to pervasive abortion stigma.

  15. Measuring teamwork in health care settings: a review of survey instruments.

    Science.gov (United States)

    Valentine, Melissa A; Nembhard, Ingrid M; Edmondson, Amy C

    2015-04-01

    Teamwork in health care settings is widely recognized as an important factor in providing high-quality patient care. However, the behaviors that comprise effective teamwork, the organizational factors that support teamwork, and the relationship between teamwork and patient outcomes remain empirical questions in need of rigorous study. To identify and review survey instruments used to assess dimensions of teamwork so as to facilitate high-quality research on this topic. We conducted a systematic review of articles published before September 2012 to identify survey instruments used to measure teamwork and to assess their conceptual content, psychometric validity, and relationships to outcomes of interest. We searched the ISI Web of Knowledge database, and identified relevant articles using the search terms team, teamwork, or collaboration in combination with survey, scale, measure, or questionnaire. We found 39 surveys that measured teamwork. Surveys assessed different dimensions of teamwork. The most commonly assessed dimensions were communication, coordination, and respect. Of the 39 surveys, 10 met all of the criteria for psychometric validity, and 14 showed significant relationships to nonself-report outcomes. Evidence of psychometric validity is lacking for many teamwork survey instruments. However, several psychometrically valid instruments are available. Researchers aiming to advance research on teamwork in health care should consider using or adapting one of these instruments before creating a new one. Because instruments vary considerably in the behavioral processes and emergent states of teamwork that they capture, researchers must carefully evaluate the conceptual consistency between instrument, research question, and context.

  16. Testing General Relativity with Growth rate measurement from Sloan Digital Sky Survey Baryon Oscillations Spectroscopic Survey galaxies

    CERN Document Server

    Alam, Shadab; Vargas-Magaña, Mariana; Schneider, Donald P

    2015-01-01

    The measured redshift ($z$) of an astronomical object is a combination of Hubble recession, gravitational redshift and peculiar velocity. In particular, the line of sight distance to a galaxy inferred from redshift is affected by the peculiar velocity component of galaxy redshift, which can also be observed as an anisotropy in the correlation function. This anisotropy allows us to measure the linear growth rate of matter ($f\sigma_8$). In this paper, we measure the linear growth rate of matter ($f\sigma_8$) at $z=0.57$ using the CMASS sample from Data Release 11 of Sloan Digital Sky Survey III (SDSS III) Baryon Oscillations Spectroscopic Survey (BOSS). The galaxy sample consists of 690,826 Luminous Red Galaxies (LRGs) in the redshift range 0.43 to 0.7 covering 8498 deg$^2$. Here we report the first measurement of $f\sigma_8$ and cosmology using Convolution Lagrangian Perturbation Theory (CLPT) with Gaussian streaming model (GSRSD). We arrive at a constraint of $f\sigma_8=0.462\pm0.041$ (9% accuracy) at effec...

  17. [Several common biases and control measures during sampling survey of eye diseases in China].

    Science.gov (United States)

    Guan, Huai-jin

    2008-06-01

    Bias is a common source of error in sampling surveys of eye diseases and a major factor affecting the validity and reliability of a survey. The causes and control measures of several biases in current sampling surveys of eye diseases in China were analyzed and discussed, including sampling bias, non-respondent bias, and diagnostic bias. This review emphasizes that controlling bias is the key to ensuring the quality of a sampling survey. Random sampling, sufficient sample size, careful examination and history taking, improving the examination rate, accurate diagnosis, strict training and preliminary study, as well as quality control can eliminate or minimize biases and improve the quality of sampling surveys of eye diseases in China.

  18. Survey of Anesthetic Items at Patients' Own Expense in Operation Departments of a Top Grade-three Hospital

    Institute of Scientific and Technical Information of China (English)

    曾俊群; 安玉蓉; 高彤

    2012-01-01

    Objective To study the proportion of anesthetic costs paid at patients' own expense relative to total self-paid costs, and the anesthetic items paid at patients' own expense, in the operation departments of a top grade-three hospital in Beijing. Methods A status survey of anesthetic items at patients' own expense was conducted among discharged patients with medical insurance in 15 operation departments of a top grade-three hospital in Beijing from March 1 to March 7, 2011. Results Anesthetic costs accounted for 12.04% of total self-paid costs in the operation departments. The five largest components of self-paid anesthetic item costs were Flurbiprofen Axetil Injection (43.20%), Remifentanil Injection (17.43%), Analgesic Pump (10.04%), Cisatracurium Besylate Injection (8.20%), and Ulinastatin Injection (8.14%). The self-paid ratio was positively correlated with self-paid anesthetic costs and negatively correlated with the cost composition of self-paid anesthetic items. Conclusion Only by resolving the reimbursement of Flurbiprofen Axetil Injection, Remifentanil Injection, Analgesic Pump, and similar items can self-paid anesthetic fees be fundamentally controlled.

  19. Differential item functioning (DIF): a study with analogies for measuring verbal reasoning

    Directory of Open Access Journals (Sweden)

    Wagner Bandeira Andriola

    2000-01-01

    Full Text Available This research aimed to determine the differential item functioning (DIF) of 30 analogies used to assess verbal reasoning, taking the sex variable into account. A sample of 730 high school students with a mean age of 17.74 years (SD = 3.12 years) was used. The majority came from public schools (58.5%) and were female (53.2%). The groups organized for the DIF study were composed of men (n = 342) and women (n = 388). The metric parameters of the items were estimated with the two-parameter logistic IRT model. DIF was assessed by comparing the metric parameters of the items. The results indicated the presence of five items with DIF.

  20. Can I just check...? Effects of edit check questions on measurement error and survey estimates

    NARCIS (Netherlands)

    Lugtig, Peter; Jäckle, Annette

    2014-01-01

    Household income is difficult to measure, since it requires the collection of information about all potential income sources for each member of a household. We assess the effects of two types of edit check questions on measurement error and survey estimates: within-wave edit checks use responses to

  1. Characteristics of physical measurement consent in a population-based survey of older adults.

    Science.gov (United States)

    Sakshaug, Joseph W; Couper, Mick P; Ofstedal, Mary Beth

    2010-01-01

    Collecting physical measurements in population-based health surveys has increased in recent years, yet little is known about the characteristics of those who consent to these measurements. To examine the characteristics of persons who consent to physical measurements across several domains, including one's demographic background, health status, resistance behavior toward the survey interview, and interviewer characteristics. We conducted a secondary data analysis of the 2006 Health and Retirement Study, a nationally-representative panel survey of older adults aged 51 and older. We performed multilevel logistic regressions on a sample of 7457 respondents who were eligible for physical measurements. The primary outcome measure was consent to all physical measurements. Seventy-nine percent (unweighted) of eligible respondents consented to all physical measurements. In weighted multilevel logistic regressions controlling for respondent demographics, current health status, survey resistance indicators, and interviewer characteristics, the propensity to consent was significantly greater among Hispanic respondents matched with bilingual Hispanic interviewers, patients with diabetes, and those who visited a doctor in the past 2 years. The propensity to consent was significantly lower among younger respondents, those who have several Nagi functional limitations and infrequently participate in "mildly vigorous" activities, and those interviewed by black interviewers. Survey resistance indicators, such as number of contact attempts and interviewer observations of resistant behavior in prior wave iterations of the Health and Retirement Study were also negatively associated with physical measurement consent. The propensity to consent was unrelated to prior medical diagnoses, including high blood pressure, cancer (excluding skin), lung disease, heart abnormalities, stroke, and arthritis, and matching of interviewer and respondent on race and gender. 
Physical measurement consent

  2. Development and testing of a survey instrument to measure benefits of a nursing information system.

    Science.gov (United States)

    Abdrbo, Amany A; Zauszniewski, Jaclene A; Hudak, Christine A; Anthony, Mary K

    2011-01-01

    Information systems (IS) benefits for nurses are outcomes related to the tangible products or improvements that nurses realize from using IS. This study examined the development and psychometric testing of a measure of nurses' benefits from IS. A random sample of 570 nurses working in hospitals, providing direct patient care, and using IS completed the study questionnaire. The internal consistency reliability of the results was .97. Exploratory factor analysis, using principal components extraction and varimax rotation, revealed items loaded on four factors (saving time and efficiency, quality of care, charting, and professional practice) that were confirmed by confirmatory factor analysis. Continued refinement of the instrument is needed with more diverse samples of nurses.
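    The abstract reports an internal consistency reliability without naming the coefficient. Assuming it is Cronbach's alpha (a common choice, but an assumption here), the sketch below shows how such a coefficient can be computed from a respondents-by-items score matrix. The function name and the sample data are hypothetical and are not taken from the study.

```python
# Illustrative sketch of Cronbach's alpha:
#   alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
# where k is the number of items. Names and data are hypothetical.
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all the same length."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up responses from four respondents to three items.
    responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
    print(round(cronbach_alpha(responses), 3))
```

    Values near 1 indicate that the items vary together closely, which is the sense in which a scale is called internally consistent.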

  3. Hot Big Planets Kepler Survey: Measuring the Repopulation Rate of the Shortest-Period Planets

    OpenAIRE

    Taylor, Stuart F.

    2013-01-01

    By surveying new fields for the shortest-period "big" planets, the Kepler spacecraft could provide the statistics to more clearly measure the occurrence distributions of giant and medium planets. This would allow separate determinations for giant and medium planets of the relationship between the inward rate of tidal migration of planets and the strength of the stellar tidal dissipation (as expressed by the tidal quality factor Q). We propose a "Hot Big Planets Survey" to find new big planets...

  4. Will kinematic Sunyaev-Zel'dovich measurements enhance the science return from galaxy redshift surveys?

    Science.gov (United States)

    Sugiyama, Naonori S.; Okumura, Teppei; Spergel, David N.

    2017-01-01

    Yes. Future CMB experiments such as Advanced ACTPol and CMB-S4 should achieve measurements with S/N of > 0.1 for the typical host halo of galaxies in redshift surveys. These measurements will provide complementary measurements of the growth rate of large scale structure f and the expansion rate of the Universe H to galaxy clustering measurements. This paper emphasizes that there is significant information in the anisotropy of the relative pairwise kSZ measurements. We expand the relative pairwise kSZ power spectrum in Legendre polynomials and consider up to its octopole. Assuming that the noise in the filtered maps is uncorrelated between the positions of galaxies in the survey, we derive a simple analytic form for the power spectrum covariance of the relative pairwise kSZ temperature in redshift space. While many previous studies have assumed optimistically that the optical depth of the galaxies τT in the survey is known, we marginalize over τT, to compute constraints on the growth rate f and the expansion rate H. For realistic survey parameters, we find that combining kSZ and galaxy redshift survey data reduces the marginalized 1-σ errors on H and f to ~50-70% compared to the galaxy-only analysis.

  5. An emotional functioning item bank of 24 items for computerized adaptive testing (CAT) was established

    DEFF Research Database (Denmark)

    Petersen, Morten Aa.; Gamper, Eva-Maria; Costantini, Anna

    2016-01-01

    OBJECTIVE: To improve measurement precision, the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is developing an item bank for computerized adaptive testing (CAT) of emotional functioning (EF). The item bank will be within the conceptual framework...... of the widely used EORTC Quality of Life questionnaire (QLQ-C30). STUDY DESIGN AND SETTING: On the basis of literature search and evaluations by international samples of experts and cancer patients, 38 candidate items were developed. The psychometric properties of the items were evaluated in a large...... international sample of cancer patients. This included evaluations of dimensionality, item response theory (IRT) model fit, differential item functioning (DIF), and of measurement precision/statistical power. RESULTS: Responses were obtained from 1,023 cancer patients from four countries. The evaluations showed...

  6. Clowning as a supportive measure in paediatrics - a survey of clowns, parents and nursing staff

    Science.gov (United States)

    2013-01-01

    Background Hospital clowns, also known as clown doctors, can help paediatric patients with the stress of a hospitalization and to circumvent the accompanying feelings of fear, helplessness and sadness, thus supporting the healing process. The objectives of the present study were to clarify the structural and procedural conditions of paediatric clowning in Germany and to document the evaluations of hospital clowns, parents and hospital staff. Methods A nationwide online survey of hospital clowns currently active in paediatric departments and an accompanying field evaluation in Hamburg hospitals with surveys of parents and hospital staff were conducted. In addition to items developed specifically for the study regarding general conditions, procedures, assessments of effects and attitudes, the Work Satisfaction Scale was used. The sample included n = 87 hospital clowns, 37 parents and 43 hospital staff members. Results The online survey showed that the hospital clowns are well-trained, motivated and generally satisfied with their work. By their own estimate, they primarily boost morale and promote imagination in the patients. However, hospital clowns also desire better interdisciplinary collaboration and financial security as well as more recognition of their work. The Hamburg field study confirmed the positive results of the clown survey. According to the data, a clown intervention boosts morale and reduces stress in the patients. Moreover, there are practically no side effects. Both parents and hospital staff stated that the patients as well as they themselves benefited from the intervention. Conclusions The results match those of previous studies and give a very positive picture of hospital clowning, so that its routine use and expansion thereof can be recommended. Furthermore, the intervention should be subject to the rules of evidence-based medicine like other medical treatments. PMID:24112744

  7. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models, IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
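    As a quick illustration of the one-, two- and three-parameter logistic models named above, the sketch below evaluates the standard item response function P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))). It is not code from the paper; the function name and the default values are assumptions, with a as discrimination, b as difficulty and c as the guessing (lower asymptote) parameter.

```python
# Illustrative sketch of the logistic IRT item response function.
# With c = 0 it reduces to the two-parameter model; with c = 0 and a = 1
# it reduces to the one-parameter model. Names and defaults are hypothetical.
import math


def irt_probability(theta: float, b: float, a: float = 1.0, c: float = 0.0) -> float:
    """Probability of a correct response for ability theta.

    b: item difficulty, a: item discrimination, c: guessing parameter.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must lie in [0, 1)")
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    theta = 0.5  # made-up ability value
    print("1PL:", irt_probability(theta, b=0.0))
    print("2PL:", irt_probability(theta, b=0.0, a=1.7))
    print("3PL:", irt_probability(theta, b=0.0, a=1.7, c=0.2))
```

    When ability equals difficulty, the one- and two-parameter models give a probability of 0.5, and the three-parameter model gives c + (1 - c) / 2.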

  8. National Hospice Item Set (HIS) data

    Data.gov (United States)

    U.S. Department of Health & Human Services — This data set includes the national averages (mean) for quality measure scores of Medicare-certified hospice agencies calculated from the Hospice Item Set (HIS) for...

  9. Confirmatory Factor Analysis of the M5-50: An Implementation of the International Personality Item Pool Item Set

    Science.gov (United States)

    Socha, Alan; Cooper, Christopher A.; McCord, David M.

    2010-01-01

    Goldberg's International Personality Item Pool (IPIP; Goldberg, 1999) provides researchers with public-domain, free-access personality measurement scales that are proxies of well-established published scales. One of the more commonly used IPIP sets employs 50 items to measure the 5 broad domains of the 5-factor model, with 10 items per factor. The…

  10. Measuring Ocean Literacy: What teens understand about the ocean using the Survey of Ocean Literacy and Engagement (SOLE)

    Science.gov (United States)

    Greely, T. M.; Lodge, A.

    2009-12-01

    Ocean issues with conceptual ties to science and global society have captured the attention, imagination, and concern of an international audience. Climate change, overfishing, marine pollution, freshwater shortages and alternative energy sources are a few ocean issues highlighted in our media and casual conversations. The ocean plays a role in our lives in some way every day; however, a disconnect exists between what scientists know and what the public understands about the ocean, as revealed by numerous ocean and coastal literacy surveys. While the public exhibits emotive responses through care, concern and connection with the ocean, there remains a critical need for a baseline of ocean knowledge. However, knowledge about the ocean must be balanced with understanding about how to apply ocean information to daily decisions and actions. The present study analyzed underlying factors and patterns contributing to ocean literacy and reasoning within the context of an ocean education program, the Oceanography Camp for Girls (OCG). The OCG is designed to advance ocean conceptual understanding and decision making by engagement in a series of experiential learning and stewardship activities from authentic research settings in the field and lab. The present study measured a) what understanding teens currently hold about the ocean (content), b) how teens feel toward the ocean environment (environmental attitudes and morality), and c) how understanding and feelings are organized when reasoning about ocean socioscientific issues (e.g. climate change, overfishing, energy). The Survey of Ocean Literacy and Engagement (SOLE) was used to measure teens' understanding about the ocean. SOLE is a 57-item survey instrument aligned with the Essential Principles and Fundamental Concepts of Ocean Literacy (NGS, 2007). Rasch analysis was used to refine and validate SOLE as a reasonable measure of ocean content knowledge (reliability, 0.91). Results revealed that content knowledge and environmental

  11. Calorimetry of low mass Pu239 items

    Energy Technology Data Exchange (ETDEWEB)

    Cremers, Teresa L [Los Alamos National Laboratory]; Sampson, Thomas E [Los Alamos National Laboratory]

    2010-01-01

    Calorimetric assay has the reputation of providing the highest precision and accuracy of all nondestructive assay measurements. Unfortunately, non-destructive assay practitioners and measurement consumers often extend, inappropriately, the high precision and accuracy of calorimetric assay to very low mass items. One purpose of this document is to present more realistic expectations for the random uncertainties associated with calorimetric assay for weapons grade plutonium items with masses of 200 grams or less.

  12. Item response theory

    Directory of Open Access Journals (Sweden)

    Eutalia Aparecida Candido de Araujo

    2009-12-01

    The concern with measuring psychological traits is an old one, and many studies and proposed methods have been developed toward this goal. Prominent among them is Item Response Theory (IRT), which in principle arose to address limitations of Classical Test Theory, still widely used today in the measurement of psychological traits. The main point of IRT is that it considers each item individually rather than relying on total scores; therefore, the findings depend not only on the test or questionnaire but also on each item that composes it. This article sets out to present this theory, which revolutionized measurement theory.
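
    To make the item-level view concrete, here is a minimal sketch of the two-parameter logistic (2PL) item response function, a standard IRT model. The abstract does not say which model the article presents, so the choice of 2PL and the names `irt_2pl`, `rasch`, `theta`, `a`, and `b` are assumptions made for illustration only.

```python
import math


def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability that a person with ability `theta` answers an item correctly.

    a: item discrimination, b: item difficulty (both hypothetical values here).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def rasch(theta: float, b: float) -> float:
    """Rasch model: the special case with discrimination fixed at 1."""
    return irt_2pl(theta, 1.0, b)


# Two hypothetical items answered by the same person (theta = 0.5).
easy_item = rasch(0.5, b=-1.0)  # difficulty below the person's ability
hard_item = rasch(0.5, b=2.0)   # difficulty above the person's ability
```

    Because each item carries its own parameters, the same person gets a different success probability on each item, which is the sense in which conclusions depend on the individual items and not only on the total score.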

  13. Item Parameter Estimation for Multidimensional Measurement: Comparisons of SEM and MIRT Based Methods

    Institute of Scientific and Technical Information of China (English)

    刘红云; 骆方; 王玥; 张玉

    2012-01-01

    ) estimates converted to SEM parameters, the WLSMV, MLR, and MCMC results are strikingly similar. However, with a small sample size and a long test, weighted least squares for categorical data (WLSc) did not yield converged parameter estimates; with a short test, WLSc estimates were obtained, but they were consistently more discrepant than those produced by the other estimation techniques. (2) The precision of the estimators improves as the sample size increases; the differences between WLSMV and MLR are trivial, and the precision of the WLSMV and MLR methods is no worse than that of the MCMC method in most conditions. (3) The precision of the item factor loading and item difficulty parameters is influenced by test length, and the precision of the item discrimination and item difficulty parameters is influenced by the number of test dimensions. (4) The precision of the estimators decreases as the number of dimensions measured by the item increases, especially for the item discrimination and item factor loading parameters. Both SEM and IRT can be used for factor analysis of dichotomous item responses. In this case, the measurement models of both approaches are formally equivalent. They were refined within and across different disciplines, and make complementary contributions to central measurement problems encountered in almost all empirical social science research fields. The authors conclude with considerations for categorical item factor analysis and give some advice for applied researchers.

  14. Measuring Model-Based High School Science Instruction: Development and Application of a Student Survey

    Science.gov (United States)

    Fulmer, Gavin W.; Liang, Ling L.

    2013-02-01

    This study tested a student survey to detect differences in instruction between teachers in a modeling-based science program and comparison group teachers. The Instructional Activities Survey measured teachers' frequency of modeling, inquiry, and lecture instruction. Factor analysis and Rasch modeling identified three subscales, Modeling and Reflecting, Communicating and Relating, and Investigative Inquiry. As predicted, treatment group teachers engaged in modeling and inquiry instruction more than comparison teachers, with effect sizes between 0.55 and 1.25. This study demonstrates the utility of student report data in measuring teachers' classroom practices and in evaluating outcomes of a professional development program.

  15. Indoor Environment and Energy Use in Historic Buildings - Comparing Survey Results with Measurements and Simulations

    DEFF Research Database (Denmark)

    Rohdin, P.; Dalewski, M.; Moshfegh, B.

    2012-01-01

    Increasing demand for energy efficiency places new requirements on energy use in historic buildings. Efficient energy use is essential if a historic building is to be used and preserved, especially buildings with conventional uses such as residential buildings and offices. This paper presents results which combine energy auditing with building energy simulation and an indoor environment survey among the occupants of the building. Good agreement was found when comparing simulations both with measurements and with survey results. The two efficiency measures that are predicted to increase energy and thermal performance the most for this group of buildings were reduced infiltration and increased heat-exchanger efficiency.

  16. The Measurement Invariance of the Student Opinion Survey across English and non-English Language Learner Students within the Context of Low- and High-Stakes Assessments

    Directory of Open Access Journals (Sweden)

    Jason C. Immekus

    2016-09-01

    Student effort on large-scale assessments has important implications for the interpretation and use of scores to guide decisions. Within the United States, English Language Learners (ELLs) are generally outperformed on large-scale assessments by non-ELLs, prompting research to examine factors associated with test performance. There is a gap in the literature regarding the test-taking motivation of ELLs compared to non-ELLs and whether existing measures have similar psychometric properties across groups. The Student Opinion Survey (SOS; Sundre, 2007) was designed to be administered after completion of a large-scale assessment to operationalize students' test-taking motivation. Based on data obtained from 5,257 (41.8% ELL) 10th grade students, the purpose of the study was to test the measurement invariance of the SOS across ELLs and non-ELLs based on completion of low- and high-stakes assessments. Preliminary item analyses supported the removal of two SOS items (Items 3 and 7), which resulted in improved internal consistency for each of the two SOS subscales: Importance and Effort. A subsequent multi-sample confirmatory factor analysis (MCFA) supported the measurement invariance of the scale's two-factor model across language groups, indicating it met strict factorial invariance (Meredith 1993). A follow-up latent means analysis found that ELLs had higher effort on both the low- and high-stakes assessments, with a small effect size. Effect size estimates indicated negligible differences on the importance factor. Although the instrument can be expected to function similarly across diverse language groups, which may have direct utility for test users and for research into factors associated with large-scale test performance, continued research is recommended. Implications for SOS use in applied and research settings are discussed.

  17. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  18. Assessing the Psychometric Properties of Alternative Items for Certification.

    Science.gov (United States)

    Krogh, Mary Anne; Muckle, Timothy

    Alternative items were added as scored items to the National Certification Examination for Nurse Anesthetists (NCE) in 2010. A common concern related to the new items has been their measurement attributes. This study was undertaken to evaluate the psychometric impact of adding these items to the examination. Candidates had a significantly higher ability estimate in alternative items than in multiple choice questions and 6.7 percent of test candidates performed significantly differently in alternative item formats. The ability estimates of multiple choice questions correlated at r = .58. The alternative items took significantly longer time to answer than standard multiple choice questions and discriminated to a higher degree than MCQs. The alternative items exhibited unidimensionality to the same degree as MCQs and the BIC confirmed the Rasch model as acceptable for scoring. The new item types were found to have acceptable attributes for inclusion in the certification program.

  19. Standardization of physical measurements in European health examination surveys-experiences from the site visits.

    Science.gov (United States)

    Tolonen, Hanna; Mäki-Opas, Johanna; Mindell, Jennifer S; Trichopoulou, Antonia; Naska, Androniki; Männistö, Satu; Giampaoli, Simona; Kuulasmaa, Kari; Koponen, Päivikki

    2017-01-23

    Health examination surveys (HESs) provide valuable data on health and its determinants at the population level. Comparison of HES results within and between countries and over time requires measurements which are free of bias due to differences in or adherence to measurement procedures and/or measurement devices. In the European HES (EHES) Pilot Project, 12 countries conducted a pilot HES in 2010-11 using standardized measurement protocols and centralized training. External evaluation visits (site visits) were performed by the EHES Reference Centre staff to evaluate the success of standardization and quality of data collection. In general, standardized EHES protocols were followed adequately in all the pilot surveys. Small deviations were observed in the posture of participants during the blood pressure and height measurement; in the use of a tourniquet when drawing blood samples; and in the calibration of measurement devices. Occasionally, problems with disturbing noise from outside or people coming into the room during the measurements were observed. In countries with an ongoing national HES or a long tradition of conducting national HESs at regular intervals, it was more difficult to modify national protocols to fulfil EHES requirements. The EHES protocols to standardize HES measurements and procedures for collection of blood samples are feasible in cross-country settings. The prerequisite for successful standardization is adequate training. External and internal evaluation activities during the survey fieldwork are also needed to monitor compliance to standards. © The Author 2017. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.

  20. Psychometric Changes on Item Difficulty Due to Item Review by Examinees

    Directory of Open Access Journals (Sweden)

    Elena C. Papanastasiou

    2015-01-01

    If good measurement depends in part on the estimation of accurate item characteristics, it is essential that test developers become aware of discrepancies that may exist in the item parameters before and after item review. The purpose of this study was to examine the answer-changing patterns of students while taking paper-and-pencil multiple choice exams, and to examine how these changes affect the estimation of item difficulty parameters. The results of this study have shown that item review by examinees does produce some changes to the examinee ability estimates and to the item difficulty parameters. In addition, these effects are more pronounced in shorter tests than in longer tests. In turn, these small changes produce larger effects when estimating the changes in the information values of each student's test score.

  1. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect

    DEFF Research Database (Denmark)

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-01-01

    AIMS: To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). METHODS: We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a one-year register-based follow-up for long-term sickness absence. DIF was evaluated against age, gender, education, social class, public/private sector employment, and job type using ordinal logistic regression. DIE was evaluated against job satisfaction and self-rated health (using ordinal logistic... shortform measures and to improve the conceptual framework, items and scales of the COPSOQ II. CONCLUSIONS: We conclude that tests of DIF and DIE are useful for evaluating construct validity.

  2. Development of a brief survey to measure nursing home residents' perceptions of pain management.

    Science.gov (United States)

    Teno, Joan M; Dosa, David; Rochon, Therese; Casey, Virginia; Mor, Vincent

    2008-12-01

    Persistent severe pain in nursing home residents remains an important public health problem. One major key to quality improvement efforts is the development of tools to assist in auditing and monitoring the quality of health care delivery to these patients. A qualitative synthesis of existing pain guidelines, and input from focus groups and an expert panel, were used to develop a 10-item instrument, the Resident Assessment of Pain Management (RAPM). The psychometric properties of the RAPM were examined in a sample of 107 (82% female, average age 85) cognitively intact nursing home residents living in six Rhode Island nursing homes. Reliability and internal consistency were evaluated with test-retest and Cronbach's alpha, respectively, and validity was examined against independent assessment of pain management by research nurses. After comparing the results of RAPM with the independent pain assessment and examining a frequency distribution and factor analysis, five of the 10 items were retained. Internal reliability of the final instrument was 0.55. The rate of reported concerns ranged from 8% stating that they were not receiving enough pain medication to 43% stating that pain interfered with their sleep. The median pain problem score (i.e., the count of the number of opportunities to improve) was 1, with 23% of residents reporting three or more concerns. Overall, RAPM was moderately correlated (Spearman correlation coefficient r=0.43) with an independent expert nurse assessment of the quality of pain management. Evidence of construct validity for RAPM is based on the correlation of the pain problem score with nursing home resident satisfaction with pain management (r=0.26), reported average pain intensity (r=0.41), research nurse completion of the Minimum Data Set pain items (r=0.52), and the quality of pain documentation in the medical record (r=0.28). In conclusion, RAPM is a brief survey tool easily administered to nursing home residents that identifies
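
    The abstract above reports internal consistency using Cronbach's alpha. As a minimal sketch, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the statistic can be computed as follows. The function name `cronbach_alpha` and the example responses are invented for illustration and are not the RAPM data.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores[r][i] is respondent r's score on item i.
    """
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items scored 0 or 1.
example = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
]
alpha = cronbach_alpha(example)
```

    Alpha rises when items vary together, which is why dropping weakly related items changes the coefficient.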

  3. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate). Massive objects require a longer procedure and will therefore take longer.

  4. BUSINESS SURVEY LIQUIDITY MEASURE AS A LEADING INDICATOR OF CROATIAN INDUSTRIAL PRODUCTION

    Directory of Open Access Journals (Sweden)

    Mirjana Čižmešija

    2012-12-01

    The business survey liquidity measure is one of the modifications of the uniform EU business survey methodology applied in Croatia. Liquidity problems have been, since socialist times, one of the major problems for Croatian business. The problem increased rapidly between 1995 and 2000, and it now again represents the main difficulty for the Croatian economy. In order to improve the forecasting properties of the business survey liquidity measure, some econometric models were applied. Based on the regression analysis, we concluded that changes in the liquidity variable can predict the direction of changes in industrial production with a one-quarter lead. The results also show that liquidity can be a proxy for the Industrial Confidence Indicator in the observed period. The empirical analysis was performed using quarterly data covering the period from the first quarter of 2005 to the fourth quarter of 2011. The data sources were Privredni vjesnik (a business magazine in Croatia) and the Croatian Bureau of Statistics.

  5. Subjectivity of LiDAR-Based Offset Measurements: Results from a Public Online Survey

    Science.gov (United States)

    Salisbury, J. B.; Arrowsmith, R.; Rockwell, T. K.; Haddad, D. E.; Zielke, O.; Madden, C.

    2012-12-01

    Geomorphic features (e.g., stream channels) that are offset in an earthquake can be measured to determine slip at that location. Analysis of these and other offset features can provide useful information for generating fault slip distributions. Remote analyses of active fault zones using high-resolution LiDAR data have recently been pursued in several studies, but there is a lack of consistency between users both for data analysis and results reporting. Individual investigators typically make offset measurements in a particular study area with their own protocols for measurement, assessing uncertainty, and quality rating, yet there is no coherent understanding of the reliability and repeatability of the measurements from observer to observer. We invited the participation of colleagues, interested geoscience communities, and the general public to measure ten geomorphic offsets from active faults in western North America using remote measurement methods that span a range of complexity (e.g., paper image and scale, the Google Earth ruler tool, and a MATLAB GUI for calculating backslip required to properly restore tectonic deformation) to explore the subjectivity involved with measuring geomorphic offsets. We provided a semi-quantitative quality-rating rubric for a description of offset quality, but there was a general lack of quality rating/offset uncertainty reporting. Survey responses (including mapped fault traces and piercing lines) were anonymously submitted along with user experience information. We received 11 paper-, 28 Google Earth-, and 16 MATLAB-based survey responses, though not all individuals measured every feature provided. For all survey methods, the majority of responses are in close agreement. However, large discrepancies arise where users interpret landforms differently, specifically the pre-earthquake morphologies and total offset accumulation of geomorphic features. Experienced users make more consistent measurements, whereas beginners less

  6. Comparison on Computed Tomography using industrial items

    DEFF Research Database (Denmark)

    Angel, Jais Andreas Breusch; De Chiffre, Leonardo

    2014-01-01

    In a comparison involving 27 laboratories from 8 countries, measurements on two common industrial items, a polymer part and a metal part, were carried out using X-ray Computed Tomography. All items were measured using coordinate measuring machines before and after circulation, with reference measurement uncertainties in the range 1.5–5.5 μm, showing a good stability over the 6 months of the circulation. The comparison has shown that CT measurements on the industrial parts used lie in the range 6–53 μm, with maximum values up to 158 μm.

  7. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…
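
    As a rough illustration of storing Angoff ratings with banked items, the sketch below assumes the usual Angoff convention: each judge estimates the probability that a borderline examinee answers an item correctly, the item's rating is the mean of those estimates, and the cut score of a test form is the sum of its items' ratings. The bank layout, item identifiers, ratings, and function names are hypothetical and are not taken from the article.

```python
from statistics import mean


def item_rating(judge_estimates: list[float]) -> float:
    """Angoff rating for one item: mean of the judges' probability estimates."""
    return mean(judge_estimates)


def cut_score(bank: dict[str, float], form: list[str]) -> float:
    """Cut score for a form assembled from the bank: sum of stored item ratings."""
    return sum(bank[item_id] for item_id in form)


# Hypothetical bank: each item is stored together with its Angoff rating.
bank = {
    "Q1": item_rating([0.5, 0.7]),
    "Q2": item_rating([0.8, 0.6]),
    "Q3": item_rating([0.3, 0.5]),
}

# Two schools assemble different forms from the same bank.
form_a_cut = cut_score(bank, ["Q1", "Q2"])
form_b_cut = cut_score(bank, ["Q2", "Q3"])
```

    Because the ratings travel with the items, forms built from different subsets of the bank each get their own cut score on a common standard, without fitting an IRT model.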

  9. Policy interventions related to medicines: Survey of measures taken in European countries during 2010-2015.

    Science.gov (United States)

    Vogler, Sabine; Zimmermann, Nina; de Joncheere, Kees

    2016-12-01

    Policy-makers can use a menu of pharmaceutical policy options. This study aimed to survey the measures that were implemented in European countries between 2010 and 2015. We conducted bi-annual surveys of competent authorities in the Pharmaceutical Pricing and Reimbursement Information network. Additionally, we consulted posters produced by members of this network as well as further published literature. Information on 32 European countries (all European Union Member States excluding Luxembourg; Iceland, Norway, Serbia, Switzerland, Turkey) was included. 557 measures were reported between January 2010 and December 2015. The most frequently mentioned measures were price reductions and price freezes, followed by changes in patient co-payments, modifications related to the reimbursement lists, and changes in distribution remuneration. Most policy measures were identified in Portugal, Greece, Belgium, France, the Czech Republic, Iceland, Spain and Germany. 22% of the measures surveyed could be classified as austerity. Countries that were strongly hit by the financial crisis implemented the most policy changes, usually aiming to generate savings and shortly after the emergence of the crisis. Improvements in the economic situation tended to lead to an easing of austerity measures. Countries also implemented policies that aimed to enhance enforcement of existing measures and increase efficiency. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  10. Consumer satisfaction and item response theory: creating a measurement scale (Assessing the satisfaction level of students at a higher education institution: an application of item response theory)

    Directory of Open Access Journals (Sweden)

    Silvana Ligia Vincenzi Bortolotti

    2012-01-01

    Today, people increasingly demand more from the state and enterprises. Consumer satisfaction is not an organizational option, but rather a matter of survival for any institution. The quest for measurement of consumer satisfaction has been ongoing in many areas of research, and researchers have concentrated efforts to demonstrate the psychometric quality of their measurements. However, the techniques employed in these efforts have not kept pace with the advances in psychometric theory and methods. Item Response Theory (IRT) is an approach used for assessing latent traits. It is commonly used in educational and psychological tests and provides additional information beyond that obtained from classic psychometric techniques. This article presents an application of a cumulative item response theory model to measure the extent of students' satisfaction with their courses by creating a measurement scale. The Graded Response Model was used. The results demonstrate the effectiveness of this theory in measuring satisfaction since it places both items and individuals on the same scale. This theory may be valuable in the evaluation of customer satisfaction and many other organizational phenomena. The findings may help the decision maker of an enterprise with the correction of flows, processes, and procedures, and, consequently, it may help generate increased efficiency and effectiveness in daily tasks and in event management business. Finally, the information obtained from the analysis can play a role in the development and/or evaluation of institutional planning. The subject of this work is the use of Item Response Theory (IRT) as a tool for evaluating specific organizational aspects. The objective is to apply a cumulative IRT model to create a measure of students' satisfaction with their courses, also evaluating satisfaction with teaching and creating a measurement scale. Widely used in the educational and psychol

  11. SURVEY

    DEFF Research Database (Denmark)

    The survey is a widely used method, employed in fields including the social sciences, the humanities, psychology, and health research. Outside the research world, too, many organizations work with surveys, for example consulting firms, public institutions, and the marketing departments of private companies. This book covers every phase of survey work and gives a practical introduction to: • designing the study and selecting samples, • formulating questionnaires and collecting and coding data, • methods for analyzing the results...

  12. Measuring cosmic velocities with 21cm intensity mapping and galaxy redshift survey cross-correlation dipoles

    CERN Document Server

    Hall, Alex

    2016-01-01

    We investigate the feasibility of measuring the effects of peculiar velocities in large-scale structure using the dipole of the redshift-space cross-correlation function. We combine number counts of galaxies with brightness-temperature fluctuations from 21cm intensity mapping, demonstrating that the dipole may be measured at modest significance ($\lesssim 2\sigma$) by combining the upcoming radio survey CHIME with the future redshift surveys of DESI and Euclid. More significant measurements ($\lesssim 10\sigma$) will be possible by combining intensity maps from the SKA with those of DESI or Euclid, and an even higher significance measurement ($\lesssim 100\sigma$) may be made by combining observables completely internally to the SKA. We account for effects such as contamination by wide-angle terms, interferometer noise and beams in the intensity maps, non-linear enhancements to the power spectrum, stacking multiple populations, sensitivity to the magnification slope, and the possibility that number counts and...

  13. Car-borne survey measurements with a 3x3" NaI detector

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, E.; Ugletveit, F.; Floe, L.; Mikkelborg, O. [Norwegian Radiation Protection Authority, Oesteraas (Norway)

    1997-12-31

    The Norwegian Radiation Protection Authority (NRPA) took part in the international survey measurement exercise RESUME95 that was arranged in Finland in August 1995. NRPA performed measurements with a simple car-borne measuring system based on standard equipment: a 3x3" NaI detector, an MCA, and a GPS connected to a portable PC. The results show substantial variations in dose rate inside areas of a few square kilometres. Spectrum analysis shows that a major part of these differences is caused by variations in deposition of $^{137}$Cs. Our results show that even standard 3x3" NaI detectors can be used for car-based survey measurements in fallout situations and in searches for sources. The detection limits are higher than for larger detectors, but the main limiting factor seems to be the timing capabilities of the acquisition system.

  14. The Development of an Emotional Response to Writing Measure: The Affective Cognition Writing Survey

    Science.gov (United States)

    Fischer, Ronald G.; Fischer, Jerome M.; Jain, Sachin

    2010-01-01

    This study was designed to develop and initiate the validation of the Affective Cognition Writing Survey (ACWS), a psychological instrument used to measure emotional expression through writing. Procedures for development and validation of the instrument are reported. Subsequently, factor analysis extracted six factors: Positive Processing,…

  15. American Healthy Homes Survey: A National Study of Residential Phthalates Measured from Floor Wipes

    Science.gov (United States)

    The United States Environmental Protection Agency (U.S. EPA), in collaboration with the U.S. Department of Housing and Urban Development (HUD), conducted a survey measuring phthalates in randomly selected residential homes throughout the U.S. Multistage sampling with clustering w...

  16. Proposing a survey instrument for measuring operational, formal, information and strategic Internet skills

    NARCIS (Netherlands)

    Deursen, van A.J.A.M.; Dijk, van J.A.G.M.; Peters, O.

    2012-01-01

    Observational studies prove to be very suitable to provide a realistic view of people's Internet skills. However, their cost and time are a strong limitation for large-scale data gathering. A useful addition to the measurement of Internet skills would be the development of survey questions for measu

  17. Measurement error in earnings data : Using a mixture model approach to combine survey and register data

    NARCIS (Netherlands)

    Meijer, E.; Rohwedder, S.; Wansbeek, T.J.

    2012-01-01

    Survey data on earnings tend to contain measurement error. Administrative data are superior in principle, but are worthless in case of a mismatch. We develop methods for prediction in mixture factor analysis models that combine both data sources to arrive at a single earnings figure. We apply the me

  19. The Sloan Lens ACS Survey. VII. Elliptical galaxy scaling laws from direct observational mass measurements

    NARCIS (Netherlands)

    Bolton, Adam S.; Treu, Tommaso; Koopmans, Leon V. E.; Gavazzi, Raphael; Moustakas, Leonidas A.; Burles, Scott; Schlegel, David J.; Wayth, Randall

    2008-01-01

    We use a sample of 53 massive early-type strong gravitational lens galaxies with well-measured redshifts (ranging from z = 0.06 to 0.36) and stellar velocity dispersions (between 175 and 400 km s(-1)) from the Sloan Lens ACS (SLACS) Survey to derive numerous empirical scaling relations. The ratio be

  20. Development and testing of the Survey of Family Environment (SFE): a novel instrument to measure family functioning and needs for family support.

    Science.gov (United States)

    Hohashi, Naohiro; Honda, Junko

    2012-01-01

    Hohashi's Concentric Sphere Family Environment Model (CSFEM; Hohashi & Honda, 2011) is a newly proposed family nursing theory for holistically understanding the family environment that acts on family well-being. The purpose of this article is to develop and psychometrically test the Japanese version of the Survey of Family Environment (SFE-J), grounded in the CSFEM, for measuring family's perceived family functioning and family's perceived needs for family support. The SFE-J is a 30-item self-administered instrument that assesses five domains (suprasystem, macrosystem, microsystem, family internal environment system, and chronosystem) and has been subjected to rigorous reliability and validity investigations among paired partners in child-rearing families (N of family = 1,990). Internal consistency reliability was high as measured by Cronbach's alpha coefficients. Temporal stability over a 2-week interval was supported by high (substantial or perfect) and significant intraclass correlation coefficients. The total score for the SFE-J was significantly correlated with the Japanese version of the Feetham Family Functioning Survey (FFFS-J), indicating an acceptable concurrent validity. Construct validity was supported by a confirmatory factor analysis that evaluated the five-factor structure to measure the concept of CSFEM. Results also demonstrate that the SFE-J family functioning scores show no significant differences between paired partners. The SFE-J is a reliable and valid instrument to assess not only intrafamily functioning but also interfamily functioning and, by identifying items/domains with high requirements for family support, serves to facilitate the providing of appropriate support to families.
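
    The abstract above reports internal consistency through Cronbach's alpha without showing the computation. As a minimal sketch, assuming complete responses and the standard formula (item variances against total-score variance), it can be computed as follows. The function name and the toy data are illustrative assumptions, not taken from the SFE-J study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
toy_scores = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(toy_scores), 3))
```

    Values close to 1 indicate that the items vary together; the sketch does not handle missing responses or a zero total-score variance.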

  1. The Development of Practical Item Analysis Program for Indonesian Teachers

    Science.gov (United States)

    Muhson, Ali; Lestari, Barkah; Supriyanto; Baroroh, Kiromim

    2017-01-01

    Item analysis plays an essential role in learning assessment. The item analysis program is designed to measure student achievement and instructional effectiveness. This study aimed to develop an item-analysis program and verify its feasibility. This study uses a Research and Development (R & D) model. The procedure includes designing and…

  2. Influence of Item Direction on Student Responses in Attitude Assessment.

    Science.gov (United States)

    Campbell, Noma Jo; Grissom, Stephen

    To investigate the effects of wording in attitude test items, a five-point Likert-type rating scale was administered to 173 undergraduate education majors. The test measured attitudes toward college and self, and contained 38 positively-worded items. Thirty-eight negatively-worded items were also written to parallel the positive statements.…

  3. 41 CFR 101-28.306-6 - Sensitive items.

    Science.gov (United States)

    2010-07-01

    ... for embarrassment of GSA and customer agencies, the level of customer complaints, and control as an accountable item of personal property. Each customer activity shall take all appropriate measures necessary to... 28.3-Customer Supply Centers § 101-28.306-6 Sensitive items. Many items stocked by the CSCs may...

  4. Item validity of the Myers-Briggs Type Indicator.

    Science.gov (United States)

    Tzeng, O C; Outcalt, D; Boyer, S L; Ware, R; Landis, D

    1984-06-01

    The present study presents a brief summary of four extensive psychometric analyses of the Myers-Briggs Type Indicator (MBTI) items. Positive empirical evidence supports the MBTI item validity. However, several measurement issues on item construction were raised to caution the future users.

  5. Mitigating Systematic Errors in Angular Correlation Function Measurements from Wide Field Surveys

    CERN Document Server

    Morrison, Christopher Brian

    2015-01-01

    We present an investigation into the effects of survey systematics such as varying depth, point spread function (PSF) size, and extinction on the galaxy selection and correlation in photometric, multi-epoch, wide area surveys. We take the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS) as an example. Variations in galaxy selection due to systematics are found to cause density fluctuations of up to 10% for some small fraction of the area for most galaxy redshift slices and as much as 50% for some extreme cases of faint high-redshift samples. This results in correlations of galaxies against survey systematics of order $\sim$1% when averaged over the survey area. We present an empirical method for mitigating these systematic correlations from measurements of angular correlation functions using weighted random points. These weighted random catalogs are estimated from the observed galaxy overdensities by mapping these to survey parameters. We are able to model and mitigate the effect of systematic correl...

  6. Conceptualization and measurement of homosexuality in sex surveys: a critical review

    Directory of Open Access Journals (Sweden)

    Michaels Stuart

    2006-01-01

    Full Text Available This article reviews major national population sex surveys that have asked questions about homosexuality focusing on conceptual and methodological issues, including the definitions of sex, the measured aspects of homosexuality, sampling and interviewing technique, and questionnaire design. Reported rates of major measures of same-sex attraction, behavior, partners, and sexual identity from surveys are also presented and compared. The study of homosexuality in surveys has been shaped by the research traditions and questions ranging from sexology to the epidemiology of HIV/AIDS. Sexual behavior has been a central topic at least since Kinsey. Issues of sexual attraction and/or orientation and sexual identity have emerged more recently. Differences in the treatment of men and women in the design and analysis of surveys as well as in the reported rates in different surveys, in different countries and time periods are also presented and discussed. We point out the importance of the consideration of both methodological and social change issues in assessing such differences.

  7. Conceptualization and measurement of homosexuality in sex surveys: a critical review

    Directory of Open Access Journals (Sweden)

    Stuart Michaels

    Full Text Available This article reviews major national population sex surveys that have asked questions about homosexuality focusing on conceptual and methodological issues, including the definitions of sex, the measured aspects of homosexuality, sampling and interviewing technique, and questionnaire design. Reported rates of major measures of same-sex attraction, behavior, partners, and sexual identity from surveys are also presented and compared. The study of homosexuality in surveys has been shaped by the research traditions and questions ranging from sexology to the epidemiology of HIV/AIDS. Sexual behavior has been a central topic at least since Kinsey. Issues of sexual attraction and/or orientation and sexual identity have emerged more recently. Differences in the treatment of men and women in the design and analysis of surveys as well as in the reported rates in different surveys, in different countries and time periods are also presented and discussed. We point out the importance of the consideration of both methodological and social change issues in assessing such differences.

  8. Conceptualization and measurement of homosexuality in sex surveys: a critical review.

    Science.gov (United States)

    Michaels, Stuart; Lhomond, Brigitte

    2006-07-01

    This article reviews major national population sex surveys that have asked questions about homosexuality focusing on conceptual and methodological issues, including the definitions of sex, the measured aspects of homosexuality, sampling and interviewing technique, and questionnaire design. Reported rates of major measures of same-sex attraction, behavior, partners, and sexual identity from surveys are also presented and compared. The study of homosexuality in surveys has been shaped by the research traditions and questions ranging from sexology to the epidemiology of HIV/AIDS. Sexual behavior has been a central topic at least since Kinsey. Issues of sexual attraction and/or orientation and sexual identity have emerged more recently. Differences in the treatment of men and women in the design and analysis of surveys as well as in the reported rates in different surveys, in different countries and time periods are also presented and discussed. We point out the importance of the consideration of both methodological and social change issues in assessing such differences.

  9. Detecting Local Item Dependence in Polytomous Adaptive Data

    Science.gov (United States)

    Mislevy, Jessica L.; Rupp, Andre A.; Harring, Jeffrey R.

    2012-01-01

    A rapidly expanding arena for item response theory (IRT) is in attitudinal and health-outcomes survey applications, often with polytomous items. In particular, there is interest in computer adaptive testing (CAT). Meeting model assumptions is necessary to realize the benefits of IRT in this setting, however. Although initial investigations of…

  10. The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population.

    Science.gov (United States)

    Holman, Rebecca; Weisscher, Nadine; Glas, Cees A W; Dijkgraaf, Marcel G W; Vermeulen, Marinus; de Haan, Rob J; Lindeboom, Robert

    2005-12-29

    Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, there are few item banks, which have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations.
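
    The screening rule and item counts in the abstract above can be checked with a short sketch. The thresholds (200 respondents, 10% and 90% in the category 'can carry out') and the counts come from the abstract; the function and variable names are hypothetical, and the authors' actual procedure may differ in detail.

```python
def passes_screen(n_presented, n_can_carry_out, min_n=200, low=0.10, high=0.90):
    """Return True if an item has enough respondents and a non-extreme response share."""
    if n_presented < min_n:
        return False  # presented to too few respondents
    share = n_can_carry_out / n_presented
    return low <= share <= high  # reject items almost everyone can, or cannot, carry out


# Item counts as reported in the abstract.
clinically_relevant = 115
excluded = {
    "too few respondents or extreme response share": 24,
    "different measurement properties by age or sex": 11,
    "item response theory model did not fit": 3,
}
retained = clinically_relevant - sum(excluded.values())
print(retained)  # 77, matching the number of items retained in the item bank
```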

  11. The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population

    Science.gov (United States)

    Holman, Rebecca; Weisscher, Nadine; Glas, Cees AW; Dijkgraaf, Marcel GW; Vermeulen, Marinus; de Haan, Rob J; Lindeboom, Robert

    2005-01-01

    Background Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, there are few item banks, which have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. Methods This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. Results Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. Conclusion The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations. PMID:16381611

  12. The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population

    Directory of Open Access Journals (Sweden)

    Vermeulen Marinus

    2005-12-01

    Full Text Available Abstract Background Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, there are few item banks, which have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. Methods This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. Results Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. Conclusion The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations.

  13. A Comparison of Anchor-Item Designs for the Concurrent Calibration of Large Banks of Likert-Type Items

    Science.gov (United States)

    Garcia-Perez, Miguel A.; Alcala-Quintana, Rocio; Garcia-Cueto, Eduardo

    2010-01-01

    Current interest in measuring quality of life is generating interest in the construction of computerized adaptive tests (CATs) with Likert-type items. Calibration of an item bank for use in CAT requires collecting responses to a large number of candidate items. However, the number is usually too large to administer to each subject in the…

  14. Prospects for clustering and lensing measurements with forthcoming intensity mapping and optical surveys

    CERN Document Server

    Pourtsidou, Alkistis; Crittenden, Robert; Metcalf, R Benton

    2015-01-01

    We explore the potential of using intensity mapping surveys (MeerKAT, SKA) and optical galaxy surveys (DES, LSST) to detect HI clustering and weak gravitational lensing of 21cm emission in auto- and cross-correlation. Our forecasts show that high precision measurements of the clustering and lensing signals can be made in the near future using the intensity mapping technique. Such studies can be used to test the intensity mapping method, and constrain parameters such as the HI density $\Omega_{\rm HI}$, the HI bias $b_{\rm HI}$ and the galaxy-HI correlation coefficient $r_{\rm HI-g}$.

  15. Prospects for clustering and lensing measurements with forthcoming intensity mapping and optical surveys

    Science.gov (United States)

    Pourtsidou, A.; Bacon, D.; Crittenden, R.; Metcalf, R. B.

    2016-06-01

    We explore the potential of using intensity mapping surveys (MeerKAT, SKA) and optical galaxy surveys (DES, LSST) to detect H I clustering and weak gravitational lensing of 21 cm emission in auto- and cross-correlation. Our forecasts show that high-precision measurements of the clustering and lensing signals can be made in the near future using the intensity mapping technique. Such studies can be used to test the intensity mapping method, and constrain parameters such as the H I density Ω _{H I}, the H I bias b_{H I} and the galaxy-H I correlation coefficient r_{H I-g}.

  16. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  17. Prediction of objectively measured physical activity and sedentariness among blue-collar workers using survey questionnaires

    DEFF Research Database (Denmark)

    Gupta, Nidhi; Heiden, Marina; Mathiassen, Svend Erik;

    2016-01-01

    OBJECTIVES: We aimed at developing and evaluating statistical models predicting objectively measured occupational time spent sedentary or in physical activity from self-reported information available in large epidemiological studies and surveys. METHODS: Two-hundred-and-fourteen blue-collar workers...... responded to a questionnaire containing information about personal and work related variables, available in most large epidemiological studies and surveys. Workers also wore accelerometers for 1-4 days measuring time spent sedentary and in physical activity, defined as non-sedentary time. Least......-squares linear regression models were developed, predicting objectively measured exposures from selected predictors in the questionnaire. RESULTS: A full prediction model based on age, gender, body mass index, job group, self-reported occupational physical activity (OPA), and self-reported occupational sedentary...

  18. Will Kinematic Sunyaev-Zel'dovich Measurements Enhance the Science Return from Galaxy Redshift Surveys?

    CERN Document Server

    Sugiyama, Naonori S; Spergel, David N

    2016-01-01

    Yes. Future CMB experiments such as Advanced ACTPol and CMB-S4 should achieve measurements with S/N of $> 0.1$ for the typical galaxies in redshift surveys. These measurements will provide complementary measurements of the growth rate of large scale structure $f$ and the expansion rate of the Universe $H$ to galaxy clustering measurements. This paper emphasizes that there is significant information in the anisotropy of the relative pairwise kSZ measurements. We expand the relative pairwise kSZ power spectrum in Legendre polynomials and consider up to its octopole. Assuming that the noise in the filtered maps is uncorrelated between the positions of galaxies in the survey, we derive a simple analytic form for the power spectrum covariance of the relative pairwise kSZ temperature in redshift space. While many previous studies have assumed optimistically that the optical depth of the galaxies $\tau_{\rm T}$ in the survey is known, we marginalize over $\tau_{\rm T}$, to compute constraints on the growth rate $f$ ...

  19. Reliability and validity of the medical outcomes study, a 36-item short-form health survey, (MOS SF-36) after one-year hospital discharge of hip fracture patient in a public hospital

    Directory of Open Access Journals (Sweden)

    Anan Udombhornprabha

    2012-07-01

    Full Text Available Introduction: Data on health-related quality of life for hip fracture patients in Thailand are scarce because of (i) a lack of epidemiological data on hip fracture (e.g., the relative incidence of osteoporosis, falls fractures and repeat fractures in particular subgroups), (ii) a lack of data on health status and quality of life relating both to the illness itself and to the availability of different treatment options, especially for elderly people, and (iii) substantial variation in outcomes of care and services for hip fracture patients. Objective: Hip fracture is a major healthcare burden in Thailand. This study explores quality of life for hip fracture patients from the perspective of (i) the reliability of patient-reported outcomes and (ii) clinical and demographic characteristics related to patient-reported outcomes. Method: Before hospital discharge, 201 hip fracture patients were screened; they were then followed up over one year. A self-rated Medical Outcomes Study 36-item Short-Form Health Survey (Thai) was dispatched by mail for follow-up; other clinical and demographic characteristics were collected through direct interviews with patients or caregivers during recruitment, with simultaneous cross-checking against medical records. A descriptive cross-sectional analysis was performed. Result: Mail responders represented 59.2% (N=119), with 36.1% (N=43) patient-rated and 63.8% (N=76) caregiver-rated outcomes. Mean (SD) [95% CI] scores for physical, mental and global health were 36.2 (10.6) [32.9-39.4], 54.5 (10.0) [51.4-57.5] and 43.9 (9.3) [41.0-46.7] for patient-rated outcomes and 34.6 (12.3) [31.7-37.4], 52.5 (12.3) [49.6-55.3] and 42.7 (11.1) [40.1-45.2] for caregiver-rated outcomes; the differences were not statistically significant (p = 0.630, 0.330 and 0.788, respectively). Cronbach’s alpha reliability coefficients of the MOS SF-36 for patient-rated versus caregiver-rated outcomes were 0.90 vs 0.91, 0.78 vs 0.84 and 0.90 vs 0.92. The presence of comorbidity significantly explains differences for

  20. Comparing two survey methods of measuring health-related indicators: Lot Quality Assurance Sampling and Demographic Health Surveys.

    Science.gov (United States)

    Anoke, Sarah C; Mwai, Paul; Jeffery, Caroline; Valadez, Joseph J; Pagano, Marcello

    2015-12-01

    Two common methods used to measure indicators for health programme monitoring and evaluation are the demographic and health surveys (DHS) and lot quality assurance sampling (LQAS); each one has different strengths. We report on both methods when utilised in comparable situations. We compared 24 indicators in south-west Uganda, where data for prevalence estimations were collected independently for the two methods in 2011 (LQAS: n = 8876; DHS: n = 1200). Data were stratified (e.g. gender and age) resulting in 37 comparisons. We used a two-sample two-sided Z-test of proportions to compare both methods. The average difference between LQAS and DHS for 37 estimates was 0.062 (SD = 0.093; median = 0.039). The average difference among the 21 failures to reject equality of proportions was 0.010 (SD = 0.041; median = 0.009); among the 16 rejections, it was 0.130 (SD = 0.010, median = 0.118). Seven of the 16 rejections exhibited absolute differences of 0.10 and 0.20 (mean = 0.261, SD = 0.083). There is 75.7% agreement across the two surveys. Both methods yield regional results, but only LQAS provides information at less granular levels (e.g. the district level) where managerial action is taken. The cost advantage and localisation make LQAS feasible to conduct more frequently, and provides the possibility for real-time health outcomes monitoring. © 2015 The Authors. Tropical Medicine & International Health Published by John Wiley & Sons Ltd.
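
    The comparison described above rests on a two-sample, two-sided Z-test of proportions. A minimal sketch of that test follows, assuming simple random samples and a pooled standard error; it ignores any survey design effects the authors may have accounted for. The sample sizes are those reported in the abstract, while the indicator counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions using a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Sample sizes from the abstract (LQAS n = 8876, DHS n = 1200); counts are hypothetical.
z, p = two_proportion_z_test(x1=5500, n1=8876, x2=700, n2=1200)
print(f"z = {z:.2f}, p = {p:.4f}")
```

    With samples this large, small absolute differences can still reject equality, which is why the abstract reports the size of each difference alongside the test outcome.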

  1. Differential item functioning due to gender between depression and anxiety items among Chilean adolescents.

    Science.gov (United States)

    Bares, Cristina; Andrade, Fernando; Delva, Jorge; Grogan-Kaylor, Andrew; Kamata, Akihito

    2012-07-01

    Although much is known about the higher prevalence of anxiety and depressive disorders among adolescent females, less is known about the differential item endorsement due to gender in items of scales commonly used to measure anxiety and depression. We conducted a study to examine if adolescent males and females from Chile differed on how they endorsed the items of the Youth Self Report (YSR) anxious/depressed problem scale. We used data from a cross-sectional sample consisting of 925 participants (mean age = 14, SD 1.3, 49% females) of low to lower-middle socioeconomic status. A two-parameter logistic (2PL) IRT DIF model was fit. Results revealed differential item functioning (DIF) by gender for six of the 13 items, with adolescent females being more likely to endorse a depression item while males were found more likely to endorse anxiety items. Findings suggest that items found in commonly used measures of anxiety and depression symptoms may not equally capture the true levels of these behavioural problems in adolescent males and females. Given the high levels of mental disorders in Chile and the surrounding countries, further attention should be focused on increasing the number of empirical studies examining potential gender differences in the assessment of mental health problems among Latin American populations to better aid our understanding of the phenomenology and determinants of these problems in the region.
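
    To illustrate what DIF means under a two-parameter logistic model, the sketch below evaluates the standard 2PL item response function for two groups that share a discrimination parameter but differ in item difficulty (uniform DIF). All parameter values are hypothetical and are not estimates from the Chilean sample; the study's model specification may differ.

```python
import math


def prob_endorse(theta, a, b):
    """2PL item response function: P(endorse | theta) with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical parameters for one depression item; only the difficulty differs by group.
a = 1.2
b_female, b_male = -0.3, 0.4

theta = 0.0  # same latent trait level for both respondents
p_female = prob_endorse(theta, a, b_female)
p_male = prob_endorse(theta, a, b_male)
print(f"female: {p_female:.3f}, male: {p_male:.3f}")
```

    Two respondents with the same latent level but different endorsement probabilities is the pattern a DIF analysis is designed to detect.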

  2. Measuring galaxy [OII] emission line doublet with future ground-based wide-field spectroscopic surveys

    CERN Document Server

    Comparat, Johan; Bacon, Roland; Mostek, Nick J; Newman, Jeffrey A; Schlegel, David J; Yèche, Christophe

    2013-01-01

    The next generation of wide-field spectroscopic redshift surveys will map the large-scale galaxy distribution in the redshift range 0.7< z<2 to measure baryonic acoustic oscillations (BAO). The primary optical signature used in this redshift range comes from the [OII] emission line doublet, which provides a unique redshift identification that can minimize confusion with other single emission lines. To derive the required spectrograph resolution for these redshift surveys, we simulate observations of the [OII] (3727,3729) doublet for various instrument resolutions, and line velocities. We foresee two strategies about the choice of the resolution for future spectrographs for BAO surveys. For bright [OII] emitter surveys ([OII] flux ~30.10^{-17} erg /cm2/s like SDSS-IV/eBOSS), a resolution of R~3300 allows the separation of 90 percent of the doublets. The impact of the sky lines on the completeness in redshift is less than 6 percent. For faint [OII] emitter surveys ([OII] flux ~10.10^{-17} erg /cm2/s like ...

  3. Measuring galaxy environment with the synergy of future photometric and spectroscopic surveys

    CERN Document Server

    Cucciati, O; Cimatti, A; Merson, A I; Norberg, P; Pozzetti, L; Baugh, C M; Branchini, E

    2016-01-01

    [Abridged] We exploit the synergy between low-resolution spectroscopy and photometric redshifts to study environmental effects on galaxy evolution in slitless spectroscopic surveys from space. As a test case, we consider the future Euclid Deep survey (~40deg$^2$), which combines a slitless spectroscopic survey limited at H$\alpha$ flux $\leq5\times 10^{-17}$ erg cm$^{-2}$ s$^{-1}$ and a photometric survey limited in H-band ($H\leq26$). To test the power of the method, we use Euclid-like galaxy mock catalogues, in which we anchor the photometric redshifts to the 3D galaxy distribution of the available spectroscopic redshifts. We then estimate the local density contrast by counting objects in cylindrical cells with radius ranging from 1 to 10 h$^{-1}$Mpc over the redshift range 0.9survey (H=26) but without redshift measurement errors. We find that our method is successful in separating hi...

  4. Measure and category a survey of the analogies between topological and measure spaces

    CERN Document Server

    Oxtoby, John C

    1980-01-01

    In this edition, a set of Supplementary Notes and Remarks has been added at the end, grouped according to chapter. Some of these call attention to subsequent developments, others add further explanation or additional remarks. Most of the remarks are accompanied by a briefly indicated proof, which is sometimes different from the one given in the reference cited. The list of references has been expanded to include many recent contributions, but it is still not intended to be exhaustive. John C. Oxtoby Bryn Mawr, April 1980 Preface to the First Edition This book has two main themes: the Baire category theorem as a method for proving existence, and the "duality" between measure and category. The category method is illustrated by a variety of typical applications, and the analogy between measure and category is explored in all of its ramifications. To this end, the elements of metric topology are reviewed and the principal properties of Lebesgue measure are derived. It turns out that Lebesgue integration is not es...

  5. A method of measuring the [α/Fe] ratios from the spectra of the LAMOST survey

    Science.gov (United States)

    Li, Ji; Han, Chen; Xiang, Mao-Sheng; Shi, Jian-Rong; Zhao, Jing-Kun; Liu, Xiao-Wei; Zhang, Hua-Wei; Yuan, Hai-Bo; Ci, Xuan; Zhang, Xiao-Feng; Wang, Yue-Xiang; Huang, Yang; Zhang, Yong; Hou, Yong-Hui; Wang, Yue-Fei; Cao, Zi-Huang

    2016-07-01

    The [α/Fe] ratios in stars are good tracers to probe the formation history of stellar populations and the chemical evolution of the Galaxy. The spectroscopic survey of LAMOST provides a good opportunity to determine [α/Fe] of millions of stars in the Galaxy. We present a method of measuring the [α/Fe] ratios from LAMOST spectra using the template-matching technique of the LSP3 pipeline. We use three test samples of stars selected from the ELODIE and MILES libraries, as well as the LEGUE survey to validate our method. Based on the test results, we conclude that our method is valid for measuring [α/Fe] from low-resolution spectra acquired by the LAMOST survey. Within the range of the stellar parameters T eff = [5000, 7500] K, log g = [1.0, 5.0] dex and [Fe/H]= [-1.5, +0.5] dex, our [α/Fe] measurements are consistent with values derived from high-resolution spectra, and the accuracy of our [α/Fe] measurements from LAMOST spectra is better than 0.1 dex with spectral signal-to-noise higher than 20.

  6. Measuring high-sensitivity cardiac troponin T blood concentration in population surveys

    Science.gov (United States)

    Mindell, Jennifer S.

    2017-01-01

    Introduction The blood test for high-sensitivity cardiac troponin T (HS-CTnT) has been proposed as a marker of cardiovascular risk in the general population, as it is associated with subsequent incidence of cardiovascular events and mortality. We aimed at evaluating the feasibility of HS-CTnT testing within large nationally-representative population surveys in which blood samples are collected during household visits, shipped using the standard civil postal service, and then frozen for subsequent analyses. Methods The Health Survey for England (HSE) consists of a series of annual surveys beginning in 1991. It is designed to provide regular information on various aspects of the nation’s health and risk factors. We measured HS-CTnT in the blood of 200 people from the HSE 2016 wave, then froze and stored their blood samples at -40°C for 5–10 weeks, and then thawed and retested them to appreciate the extent of within-person agreement or test-retest reliability of the two measurements. Results The Cronbach's Alpha (Scale Reliability Coefficient) and the Interclass Correlation Coefficient (two-way mixed-effects model for consistency of agreement at individual level) were 0.97 (95%CI = 0.96–0.99) and 0.95 (95%CI = 0.94–0.96) respectively. The time delay from blood withdrawal to analysis and storage (1–4 days) did not affect the results, nor did the freezing time before the retest (5–10 weeks). Conclusion The measurement of HS-CTnT plasma concentration within large nationally-representative surveys such as the Health Survey for England is feasible. PMID:28141863

  7. Coexistence of DTT and Mobile Broadband: A Survey and Guidelines for Field Measurements

    Directory of Open Access Journals (Sweden)

    Juha Kalliovaara

    2017-01-01

    This article provides a survey and a general methodology for coexistence studies between digital terrestrial television (DTT) and mobile broadband (MBB) systems in the ultra high frequency (UHF) broadcasting band. The methodology includes characterization of relevant field measurement scenarios and gives a step-by-step guideline on how to obtain reliable field measurement results to be used in conjunction with link budget analyses, laboratory measurements, and simulations. A survey of potential European coexistence scenarios and regulatory status is given to determine feasible future use scenarios for the UHF television (TV) broadcasting band. The DTT reception system behavior and performance are also described as they greatly affect the amount of spectrum potentially available for MBB use and determine the relevant coexistence field measurement scenarios. Simulation methods used in determining broadcast protection criteria and in coexistence studies are briefly described to demonstrate how the information obtained from field measurements can be used to improve their accuracy. The presented field measurement guidelines can be applied to any DTT-MBB coexistence scenarios and to a wide range of spectrum sharing and cognitive radio system coexistence measurements.

  8. Pitfalls with weight for height measurements in surveys of acute malnutrition.

    Science.gov (United States)

    Soeters, R

    1986-10-01

    A combined survey of weight for height measurement was done by 2 emergency aid organizations. The same 131 children were measured independently by 2 survey teams. 1 team measured 24.4% and the other team 47.7% of the 131 children as under 80% weight/height. The serious logistical consequences of these differences for the emergency aid program are discussed. The main cause of the difference was observer error. A reason for systematic bias could be that an observer using the Nabarro chart tends to classify children who are slightly above 80% as under 80%. This bias is understandable because one does not want to miss out malnourished children. This systematic bias does not occur when weight and height are measured separately and the % read from a card. The importance of following a strict protocol when measuring is stressed. Another less important cause for the different results was that the teams used 2 weight for height standards. The use of 2 standards in the same Oxfam kit is unfortunate. It would be better to choose 1 uniform standard, to avoid misinterpretations. The Nabarro chart is the quickest method for weight/height measurement, but is prone to observer error. The method in which weight and height are measured separately and the % read from a chart is more reliable. A plea is made for the use of uniform standards.

  9. Faculty development on item writing substantially improves item quality.

    NARCIS (Netherlands)

    Naeem, N.; Vleuten, C.P.M. van der; Alfaris, E.A.

    2012-01-01

    The quality of items written for in-house examinations in medical schools remains a cause of concern. Several faculty development programs are aimed at improving faculty's item writing skills. The purpose of this study was to evaluate the effectiveness of a faculty development program in item develo

  10. IRT Item Parameter Scaling for Developing New Item Pools

    Science.gov (United States)

    Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua

    2017-01-01

    Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. The three scaling procedures are considered: (a) concurrent…

  11. Development and Pilot Testing of the Challenge Module: A Proposed Adjunct to the Gross Motor Function Measure for High-Functioning Children with Cerebral Palsy

    Science.gov (United States)

    Wilson, Ashlea; Kavanaugh, Abi; Moher, Rosemarie; McInroy, Megan; Gupta, Neena; Salbach, Nancy M.; Wright, F. Virginia

    2011-01-01

    The aim was to develop a Challenge Module (CM) as a proposed adjunct to the Gross Motor Function Measure for children with cerebral palsy who have high-level motor function. Items were generated in a physiotherapist (PT) focus group. Item reduction was based on PTs' ratings of item importance and safety via online surveys. The proposed CM items…

  12. Normative Data for the 12 Item WHO Disability Assessment Schedule 2.0

    OpenAIRE

    Gavin Andrews; Alice Kemp; Matthew Sunderland; Michael Von Korff; Tevik Bedirhan Ustun

    2009-01-01

    BACKGROUND: The World Health Organization Disability Assessment Schedule (WHODAS 2.0) measures disability due to health conditions including diseases, illnesses, injuries, mental or emotional problems, and problems with alcohol or drugs. METHOD: The 12 Item WHODAS 2.0 was used in the second Australian Survey of Mental Health and Well-being. We report the overall factor structure and the distribution of scores and normative data (means and SDs) for people with any physical disorder, any mental...

  13. Item response theory for measurement validity

    Institute of Scientific and Technical Information of China (English)

    Yang FM; Kao ST

    2014-01-01

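    Item response theory (IRT) is an important method for assessing the validity of measurement scales, but it remains underused in psychiatry. IRT describes the relationship between a latent psychological trait (for example, the construct that the scale is intended to assess), the properties of the items in the scale, and respondents' answers to the individual items. This paper introduces the basic premise, assumptions, and methods of IRT. To help explain these concepts, we construct a hypothetical scale from three items with dichotomous yes/no responses taken from a revised version of the Center for Epidemiologic Studies Depression Scale, which was administered to 19,399 respondents. We first use factor analysis to confirm the unidimensionality of the three items, and then use the Mplus software to fit a 2-parameter logistic (2-PL) IRT model, which estimates the discrimination and the difficulty of each item. The clinical implications of these results and their use in scale construction are discussed.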
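
    The 2-PL model referred to in this abstract can be sketched as follows. This is a generic textbook form, not the authors' Mplus specification; the function names and the parameter values (a for discrimination, b for difficulty) are hypothetical.

```python
import math

def p_endorse_2pl(theta, a, b):
    """Probability of a 'yes' response under the 2-PL model.

    theta: latent trait level; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2-PL item at trait level theta."""
    p = p_endorse_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item: moderately discriminating, slightly difficult.
a_item, b_item = 1.2, 0.5
curve = [p_endorse_2pl(t, a_item, b_item) for t in (-2.0, 0.5, 3.0)]
```

    At theta equal to the difficulty b the endorsement probability is 0.5, and the item information peaks there at a squared divided by 4.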

  14. Measuring the velocity field from type Ia supernovae in an LSST-like sky survey

    CERN Document Server

    Odderskov, Io

    2016-01-01

    With the upcoming sky survey with the Large Synoptic Survey Telescope a great sample of type Ia supernovae will be observed, allowing for a precise mapping of the velocity structure of the universe. Since the source of peculiar velocities is variations in the density field, cosmological parameters related to the matter distribution can subsequently be extracted from the velocity power spectrum. One way to quantify this is through the angular power spectrum of radial peculiar velocities on spheres at different redshifts. We investigate how well this observable can be measured, despite the problems caused by areas with no information. To obtain a realistic distribution of supernovae, we create mock supernova catalogs by using a semi-analytical code for galaxy formation on the merger trees extracted from N-body simulations. We measure the cosmic variance in the velocity power spectrum by repeating the procedure many times for differently located observers, and vary different aspects of the analysis, such as the ...

  15. Differential Item Functioning on the International Personality Item Pool's Neuroticism Scale

    OpenAIRE

    McBride, Nadine LeBarron

    2008-01-01

    As use of the public-domain International Personality Item Pool (IPIP) scales has grown significantly over the past decade (Goldberg, Johnson, Eber, Hogan, Ashton, Cloninger, & Gough, 2006), research on the psychometric properties of the items and scales has become increasingly important. This research study examines the IPIP scale constructed to measure the Five Factor Model (FFM) domain of Neuroticism (as measured by the NEO-PI-R) for occurrences of differential functioning a...

  16. Snow measurement system for airborne snow surveys (GPR system from helicopter) in high mountain areas.

    Science.gov (United States)

    Sorteberg, Hilleborg K.

    2010-05-01

    In the hydropower industry, it is important to have precise information about snow deposits at all times, to allow for effective planning and optimal use of the water. In Norway, it is common to measure snow density using a manual method, i.e. the depth and weight of the snow are measured. In recent years, radar measurements have been taken from snowmobiles; however, few energy supply companies use this method operationally, and it has mostly been used in connection with research projects. Agder Energi is the first Norwegian power producer to use radar technology from a helicopter to monitor mountain snow levels. Measurement accuracy is crucial when obtaining input data for snow reservoir estimates. Radar screening by helicopter makes remote areas more easily accessible and provides larger quantities of data than traditional ground-level measurement methods. In drawing up a snow survey system, it is assumed as a basis that the snow distribution is influenced by vegetation, climate and topography. To take these factors into consideration, a snow survey system for fields in high mountain areas has been designed in which the data collection is carried out by following the lines of a grid system. The lines of this grid system are placed so as to effectively capture the distribution of elevation, x-coordinates, y-coordinates, aspect, slope and curvature in the field. Variations in climatic conditions are also captured better when using a grid, and dominant weather patterns will largely be captured in this measurement system.

  17. Measuring galaxy environment with the synergy of future photometric and spectroscopic surveys

    Science.gov (United States)

    Cucciati, O.; Marulli, F.; Cimatti, A.; Merson, A. I.; Norberg, P.; Pozzetti, L.; Baugh, C. M.; Branchini, E.

    2016-10-01

    We exploit the synergy between low-resolution spectroscopy and photometric redshifts to study environmental effects on galaxy evolution in slitless spectroscopic surveys from space. As a test case, we consider the future Euclid Deep survey (∼40 deg^2), which combines a slitless spectroscopic survey limited at Hα flux ≥ 5 × 10^-17 erg cm^-2 s^-1 and a photometric survey limited in H band (H ≤ 26). We use Euclid-like galaxy mock catalogues, in which we anchor the photometric redshifts to the 3D galaxy distribution of the available spectroscopic redshifts. We then estimate the local density contrast by counting objects in cylindrical cells with radius from 1 to 10 h^-1 Mpc, over the redshift range 0.9 < z < 1.8. We compare this density field with the one computed in a mock catalogue with the same depth as the Euclid Deep survey (H = 26) but without redshift measurement errors. We find that our method successfully separates high- from low-density environments (the last from the first quintile of the density distribution), with higher efficiency at low redshift and large cells: the fraction of low-density regions mistaken for high-density peaks is <1 per cent for all scales and redshifts explored, except for scales of 1 h^-1 Mpc, for which it is a few per cent. These results show that we can efficiently study environment in photometric samples if spectroscopic information is available for a smaller sample of objects that sparsely samples the same volume. We demonstrate that these studies are possible in the Euclid Deep survey, i.e. in a redshift range in which environmental effects are different from those observed in the local Universe, hence providing new constraints for galaxy evolution models.
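
    The counting of objects in cylindrical cells described in this abstract can be illustrated with a simplified sketch. It assumes Cartesian comoving coordinates with the third axis as the line of sight, a cylinder half-length chosen by the user, and a mean count taken over the sample itself; none of these choices, nor the function names, come from the paper.

```python
def cylinder_counts(points, radius, half_length):
    """Count neighbours of each point inside a cylinder centred on it.

    points: list of (x, y, z) tuples; z is the assumed line-of-sight axis.
    radius: projected radius of the cylinder; half_length: extent along z.
    """
    counts = []
    for i, (xi, yi, zi) in enumerate(points):
        n = 0
        for j, (xj, yj, zj) in enumerate(points):
            if i == j:
                continue
            in_disc = (xi - xj) ** 2 + (yi - yj) ** 2 <= radius ** 2
            if in_disc and abs(zi - zj) <= half_length:
                n += 1
        counts.append(n)
    return counts

def density_contrast(counts):
    """Density contrast delta = n / mean(n) - 1 for each cell."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("mean count is zero; contrast is undefined")
    return [c / mean - 1.0 for c in counts]

# Hypothetical positions: a small group plus one isolated object.
galaxies = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (50.0, 50.0, 50.0)]
counts = cylinder_counts(galaxies, radius=1.0, half_length=5.0)
delta = density_contrast(counts)
```

    The double loop is quadratic in the number of objects; a realistic catalogue would need a spatial index, which is omitted here for brevity.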

  18. Identifying Unbiased Items for Screening Preschoolers for Disruptive Behavior Problems.

    Science.gov (United States)

    Studts, Christina R; Polaha, Jodi; van Zyl, Michiel A

    2016-10-25

    OBJECTIVE : Efficient identification and referral to behavioral services are crucial in addressing early-onset disruptive behavior problems. Existing screening instruments for preschoolers are not ideal for pediatric primary care settings serving diverse populations. Eighteen candidate items for a new brief screening instrument were examined to identify those exhibiting measurement bias (i.e., differential item functioning, DIF) by child characteristics. METHOD : Parents/guardians of preschool-aged children (N = 900) from four primary care settings completed two full-length behavioral rating scales. Items measuring disruptive behavior problems were tested for DIF by child race, sex, and socioeconomic status using two approaches: item response theory-based likelihood ratio tests and ordinal logistic regression. RESULTS : Of 18 items, eight were identified with statistically significant DIF by at least one method. CONCLUSIONS : The bias observed in 8 of 18 items made them undesirable for screening diverse populations of children. These items were excluded from the new brief screening tool.

  19. Can I Just Check...? Effects of Edit Check Questions on Measurement Error and Survey Estimates

    Directory of Open Access Journals (Sweden)

    Lugtig Peter

    2014-03-01

    Household income is difficult to measure, since it requires the collection of information about all potential income sources for each member of a household. We assess the effects of two types of edit check questions on measurement error and survey estimates: within-wave edit checks use responses to questions earlier in the same interview to query apparent inconsistencies in responses; dependent interviewing uses responses from prior interviews to query apparent inconsistencies over time. We use data from three waves of the British Household Panel Survey (BHPS) to assess the effects of edit checks on estimates, and data from an experimental study carried out in the context of the BHPS, where survey responses were linked to individual administrative records, to assess the effects on measurement error. The findings suggest that interviewing methods without edit checks underestimate non-labour household income in the lower tail of the income distribution. The effects on estimates derived from total household income, such as poverty rates or transition rates into and out of poverty, are small.

  20. Probing primordial non-Gaussianity via iSW measurements with SKA continuum surveys

    Energy Technology Data Exchange (ETDEWEB)

    Raccanelli, Alvise; Doré, Olivier [Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109 (United States)]; Bacon, David J.; Maartens, Roy [Institute of Cosmology and Gravitation, University of Portsmouth, Portsmouth P01 3FX (United Kingdom)]; and others

    2015-01-01

    The Planck CMB experiment has delivered the best constraints so far on primordial non-Gaussianity, ruling out early-Universe models of inflation that generate large non-Gaussianity. Although small improvements in the CMB constraints are expected, the next frontier of precision will come from future large-scale surveys of the galaxy distribution. The advantage of such surveys is that they can measure many more modes than the CMB; in particular, forthcoming radio surveys with the Square Kilometre Array will cover huge volumes. Radio continuum surveys deliver the largest volumes, but with the disadvantage of no redshift information. In order to mitigate this, we use two additional observables. First, the integrated Sachs-Wolfe effect (the cross-correlation of the radio number counts with the CMB temperature anisotropies) helps to reduce systematics on the large scales that are sensitive to non-Gaussianity. Second, optical data allows for cross-identification in order to gain some redshift information. We show that, while the single redshift bin case can provide a σ(f_NL) ∼ 20, and is therefore not competitive with current and future constraints on non-Gaussianity, a tomographic analysis could improve the constraints by an order of magnitude, even with only two redshift bins. A huge improvement is provided by the addition of high-redshift sources, so having cross-ID for high-z galaxies and an even higher-z radio tail is key to enabling very precise measurements of f_NL. We use Fisher matrix forecasts to predict the constraining power in the case of no redshift information and the case where cross-ID allows a tomographic analysis, and we show that the constraints do not improve much with 3 or more bins. Our results show that SKA continuum surveys could provide constraints competitive with CMB and forthcoming optical surveys, potentially allowing a measurement of σ(f_NL) ∼ 1 to be made. Moreover, these measurements would act as a useful check

  1. Dutch–Flemish translation of nine pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS)®

    NARCIS (Netherlands)

    L. Haverman (Lotte); M.A. Grootenhuis (Martha); H. Raat (Hein); M.A.J. van Rossum (Marion); E. van Dulmen-den Broeder (E.); K. Hoppenbrouwers (Karel); H. Correia (Helena); M. Cella (Massimo); L.D. Roorda (Lieuwe); C.B. Terwee (Caroline)

    2016-01-01

    Purpose: The Patient-Reported Outcomes Measurement Information System (PROMIS®) is a new, state-of-the-art assessment system for measuring patient-reported health and well-being of adults and children. It has the potential to be more valid, reliable, and responsive than existing PROMs. T

  2. Item Overexposure in Computerized Classification Tests Using Sequential Item Selection

    Directory of Open Access Journals (Sweden)

    Alan Huebner

    2012-06-01

    Computerized classification tests (CCTs) often use sequential item selection, which administers items by maximizing psychometric information at a cut point demarcating passing and failing scores. This paper illustrates why this method of item selection leads to the overexposure of a significant number of items, and compares the performance of three different methods for controlling maximum item exposure rates in CCTs. Specifically, the Sympson-Hetter, restricted, and item eligibility methods are examined in two studies realistically simulating different types of CCTs and are evaluated based upon criteria including classification accuracy, the number of items exceeding the desired maximum exposure rate, and test overlap. The pros and cons of each method are discussed from a practical perspective.

  3. The Development of Practical Item Analysis Program for Indonesian Teachers

    Directory of Open Access Journals (Sweden)

    Ali Muhson

    2017-04-01

    Item analysis plays an essential role in learning assessment. The item analysis program is designed to measure student achievement and instructional effectiveness. This study aimed to develop an item analysis program and verify its feasibility. The study uses a Research and Development (R & D) model. The procedure includes designing and developing a product, then validating and testing it. The data were collected through documentation, questionnaires, and interviews. The study successfully developed an item analysis program named AnBuso, which is based on classical test theory (CTT). It was practical and applicable for Indonesian teachers to analyse test items
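
    For readers unfamiliar with classical test theory, a minimal sketch of two common CTT item statistics follows: item difficulty (proportion correct) and an upper-lower discrimination index. This is not the AnBuso implementation; the function names, the 27% grouping fraction, and the response matrix are illustrative assumptions.

```python
def item_difficulty(responses):
    """Proportion of examinees answering each item correctly (1 = correct)."""
    n = len(responses)
    return [sum(row[i] for row in responses) / n for i in range(len(responses[0]))]

def discrimination_index(responses, fraction=0.27):
    """Upper-lower discrimination index D = p_upper - p_lower per item.

    Examinees are ranked by total score; 'fraction' of them form each group.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    size = max(1, round(fraction * len(ranked)))
    upper, lower = ranked[:size], ranked[-size:]
    n_items = len(responses[0])
    return [
        sum(r[i] for r in upper) / size - sum(r[i] for r in lower) / size
        for i in range(n_items)
    ]

# Hypothetical scored answers: rows are examinees, columns are items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
difficulty = item_difficulty(scores)
discrimination = discrimination_index(scores)
```

    In this toy matrix the second item separates the high and low scorers most sharply, which is the pattern a teacher would usually want to see in a well-functioning item.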

  4. Probing primordial non-Gaussianity via iSW measurements with SKA continuum surveys

    CERN Document Server

    Raccanelli, Alvise; Bacon, David J; Maartens, Roy; Santos, Mario G; Camera, Stefano; Davis, Tamara; Drinkwater, Michael J; Jarvis, Matt; Norris, Ray; Parkinson, David

    2014-01-01

    The Planck CMB experiment has delivered the best constraints so far on primordial non-Gaussianity, ruling out early-Universe models of inflation that generate large non-Gaussianity. Although small improvements in the CMB constraints are expected, the next frontier of precision will come from future large-scale surveys of the galaxy distribution. The advantage of such surveys is that they can measure many more modes than the CMB -- in particular, forthcoming radio surveys with the SKA will cover huge volumes. Radio continuum surveys deliver the largest volumes, but with the disadvantage of no redshift information. In order to mitigate this, we use two additional observables. First, the integrated Sachs-Wolfe effect -- the cross-correlation of the radio number counts with the CMB temperature anisotropies -- helps to reduce systematics on the large scales that are sensitive to non-Gaussianity. Second, optical data allows for cross-identification in order to gain some redshift information. We show that, while the...

  5. Measuring patient safety culture in Taiwan using the Hospital Survey on Patient Safety Culture (HSOPSC)

    Directory of Open Access Journals (Sweden)

    Chen I-Chi

    2010-06-01

    Background: Patient safety is a critical component of the quality of health care. As health care organizations endeavour to improve their quality of care, there is a growing recognition of the importance of establishing a culture of patient safety. In this research, the authors use the Hospital Survey on Patient Safety Culture (HSOPSC) questionnaire to assess the culture of patient safety in Taiwan and attempt to provide an explanation for some of the phenomena that are unique to Taiwan. Methods: The authors used the HSOPSC to measure the 12 dimensions of patient safety culture in 42 hospitals in Taiwan. The survey received 788 responses from physicians, nurses, and non-clinical staff. This study used SPSS 15.0 for Windows and Amos 7 software tools to perform the statistical analysis on the survey data, including descriptive statistics and confirmatory factor analysis of the structural equation model. Results: The overall average positive response rate for the 12 patient safety culture dimensions of the HSOPSC survey was 64%, slightly higher than the average positive response rate for the AHRQ data (61%). The results showed that hospital staff in Taiwan feel positively toward patient safety culture in their organization. The dimension that received the highest positive response rate was "Teamwork within units", similar to the results reported in the US. The dimension with the lowest percentage of positive responses was "Staffing". Statistical analysis showed discrepancies between Taiwan and the US in three dimensions, including "Feedback and communication about error", "Communication openness", and "Frequency of event reporting". Conclusions: The HSOPSC measurement provides evidence for assessing patient safety culture in Taiwan. The results show that, in general, hospital staff in Taiwan feel positively toward patient safety culture within their organization.
The existence of discrepancies between the US data and the Taiwanese data

  6. Laser Scanning in Engineering Surveying: Methods of Measurement and Modeling of Structures

    Directory of Open Access Journals (Sweden)

    Lenda Grzegorz

    2016-06-01

    The study is devoted to the uses of laser scanning in the field of engineering surveying. It is currently one of the main research trends developed at the Department of Engineering Surveying and Civil Engineering at the Faculty of Mining Surveying and Environmental Engineering of AGH University of Science and Technology in Krakow. The research mainly relates to issues associated with tower and shell structures, infrastructure of rail routes, or development of digital elevation models for a wide range of applications. These issues often require the use of a variety of scanning techniques (stationary, mobile), but the differences also concern the planning of measurement stations and methods of merging point clouds. Significant differences appear during the analysis of point clouds, especially when modeling objects. Analysis of the selected parameters is already possible based on ad hoc measurements carried out on a point cloud. However, only the construction of three-dimensional models provides complete information about the shape of structures, allows the analysis to be performed at any location, and reduces the amount of stored data. Some structures can be modeled in the form of simple axes, sections, or solids; for others it becomes necessary to create sophisticated models of surfaces, depicting local deformations. The examples selected for the study make it possible to assess the scope of measurement and office work for a variety of uses related to the issue set forth in the title of this study. Additionally, the latest, forward-looking technology was presented: laser scanning performed from Unmanned Aerial Vehicles (drones). Currently, it is basically in the prototype phase, but it might be expected to make significant progress in numerous applications in the field of engineering surveying.

  7. Laser Scanning in Engineering Surveying: Methods of Measurement and Modeling of Structures

    Science.gov (United States)

    Lenda, Grzegorz; Uznański, Andrzej; Strach, Michał; Lewińska, Paulina

    2016-06-01

    The study is devoted to the uses of laser scanning in the field of engineering surveying. It is currently one of the main research trends developed at the Department of Engineering Surveying and Civil Engineering at the Faculty of Mining Surveying and Environmental Engineering of AGH University of Science and Technology in Krakow. The research mainly relates to issues associated with tower and shell structures, infrastructure of rail routes, or development of digital elevation models for a wide range of applications. These issues often require the use of a variety of scanning techniques (stationary, mobile), but the differences also concern the planning of measurement stations and methods of merging point clouds. Significant differences appear during the analysis of point clouds, especially when modeling objects. Analysis of the selected parameters is already possible based on ad hoc measurements carried out on a point cloud. However, only the construction of three-dimensional models provides complete information about the shape of structures, allows the analysis to be performed at any location, and reduces the amount of stored data. Some structures can be modeled in the form of simple axes, sections, or solids; for others it becomes necessary to create sophisticated models of surfaces, depicting local deformations. The examples selected for the study make it possible to assess the scope of measurement and office work for a variety of uses related to the issue set forth in the title of this study. Additionally, the latest, forward-looking technology was presented: laser scanning performed from Unmanned Aerial Vehicles (drones). Currently, it is basically in the prototype phase, but it might be expected to make significant progress in numerous applications in the field of engineering surveying.

  8. Development of measurement apparatus for high resolution electrical surveys; Komitsudo denki tansa sokuteiki no kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    Moriuchi, H.; Matsuda, Y.; Shiokawa, Y. [Sumiko Consultants Co. Ltd., Tokyo (Japan); Uchino, Y. [Cosmic Co. Ltd., Tokyo (Japan)

    1996-05-01

    To implement the ρa-ρu survey method, which is a type of high-density electrical survey, a multichannel resistivity measuring instrument has been developed. In addition to this method, the instrument supports resistivity tomography and various other kinds of high-density electrical survey. A potential produced by a low-frequency rectangular current of 1 Hz or lower, output by the transmitter of this instrument, is received and measured by the receiver connected to electrodes positioned at 100 or fewer locations. The receiver comprises a scanner that automatically switches from electrode to electrode, a conditioner that processes signals, and a controller. A transmitter of the standard design outputs a maximum voltage of 800 V and a maximum current of 2 A, making the device suitable for probing levels from 50 m to several hundred metres deep. The receiver is operated by a personal computer built into the controller. The newly developed apparatus succeeded in presenting high-precision images of the result of a ρa-ρu analysis for an apparent resistivity section and of the underground structure, verifying the high quality of the data collected by this apparatus. 10 refs., 5 figs., 1 tab.

  9. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to determine whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
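
    The kernel-smoothing idea behind a nonparametric item characteristic curve can be sketched as a Nadaraya-Watson estimator with a Gaussian kernel. This is a simplified illustration, not the TestGraf procedure; the function name, the bandwidth, and the sample data are hypothetical, and the trait estimates are assumed to be given.

```python
import math

def kernel_icc(theta_grid, trait, responses, bandwidth):
    """Kernel-smoothed probability of endorsing an item at each grid point.

    trait: per-respondent trait estimates (assumed already standardized).
    responses: matching 0/1 item responses.
    bandwidth: Gaussian kernel width; larger values give smoother curves.
    """
    curve = []
    for t in theta_grid:
        weights = [math.exp(-0.5 * ((t - x) / bandwidth) ** 2) for x in trait]
        weighted_yes = sum(w * y for w, y in zip(weights, responses))
        curve.append(weighted_yes / sum(weights))
    return curve

# Hypothetical respondents: endorsement becomes more likely at higher trait levels.
trait_levels = [-2.0, -1.0, 0.0, 1.0, 2.0]
item_responses = [0, 0, 0, 1, 1]
icc = kernel_icc([-2.0, 0.0, 2.0], trait_levels, item_responses, bandwidth=1.0)
```

    Each point on the curve is a weighted average of observed responses, so the estimate always stays between 0 and 1 without assuming any parametric shape.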

  10. Rapid Survey For Measuring The Level And Causes Of Maternal Mortality

    Directory of Open Access Journals (Sweden)

    Kumar Rajesh

    1997-01-01

    Research question: What is the extent of the problem of maternal mortality in a given population? Objective: 1. To evolve a rapid survey methodology aimed at measuring the maternal mortality ratio. 2. To find out the probable medical causes of maternal deaths and the behavioural factors associated with them. Study Design: Cross-sectional. Setting: Urban and rural areas of district Mohindergarh, Haryana. Participants: Members of families in which a maternal death had taken place in the last 12 months. Sample size: All 275 deaths among women 15-44 years occurring in the district from 1st April 95 to 31st March 96. Study variables: Age, gravida, parity, literacy, caste, land holding, health care facilities, distance from health centers, mode of conveyance. Statistical Analysis: Rates and ratios. Results: Maternal mortality ratio was estimated to be 275 per 100,000 live births (298 rural and 82 urban). Major causes of death were: sepsis (30%), haemorrhage (21%), abortion (5%), eclampsia (3%) and obstructed labour (3%). Twenty-nine causes of deaths occurred at home and 26% on way to hospital. Out of 59 (93.7%) cases who could avail medical consultation, 61% arranged it within five hours after onset of symptoms, and 78% availed two, 21% three, and 11% four consultations. The survey was completed in three months at a cost of Rs. 54,000. Recommendations: Such rapid surveys should be carried out periodically (every 4-5 years) to monitor the progress in maternal health. Staff of the health department should be involved in carrying out these surveys. This will not only help in reducing the cost of the survey, but information about specific problems of maternal mortality in the area can also be utilized by health staff for taking appropriate action to improve maternal health care.

  11. Survey of use of malaria prevention measures by Canadians visiting India

    Science.gov (United States)

    dos Santos, C C; Anvar, A; Keystone, J S; Kain, K C

    1999-01-01

    BACKGROUND: Imported malaria is an increasing problem, particularly among new immigrant populations. The objective of this study was to determine the malaria prevention measures used by Canadians originating from a malaria-endemic area when returning to visit their country of origin. METHODS: A 35-item English-language questionnaire was administered by interview to travellers at a departure lounge at Pearson International Airport, Toronto, between January and June 1995. Information was collected on subject characteristics, travel itinerary, perceptions about malaria, and pretravel health advice and malaria chemoprophylaxis and barriers to their use. RESULTS: A total of 324 travellers departing on flights to India were approached, of whom 307 (95%) agreed to participate in the study. Participants were Canadian residents of south Asian origin with a mean duration of residence in Canada of 12.8 years. Most of the respondents were returning to visit relatives for a mean visit duration of 6.8 weeks. Although 69% of the respondents thought malaria was a moderate to severe illness and 54% had sought advice before travelling, only 31% intended to use any chemoprophylaxis, and less than 10% were using measures to prevent mosquito bites. Only 7% had been prescribed a recommended drug regimen. Family practitioners were the primary source of information for travellers and were more likely to prescribe an inappropriate chemoprophylactic regimen than were travel clinics or public health centres (76% v. 36%) (p = 0.003). Respondents who had lived in Canada longest and those with a family history of malaria were more likely to use chemoprophylaxis (p < 0.01). INTERPRETATION: Few travellers were using appropriate chemoprophylaxis and mosquito prevention measures. Misconceptions about malaria risk and appropriate prevention measures were the main barriers identified. PMID:9951440

  12. A survey tool for measuring evidence-based decision making capacity in public health agencies

    Directory of Open Access Journals (Sweden)

    Jacobs Julie A

    2012-03-01

    Full Text Available Abstract Background While increasing attention is placed on using evidence-based decision making (EBDM) to improve public health, there is little research assessing the current EBDM capacity of the public health workforce. Public health agencies serve a wide range of populations with varying levels of resources. Our survey tool allows an individual agency to collect data that reflects its unique workforce. Methods Health department leaders and academic researchers collaboratively developed and conducted cross-sectional surveys in Kansas and Mississippi (USA) to assess EBDM capacity. Surveys were delivered to state- and local-level practitioners and community partners working in chronic disease control and prevention. The core component of the surveys was adopted from a previously tested instrument and measured gaps (importance versus availability) in competencies for EBDM in chronic disease. Other survey questions addressed expectations and incentives for using EBDM, self-efficacy in three EBDM skills, and estimates of EBDM within the agency. Results In both states, participants identified communication with policymakers, use of economic evaluation, and translation of research to practice as top competency gaps. Self-efficacy in developing evidence-based chronic disease control programs was lower than in finding or using data. Public health practitioners estimated that approximately two-thirds of programs in their agency were evidence-based. Mississippi participants indicated that health department leaders' expectations for the use of EBDM were approximately twice those of co-workers and that the use of EBDM could be increased with training and leadership prioritization. Conclusions The assessment of EBDM capacity in Kansas and Mississippi built upon previous nationwide findings to identify top gaps in core competencies for EBDM in chronic disease and to estimate a percentage of programs in U.S. health departments that are evidence

  13. Measuring the health of the Indian elderly: evidence from National Sample Survey data

    Directory of Open Access Journals (Sweden)

    Mahal Ajay

    2010-11-01

    Full Text Available Abstract Background Comparable health measures across different sets of populations are essential for describing the distribution of health outcomes and assessing the impact of interventions on these outcomes. Self-reported health (SRH) is a commonly used indicator of health in household surveys and has been shown to be predictive of future mortality. However, the susceptibility of SRH to influence by individuals' expectations complicates its interpretation and undermines its usefulness. Methods This paper applies the empirical methodology of Lindeboom and van Doorslaer (2004) to investigate elderly health in India using data from the 52nd round of the National Sample Survey conducted in 1995-96 that includes both an SRH variable as well as a range of objective indicators of disability and ill health. The empirical testing was conducted on stratified homogeneous groups, based on four factors: gender, education, rural-urban residence, and region. Results We find that region generally has a significant impact on how women perceive their health. Reporting heterogeneity can arise not only from cut-point shifts, but also from differences in health effects by objective health measures. In contrast, we find little evidence of reporting heterogeneity due to differences in gender or educational status within regions. Rural-urban residence does matter in some cases. The findings are robust with different specifications of objective health indicators. Conclusions Our exercise supports the thesis that the region of residence is associated with different cut-points and reporting behavior on health surveys. We believe this is the first paper that applies the Lindeboom-van Doorslaer methodology to data on the elderly in a developing country, showing the feasibility of applying this methodology to data from many existing cross-sectional health surveys.

  14. Population survey sampling methods in a rural African setting: measuring mortality

    Directory of Open Access Journals (Sweden)

    Byass Peter

    2008-05-01

    Full Text Available Abstract Background Population-based sample surveys and sentinel surveillance methods are commonly used as substitutes for more widespread health and demographic monitoring and intervention studies in resource-poor settings. Such methods have been criticised as only being worthwhile if the results can be extrapolated to the surrounding 100-fold population. With an emphasis on measuring mortality, this study explores the extent to which choice of sampling method affects the representativeness of 1% sample data in relation to various demographic and health parameters in a rural, developing-country setting. Methods Data from a large community-based census and health survey conducted in rural Burkina Faso were used as a basis for modelling. Twenty 1% samples incorporating a range of health and demographic parameters were drawn at random from the overall dataset for each of seven different sampling procedures at two different levels of local administrative units. Each sample was compared with the overall 'gold standard' survey results, thus enabling comparisons between the different sampling procedures. Results All sampling methods and parameters tested performed reasonably well in representing the overall population. Nevertheless, a degree of variation could be observed both between sampling approaches and between different parameters, relating to their overall distribution in the total population. Conclusion Sample surveys are able to provide useful demographic and health profiles of local populations. However, various parameters being measured and their distribution within the sampling unit of interest may not all be best represented by a particular sampling method. It is likely therefore that compromises may have to be made in choosing a sampling strategy, with costs, logistics and the intended use of the data being important considerations.
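
    The comparison of repeated 1% samples against a 'gold standard' can be sketched in a few lines. This is only an illustration under stated assumptions: it covers simple random sampling alone (the abstract mentions seven procedures without describing them), the population is a hypothetical list of 0/1 death indicators, and the function name `compare_samples` and its parameters are invented for the example.

```python
import random


def compare_samples(population, fraction=0.01, n_samples=20, seed=0):
    """Draw repeated simple random samples and compare each with the full data.

    population -- list of 0/1 indicators (1 = death recorded), hypothetical data
    Returns the full-population rate and the error of each sample estimate.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    true_rate = sum(population) / len(population)
    size = max(1, round(len(population) * fraction))
    errors = []
    for _ in range(n_samples):
        sample = rng.sample(population, size)  # sampling without replacement
        errors.append(sum(sample) / size - true_rate)
    return true_rate, errors


# Hypothetical population of 10,000 people with a 2% death rate.
population = [1] * 200 + [0] * 9800
true_rate, errors = compare_samples(population)
```

    The spread of `errors` across the twenty draws gives a rough picture of how well one sampling procedure represents the whole population; a fuller comparison would repeat this for each procedure and each parameter.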

  15. A portable UAV LIDAR system for coastal topographic surveys and sea surface measurements

    Science.gov (United States)

    Huang, Zhi-Cheng; Liu, Philip L.-F.; Tseng, Kuo-Hsin; Yeh, Sunny

    2017-04-01

    A light-weight UAV system for coastal topography and coastal sea surface measurements is developed. This system is based on techniques of a multirotor UAV, a light detection and ranging (LIDAR) sensor, an inertial measurement unit, and a real-time kinematic global navigation satellite system (RTK-GNSS). The synchronization and data recording are achieved using LabVIEW. This system can be operated in very low altitude flight within a range of 10 m, which provides very high resolution point cloud data. The performance of this system has been tested and calibrated with known targets. The vertical root-mean-square error is less than about 10 cm, depending on the flight height. Applications of the system, including coastal topographic surveys, tidal elevation measurement, wave measurements, and bottom roughness measurements, are presented and discussed. The tide and wave measurements are compared with in-situ measurements using pressure sensors. The results of the comparison suggest that this system is a useful tool to measure the sea surface elevation and topography. The challenges of applying this system are also discussed.

  16. Revisiting the internal consistency and factorial validity of the 8-item Morisky Medication Adherence Scale

    Directory of Open Access Journals (Sweden)

    Arsène Zongo

    2016-10-01

    Full Text Available Objective: To assess the internal consistency and factorial validity of the adapted French 8-item Morisky Medication Adherence Scale in assessing adherence to noninsulin antidiabetic drug treatment. Study Design and Setting: In a cross-sectional web survey of individuals with type 2 diabetes of the Canadian province of Quebec, self-reported adherence to the antidiabetes drug treatment was measured using the Morisky Medication Adherence Scale-8. We assessed the internal consistency of the Morisky Medication Adherence Scale-8 with Cronbach’s alpha, and factorial validity was assessed by identifying the underlying factors using exploratory factor analyses. Results: A total of 901 individuals completed the survey. Cronbach’s alpha was 0.60. Two factors were identified. One factor comprised five items: stopping medication when diabetes is under control, stopping when feeling worse, feeling hassled about sticking to the prescription, reasons other than forgetting and a cross-loading item (i.e. taking drugs the day before). The second factor comprised three other items that were all related to forgetfulness in addition to the cross-loading item. Conclusion: Cronbach’s alpha of the adapted French Morisky Medication Adherence Scale-8 was below the acceptable value of 0.70. This observed low internal consistency of the scale is probably related to the causal nature of the items of the scale but not necessarily a lack of reliability. The results suggest that the adapted French Morisky Medication Adherence Scale-8 is a two-factor scale assessing intentional (first factor) and unintentional (second factor) non-adherence to the noninsulin antidiabetes drug treatment. The scale could be used to separately identify these outcomes using scores obtained on each of the sub-scales.
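
    For readers unfamiliar with the statistic, Cronbach's alpha can be computed from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below assumes complete data and uses sample variances; the function name and the toy scores are illustrative and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_columns = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy example with three respondents and two items (made-up scores).
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

    A value below 0.70, such as the 0.60 reported above, is conventionally read as low internal consistency, although the abstract argues this need not imply poor reliability for this kind of scale.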

  17. The measurement invariance of job diagnostic survey (JDS) across three university student groups

    Energy Technology Data Exchange (ETDEWEB)

    Martinez-Gomez, M.; Marin-Garcia, J.A.; Girado Omeara, M.

    2016-07-01

    The main purpose of this study is to apply a multigroup confirmatory analysis to examine the measurement invariance (MI) of the adapted version of the Job Diagnostic Survey (JDS) as a measurement tool that analyses the relationship between the features of teaching methodologies and university students’ motivation and satisfaction across data collected on different degrees and academic years. Design/methodology/approach: Confirmatory factor analysis was carried out using a multigroup structural equation model, using the program EQS 6.1 to test the invariance of the adapted version of the JDS in a sample of 535 students of a Spanish public university. The assessment of invariance included the levels of configural, metric, scalar, covariance and latent variables invariance. Several goodness-of-fit measures were assessed... (Author)

  18. Psychometric Properties of a Korean Measure of Person-Directed Care in Nursing Homes

    Science.gov (United States)

    Choi, Jae-Sung; Lee, Minhong

    2014-01-01

    Objective: This study examined the validity and reliability of a person-directed care (PDC) measure for nursing homes in Korea. Method: Managerial personnel from 223 nursing homes in 2010 and 239 in 2012 were surveyed. Results: Item analysis and exploratory factor analysis for the first sample generated a 33-item PDC measure with eight factors.…

  19. Merit Principles Survey 2016 Data

    Data.gov (United States)

    Merit Systems Protection Board — MPS contains a combination of core items that MSPB tracks over time and special-purpose items developed to support a particular special study. This survey differs...

  20. Mistaken identity? Visual similarities of marine debris to natural prey items of sea turtles

    Science.gov (United States)

    2014-01-01

    Background There are two predominant hypotheses as to why animals ingest plastic: 1) they are opportunistic feeders, eating plastic when they encounter it, and 2) they eat plastic because it resembles prey items. To assess which hypothesis is most likely, we created a model sea turtle visual system and used it to analyse debris samples from beach surveys and from necropsied turtles. We investigated colour, contrast, and luminance of the debris items as they would appear to the turtle. We also incorporated measures of texture and translucency to determine which of the two hypotheses is more plausible as a driver of selectivity in green sea turtles. Results Turtles preferred more flexible and translucent items to what was available in the environment, lending support to the hypothesis that they prefer debris that resembles prey, particularly jellyfish. They also ate fewer blue items, suggesting that such items may be less conspicuous against the background of open water where they forage. Conclusions Using visual modelling we determined the characteristics that drive ingestion of marine debris by sea turtles, from the point of view of the turtles themselves. This technique can be utilized to determine debris preferences of other visual predators, and help to more effectively focus management or remediation actions. PMID:24886170

  1. Mistaken identity? Visual similarities of marine debris to natural prey items of sea turtles.

    Science.gov (United States)

    Schuyler, Qamar A; Wilcox, Chris; Townsend, Kathy; Hardesty, B Denise; Marshall, N Justin

    2014-05-09

    There are two predominant hypotheses as to why animals ingest plastic: 1) they are opportunistic feeders, eating plastic when they encounter it, and 2) they eat plastic because it resembles prey items. To assess which hypothesis is most likely, we created a model sea turtle visual system and used it to analyse debris samples from beach surveys and from necropsied turtles. We investigated colour, contrast, and luminance of the debris items as they would appear to the turtle. We also incorporated measures of texture and translucency to determine which of the two hypotheses is more plausible as a driver of selectivity in green sea turtles. Turtles preferred more flexible and translucent items to what was available in the environment, lending support to the hypothesis that they prefer debris that resembles prey, particularly jellyfish. They also ate fewer blue items, suggesting that such items may be less conspicuous against the background of open water where they forage. Using visual modelling we determined the characteristics that drive ingestion of marine debris by sea turtles, from the point of view of the turtles themselves. This technique can be utilized to determine debris preferences of other visual predators, and help to more effectively focus management or remediation actions.

  2. Bayesian item fit analysis for unidimensional item response theory models.

    Science.gov (United States)

    Sinharay, Sandip

    2006-11-01

    Assessing item fit for unidimensional item response theory models for dichotomous items has always been an issue of enormous interest, but there exists no unanimously agreed item fit diagnostic for these models, and hence there is room for further investigation of the area. This paper employs the posterior predictive model-checking method, a popular Bayesian model-checking tool, to examine item fit for the above-mentioned models. An item fit plot, comparing the observed and predicted proportion-correct scores of examinees with different raw scores, is suggested. This paper also suggests how to obtain posterior predictive p-values (which are natural Bayesian p-values) for the item fit statistics of Orlando and Thissen that summarize numerically the information in the above-mentioned item fit plots. A number of simulation studies and a real data application demonstrate the effectiveness of the suggested item fit diagnostics. The suggested techniques seem to have adequate power and reasonable Type I error rate, and psychometricians will find them promising.

  3. Differential Item Functioning Analysis of the 2003-04 NHANES Physical Activity Questionnaire

    Science.gov (United States)

    Gao, Yong; Zhu, Weimo

    2011-01-01

    Using differential item functioning (DIF) analyses, this study examined whether there were any DIF items in the National Health and Nutrition Examination Survey (NHANES) physical activity (PA) questionnaire. A subset of adult data from the 2003-04 NHANES study (n = 3,083) was used. PA items related to respondents' occupational, transportation,…

  4. ABORTION ATTITUDES, 1984-1987-1988 - EFFECTS OF ITEM ORDER AND DIMENSIONALITY

    NARCIS (Netherlands)

    TENVERGERT, E; GILLESPIE, MW; KINGMA, J; KLASEN, H

    1992-01-01

    The comparability of surveys is often hampered by differences in the item order of presentation. The major focus of the present study was to investigate whether a general item or a specific item at the beginning of the questionnaire would affect the endorsement as well as the scalability of a set of

  6. Computer-assisted measurement of perceived stress: an application for a community-based survey.

    Science.gov (United States)

    Kimura, Tomoaki; Uchida, Seiya; Tsuda, Yasutami; Eboshida, Akira

    2005-09-01

    The assessment of stress is a key issue in health promotion policies as well as in treatment strategies for patients. The aim of this study was to confirm the accessibility and reliability of computer-assisted data collection for perceived stress measurement, using the Japanese version of the Perceived Stress Scale (JPSS), within the setting of a community-based survey. There were two groups of participants in this survey. One group responded to a Web-based application, and the other to the VBA of a spreadsheet software. The total scores of JPSS were almost normally distributed. The means of total scores of JPSS were 23.6 and 23.1. These results were lower than the previous study of JPSS. Since Cronbach's alpha coefficients in both surveys were more than 0.8, high reliability was demonstrated despite a number of computer-illiterate and/or aged participants. They felt that the spreadsheet form was easier to respond to. Two components were extracted with the Varimax rotation of principal component analysis, and these were named "perception of stress and stressors" and "behavior to stress". This finding suggests that it is possible to determine sub-scales. From the viewpoint of preventive medicine, it is expected that the JPSS applications will be utilized to investigate the relationship between stress and other factors such as lifestyle, environment and quality of life.

  7. [Perceptions on item disclosure for the Korean medical licensing examination].

    Science.gov (United States)

    Yang, Eunbae B

    2015-09-01

    This study analyzed the perceptions of medical students and faculty regarding disclosure of test items on the Korean medical licensing examination. I conducted a survey of medical students from medical colleges and professional medical schools nationwide. Responses were analyzed from 718 participants as well as 69 faculty members who participated in creating the medical licensing examination item sets. Data were analyzed using descriptive statistics and the chi-square test. It is important to maintain test quality and to keep the test items unavailable to the public. There are also concerns among students that disclosure of test items would prompt increasing difficulty of test items (48.3%). Further, few students found it desirable to disclose test items regardless of any considerations (28.5%). The professors, who had experience in designing the test items, also expressed their opposition to test item disclosure (60.9%). It is desirable not to disclose the test items of the Korean medical licensing examination to the public on the condition that students are provided with a sufficient amount of information regarding the examination. This is so that the exam can appropriately identify candidates with the required qualifications.

  8. Efficient Algorithms for Segmentation of Item-Set Time Series

    Science.gov (United States)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
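
    The dynamic programming scheme outlined above can be sketched as follows. The abstract does not define the measure functions or the segment difference, so this sketch makes two loudly labeled assumptions: the measure function is set union, and the segment difference is the summed size of the symmetric difference between the segment's item set and each time point's item set. The function names are hypothetical, and the cubic-time loop is a plain illustration rather than the efficient algorithms the authors describe.

```python
def segment_difference(series, start, end):
    """Assumed cost of merging series[start:end] into one segment.

    The segment's item set is the union of its time points (assumed measure
    function); the cost counts items that differ at each time point.
    """
    segment_items = set().union(*series[start:end])
    return sum(len(segment_items ^ items) for items in series[start:end])


def optimal_segmentation(series, k):
    """Split `series` (a list of sets) into k contiguous segments of minimal cost.

    Requires 1 <= k <= len(series). Returns (total_cost, [(start, end), ...]).
    """
    n = len(series)
    inf = float("inf")
    # best[j][m]: minimal cost of covering series[:j] with exactly m segments.
    best = [[inf] * (k + 1) for _ in range(n + 1)]
    back = [[0] * (k + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # last segment is series[i:j]
                if best[i][m - 1] == inf:
                    continue
                cost = best[i][m - 1] + segment_difference(series, i, j)
                if cost < best[j][m]:
                    best[j][m] = cost
                    back[j][m] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[j][m]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[n][k], bounds


cost, bounds = optimal_segmentation([{"a"}, {"a"}, {"b"}, {"b"}], k=2)
```

    With the toy series above, the change from item `a` to item `b` is the natural boundary, whereas a split based only on equal numbers of time points would find it by coincidence at best.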

  9. Improving the Validity and Reliability of a Health Promotion Survey for Physical Therapists

    Science.gov (United States)

    Stephens, Jaca L.; Lowman, John D.; Graham, Cecilia L.; Morris, David M.; Kohler, Connie L.; Waugh, Jonathan B.

    2013-01-01

    Purpose Physical therapists (PTs) have a unique opportunity to intervene in the area of health promotion. However, no instrument has been validated to measure PTs’ views on health promotion in physical therapy practice. The purpose of this study was to evaluate the content validity and test-retest reliability of a health promotion survey designed for PTs. Methods An expert panel of PTs assessed the content validity of “The Role of Health Promotion in Physical Therapy Survey” and provided suggestions for revision. Item content validity was assessed using the content validity ratio (CVR) as well as the modified kappa statistic. Therapists then participated in the test-retest reliability assessment of the revised health promotion survey, which was assessed using a weighted kappa statistic. Results Based on feedback from the expert panelists, significant revisions were made to the original survey. The expert panel reached at least a majority consensus agreement for all items in the revised survey and the survey-CVR improved from 0.44 to 0.66. Only one item on the revised survey had substantial test-retest agreement, with 55% of the items having moderate agreement and 43% poor agreement. Conclusions All items on the revised health promotion survey demonstrated at least fair validity, but few items had reasonable test-retest reliability. Further modifications should be made to strengthen the validity and improve the reliability of this survey. PMID:23754935
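
    The content validity ratio is commonly computed with Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size. The sketch below assumes that formula; how the study aggregated item values into its survey-CVR is not stated in the abstract, so the simple mean used in `survey_cvr` is an assumption, and the panel counts are made up.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR for one item: ranges from -1 (no one) to +1 (everyone)."""
    half = n_panelists / 2
    return (n_essential - half) / half


def survey_cvr(essential_counts, n_panelists):
    """Mean item CVR across a survey (assumed aggregation, for illustration)."""
    ratios = [content_validity_ratio(n, n_panelists) for n in essential_counts]
    return sum(ratios) / len(ratios)


# Hypothetical panel of 10 experts rating three items.
overall = survey_cvr([10, 8, 5], n_panelists=10)
```

    A CVR of 0 means exactly half the panel rated the item essential, so positive values correspond to the majority agreement mentioned in the results.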

  10. Item bias in self-reported functional ability among 75-year-old men and women in three Nordic localities

    DEFF Research Database (Denmark)

    Avlund, K; Era, P; Davidsen, M

    1996-01-01

    The purpose of this article is to analyse item bias in a measure of self-reported functional ability among 75-year-old people in three Nordic localities. The present item bias analysis examines whether the construction of a functional ability index from several variables results in bias in relation...... to geographical locality and gender. Information about self-reported functional ability was gathered from surveys on 75-year-old men and women in Glostrup (Denmark), Göteborg (Sweden) and Jyväskylä (Finland). The data were collected by structured home interviews about mobility and physical activities of daily...

  11. Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults.

    Science.gov (United States)

    Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Yutaka, Ono; Furukawa, Toshiaki A

    2017-01-01

    Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: "none of the time," "a little of the time," "some of the time," "most of the time," and "all of the time." The pattern of total score distribution and item responses were analyzed using graphical analysis and exponential regression model. The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from "a little of the time" to "all of the time" on log-normal scales, while "none of the time" response was not related to this exponential pattern. The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
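
    An exponential pattern of the kind described here is linear on a log scale, so one simple way to estimate its rate parameter is ordinary least squares on the logarithm of the frequencies. This is a sketch of that general idea, not the authors' exact regression procedure; the function name and the synthetic frequencies are assumptions, and the caller is expected to drop the lower end of the distribution, which the abstract says does not follow the pattern.

```python
import math


def fit_exponential(scores, counts):
    """Fit counts ~ scale * exp(-rate * score) by least squares on log(counts).

    All counts must be positive. Returns (rate, scale).
    """
    logs = [math.log(c) for c in counts]
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic frequencies that decay exponentially with the total score.
scores = list(range(1, 10))
counts = [1000 * math.exp(-0.3 * s) for s in scores]
rate, scale = fit_exponential(scores, counts)
```

    Comparing the fitted `rate` across subsamples is one way to check the claim that they share similar rate parameters.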

  12. 10 CFR 74.55 - Item monitoring.

    Science.gov (United States)

    2010-01-01

    ..., except for reactor components measuring at least one meter in length and weighing in excess of 30... 10 Energy 2 2010-01-01 2010-01-01 false Item monitoring. 74.55 Section 74.55 Energy NUCLEAR REGULATORY COMMISSION (CONTINUED) MATERIAL CONTROL AND ACCOUNTING OF SPECIAL NUCLEAR MATERIAL...

  13. Item Feature Effects in Evolution Assessment

    Science.gov (United States)

    Nehm, Ross H.; Ha, Minsu

    2011-01-01

    Despite concerted efforts by science educators to understand patterns of evolutionary reasoning in science students and teachers, the vast majority of evolution education studies have failed to carefully consider or control for item feature effects in knowledge measurement. Our study explores whether robust contextualization patterns emerge within…

  15. Development and Validation of an Instrument for Assessing Climate Change Knowledge and Perceptions: The Climate Stewardship Survey (CSS)

    OpenAIRE

    Scott L. WALKER; McNeal, Karen S

    2013-01-01

    The Climate Stewardship Survey (CSS) was developed to measure knowledge and perceptions of global climate change, while also considering information sources that respondents ‘trust.’ The CSS was drafted using a three-stage approach: development of salient scales, writing individual items, and field testing and analyses. Construct validity and alpha reliability analyses were conducted on the 122-item test instrument to produce a refined 84-item CSS. The field tested C...

  16. The Body Appreciation Scale-2: item refinement and psychometric evaluation.

    Science.gov (United States)

    Tylka, Tracy L; Wood-Barcalow, Nichole L

    2015-01-01

    Considered a positive body image measure, the 13-item Body Appreciation Scale (BAS; Avalos, Tylka, & Wood-Barcalow, 2005) assesses individuals' acceptance of, favorable opinions toward, and respect for their bodies. While the BAS has accrued psychometric support, we improved it by rewording certain BAS items (to eliminate sex-specific versions and body dissatisfaction-based language) and developing additional items based on positive body image research. In three studies, we examined the reworded, newly developed, and retained items to determine their psychometric properties among college and online community (Amazon Mechanical Turk) samples of 820 women and 767 men. After exploratory factor analysis, we retained 10 items (five original BAS items). Confirmatory factor analysis upheld the BAS-2's unidimensionality and invariance across sex and sample type. Its internal consistency, test-retest reliability, and construct (convergent, incremental, and discriminant) validity were supported. The BAS-2 is a psychometrically sound positive body image measure applicable for research and clinical settings.

  17. Differential Item Functioning Analyses in Large-Scale Educational Surveys: Key Concepts and Modeling Approaches for Secondary Analysts

    Directory of Open Access Journals (Sweden)

    朱小姝 Xiao-Shu Zhu

    2011-03-01

    Many educational surveys employ a multi-stage sampling design for students, which makes use of stratification and/or clustering of population units, as well as a complex booklet design for items from an item pool. In these surveys, the reliable detection of item bias or differential item functioning (DIF) across student groups is a key component for ensuring fair representations of different student groups. In this paper, we describe several modeling approaches that can be useful for detecting DIF in educational surveys. We illustrate the key ideas by investigating the performance of six hierarchical generalized linear models (HGLMs) in a small simulation study and by applying them to real data from the Trends in International Mathematics and Science Study (TIMSS), where we use them to investigate potential uniform gender DIF.
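
    As a point of reference, uniform DIF is often written as a group main effect in a Rasch-type logistic model. The formulation below is a generic textbook sketch and not necessarily one of the six HGLMs compared in the paper; the symbols $\theta_p$ (ability of student $p$), $b_i$ (difficulty of item $i$), $G_p$ (group indicator, e.g. gender) and $\gamma_i$ (DIF effect for item $i$) are notation assumed here.

```latex
\[
\operatorname{logit} \Pr(Y_{pi} = 1 \mid \theta_p, G_p) = \theta_p - b_i + \gamma_i G_p
\]
```

    Under this sketch, item $i$ shows uniform DIF when $\gamma_i \neq 0$: students of equal ability but different group membership have different odds of a correct response, and the shift is constant across the ability range.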

  18. Measuring infertility in populations: constructing a standard definition for use with demographic and reproductive health surveys.

    Science.gov (United States)

    Mascarenhas, Maya N; Cheung, Hoiwan; Mathers, Colin D; Stevens, Gretchen A

    2012-08-31

    Infertility is a significant disability, yet there are no reliable estimates of its global prevalence. Studies on infertility prevalence define the condition inconsistently, rendering the comparison of studies or quantitative summaries of the literature difficult. This study analyzed key components of infertility to develop a definition that can be consistently applied to globally available household survey data. We proposed a standard definition of infertility and used it to generate prevalence estimates using 53 Demographic and Health Surveys (DHS). The analysis was restricted to the subset of DHS that contained detailed fertility information collected through the reproductive health calendar. We performed sensitivity analyses for key components of the definition and used these to inform our recommendations for each element of the definition. Exposure type (couple status, contraceptive use, and intent), exposure time, and outcomes were key elements of the definition that we proposed. Our definition produced estimates that ranged from 0.6% to 3.4% for primary infertility and 8.7% to 32.6% for secondary infertility. Our sensitivity analyses showed that using an exposure measure of five years is less likely to misclassify fertile unions as infertile. Additionally, using a current, rather than continuous, measure of contraceptive use over five years resulted in a median relative error in secondary infertility of 20.7% (interquartile range of relative error [IQR]: 12.6%-26.9%), while not incorporating intent produced a corresponding error in secondary infertility of 58.2% (IQR: 44.3%-67.9%). In order to estimate the global burden of infertility, prevalence estimates using a consistent definition need to be generated. Our analysis provided a recommended definition that could be applied to widely available global household data. We also summarized potential biases that should be considered when making estimates of infertility prevalence using household survey data.

  19. Polytomous latent scales for the investigation of the ordering of items

    NARCIS (Netherlands)

    Ligtvoet, R.; van der Ark, L.A.; Bergsma, W. P.; Sijtsma, K.

    2011-01-01

    We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering prope

  20. IRT Item Parameters and the Reliability and Validity of Pretest, Posttest, and Gain Scores

    Science.gov (United States)

    May, Kim; Jackson, Tameika S.

    2005-01-01

    The effect of different combinations of item response theory (IRT) item parameters (item difficulty, item discrimination, and the guessing probability) on the reliability and construct validity (correlation with the latent trait being measured) of pretest, posttest, and gain scores was analytically examined using the 3-parameter logistic (3PL)…
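
    For orientation, the 3PL item response function that these three parameters enter can be sketched as follows. This is the standard textbook form, not code from the study; the function name and the example parameter values are illustrative assumptions, and some formulations include an additional scaling constant (often 1.7) that is omitted here.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3-parameter logistic model.

    theta: latent trait (ability) of the examinee
    a: item discrimination
    b: item difficulty
    c: guessing probability (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Illustrative values only: when ability equals difficulty, the probability
# lies halfway between the guessing level c and 1.
print(p_correct_3pl(theta=0.0, a=1.2, b=0.0, c=0.2))
```

    With very low ability the probability approaches the guessing parameter c rather than zero, which is what distinguishes the 3PL from the 1PL and 2PL models.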

  1. Testing Three-Item Versions for Seven of Young's Maladaptive Schema

    Science.gov (United States)

    Blau, Gary; DiMino, John; Sheridan, Natalie; Pred, Robert S.; Beverly, Clyde; Chessler, Marcy

    2015-01-01

    The Young Schema Questionnaire (YSQ) in either long-form (205-item) or short-form (75-item or 90-item) versions has demonstrated its clinical usefulness for assessing early maladaptive schemas. However, even a 75- or 90-item "short form", particularly when combined with other measures, can represent a lengthy…

  2. GRE Verbal Analogy Items: Examinee Reasoning on Items.

    Science.gov (United States)

    Duran, Richard P.; And Others

    Information about how Graduate Record Examination (GRE) examinees solve verbal analogy problems was obtained in this study through protocol analysis. High- and low-ability subjects who had recently taken the GRE General Test were asked to "think aloud" as they worked through eight analogy items. These items varied factorially on the…

  3. Measuring population health: costs of alternative survey approaches in the Nouna Health and Demographic Surveillance System in rural Burkina Faso

    Directory of Open Access Journals (Sweden)

    Henrike Lietz

    2015-08-01

    Background: There are more than 40 Health and Demographic Surveillance System (HDSS) sites in 19 different countries. The running costs of HDSS sites are high. The financing of HDSS activities is of major importance, and adding external health surveys to the HDSS is challenging. To investigate ways of improving data quality and collection efficiency in the Nouna HDSS in Burkina Faso, the stand-alone data collection activities of the HDSS and the Household Morbidity Survey (HMS) were integrated, and the paper-based questionnaires were consolidated into a single tablet-based questionnaire, the Comprehensive Disease Assessment (CDA). Objective: The aims of this study are to estimate and compare the implementation costs of the two different survey approaches for measuring population health. Design: All financial costs of the stand-alone (HDSS and HMS) and integrated (CDA) surveys were estimated from the perspective of the implementing agency. Fixed and variable costs of survey implementation and key cost drivers were identified. The costs per household visit were calculated for both survey approaches. Results: While fixed costs of survey implementation were similar for the two survey approaches, there were considerable variations in variable costs, resulting in an estimated annual cost saving of about US$45,000 under the integrated survey approach. This was primarily because the costs of data management for the tablet-based CDA survey were considerably lower than for the paper-based stand-alone surveys. The cost per household visit from the integrated survey approach was US$21 compared with US$25 from the stand-alone surveys for collecting the same amount of information from 10,000 HDSS households. Conclusions: The CDA tablet-based survey method appears to be feasible and efficient for collecting health and demographic data in the Nouna HDSS in rural Burkina Faso. The possibility of using the tablet-based data collection platform to improve the quality

  4. Measuring the success of electronic medical record implementation using electronic and survey data.

    Science.gov (United States)

    Keshavjee, K; Troyan, S; Holbrook, A M; VanderMolen, D

    2001-01-01

    Computerization of physician practices is increasing. Stakeholders are demanding demonstrated value for their Electronic Medical Record (EMR) implementations. We developed survey tools to measure medical office processes, including administrative and physician tasks pre- and post-EMR implementation. We included variables that were expected to improve with EMR implementation and those that were not expected to improve, as controls. We measured the same processes pre-EMR, at six months and 18 months post-EMR. Time required for most administrative tasks decreased within six months of EMR implementation. Staff time spent on charting increased with time, in keeping with our anecdotal observations that nurses were given more responsibility for charting in many offices. Physician time to chart increased initially by 50%, but went down to original levels by 18 months. However, this may be due to the drop-out of those physicians who had a difficult time charting electronically.

  5. Developing and testing a new measure of staff nurse clinical leadership: the clinical leadership survey.

    Science.gov (United States)

    Patrick, Allison; Laschinger, Heather K Spence; Wong, Carol; Finegan, Joan

    2011-05-01

    To test the psychometric properties of a newly developed measure of staff nurse clinical leadership derived from Kouzes and Posner's model of transformational leadership. While nurses have been recognized for their essential role in keeping patients safe, there has been little empirical research that has examined clinical leadership at the staff nurse level. A non-experimental survey design was used to test the psychometric properties of the clinical leadership survey (CLS). Four hundred and eighty registered nurses (RNs) providing direct patient care in Ontario acute care hospitals returned useable questionnaires. Confirmatory factor analysis provided preliminary evidence for the construct validity for the new measure of staff nurse clinical leadership. Structural empowerment fully mediated the relationship between nursing leadership and staff nurse clinical leadership. The results provide encouraging evidence for the construct validity of the CLS. Nursing administrators must create empowering work environments to ensure staff nurses have access to work structures which enable them to enact clinical leadership behaviours while providing direct patient care. © 2011 The Authors. Journal compilation © 2011 Blackwell Publishing Ltd.

  6. Clustering of Sloan Digital Sky Survey III Photometric Luminous Galaxies: The Measurement, Systematics and Cosmological Implications

    CERN Document Server

    Ho, Shirley; Seo, Hee-Jong; de Putter, Roland; Ross, Ashley J; White, Martin; Padmanabhan, Nikhil; Saito, Shun; Schlegel, David J; Schlafly, Eddie; Seljak, Uros; Hernandez-Monteagudo, Carlos; Sanchez, Ariel G; Percival, Will J; Blanton, Michael; Skibba, Ramin; Schneider, Don; Reid, Beth; Mena, Olga; Viel, Matteo; Eisenstein, Daniel J; Prada, Francisco; Weaver, Benjamin; Bahcall, Neta; Bizyaev, Dimitry; Brewinton, Howard; Brinkman, Jon; da Costa, Luiz Nicolaci; Gott, John R; Malanushenko, Elena; Malanushenko, Viktor; Nichol, Bob; Oravetz, Daniel; Pan, Kaike; Palanque-Delabrouille, Nathalie; Ross, Nicholas P; Simmons, Audrey; de Simoni, Fernando; Snedden, Stephanie; Yeche, Christophe

    2012-01-01

    The Sloan Digital Sky Survey (SDSS) surveyed 14,555 square degrees, and delivered over a trillion pixels of imaging data. We present a study of galaxy clustering using 900,000 luminous galaxies with photometric redshifts, spanning between $z=0.45$ and $z=0.65$, constructed from the SDSS using methods described in Ross et al. (2011). This data-set spans 11,000 square degrees and probes a volume of $3\,h^{-3}\,\mathrm{Gpc}^3$, making it the largest volume ever used for galaxy clustering measurements. We present a novel treatment of the observational systematics and its applications to the clustering signals from the data set. In this paper, we measure the angular clustering using an optimal quadratic estimator at 4 redshift slices with an accuracy of ~15% with bin size of $\Delta\ell = 10$ on scales of the Baryon Acoustic Oscillations (BAO) (at $\ell \sim 40$-$400$). We derive cosmological constraints using the full-shape of the power-spectra. For a flat $\Lambda$CDM model, when combined with Cosmic Microwave Background Wilkinson Microw...

  7. Spectral Classification and Redshift Measurement for the SDSS-III Baryon Oscillation Spectroscopic Survey

    CERN Document Server

    Bolton, Adam S; Aubourg, Eric; Bailey, Stephen; Bhardwaj, Vaishali; Brownstein, Joel R; Burles, Scott; Chen, Yan-Mei; Gunn, James E; Dawson, Kyle; Eisenstein, Daniel J; Knapp, G R; Loomis, Craig P; Lupton, Robert H; Maraston, Claudia; Muna, Demitri; Myers, Adam D; Olmstead, Matthew D; Padmanabhan, Nikhil; Paris, Isabelle; Percival, Will J; Petitjean, Patrick; Rockosi, Constance M; Ross, Nicholas P; Schneider, Donald P; Shu, Yiping; Strauss, Michael A; Thomas, Daniel; Tremonti, Christy A; Wake, David A; Weaver, Benjamin A; Wood-Vasey, W Michael

    2012-01-01

    (abridged) We describe the automated spectral classification, redshift determination, and parameter measurement pipeline in use for the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey III (SDSS-III) as of Data Release 9, encompassing 831,000 moderate-resolution optical spectra. We give a review of the algorithms employed, and describe the changes to the pipeline that have been implemented for BOSS relative to previous SDSS-I/II versions, including new sets of stellar, galaxy, and quasar redshift templates. For the color-selected CMASS sample of massive galaxies at redshift $0.4 \lesssim z \lesssim 0.8$ targeted by BOSS for the purposes of large-scale cosmological measurements, the pipeline achieves an automated classification success rate of 98.7% and confirms 95.4% of unique CMASS targets as galaxies (with the balance being mostly M stars). Based on visual inspections of a subset of BOSS galaxies, we find that ~0.2% of confidently reported CMASS sample classifications and redshifts are...

  8. The measurement invariance of job diagnostic survey (JDS) across three university student groups

    Directory of Open Access Journals (Sweden)

    Monica Martinez-Gomez

    2016-02-01

    Purpose: The main purpose of this study is to apply multigroup confirmatory factor analysis to examine the measurement invariance (MI) of the adapted version of the Job Diagnostic Survey (JDS) as a measurement tool that analyses the relationship between the features of teaching methodologies and university students' motivation and satisfaction, across data collected in different degrees and academic years. Design/methodology/approach: Confirmatory factor analysis was carried out using a multigroup structural equation model, with the program EQS 6.1, to test the invariance of the adapted version of the JDS in a sample of 535 students at a Spanish public university. The assessment of invariance included the configural, metric, scalar, covariance and latent variable levels. Several goodness-of-fit measures were assessed. Findings: The results show that measurements are equivalent at the configural, metric, covariance and latent factor levels of invariance. Although the hypothesis of scalar invariance is rejected, the results suggest that the JDS shows partial strict invariance and has satisfactory psychometric properties in all samples. Research limitations/implications: The sample is limited to university students aged between 18 and 30 and to a questionnaire on teaching methodology and students' satisfaction in the context of a Spanish university; generalization to other questionnaires or populations should be tested with specific data. Furthermore, the sample size is rather small. Originality/value: In the current process of change taking place in universities under the plan developed by the European Space of Higher Education, which focuses on increasing student skills, validated instruments such as the JDS satisfaction scale are necessary to evaluate students' satisfaction with new active methodologies. These findings are useful for researchers since they add the first sample in which the MI of a student’s satisfaction survey

  9. The VIMOS Public Extragalactic Redshift Survey: Measuring the growth rate of structure around cosmic voids

    CERN Document Server

    Hawken, A J; Iovino, A; Guzzo, L; Peacock, J A; de la Torre, S; Garilli, B; Bolzonella, M; Scodeggio, M; Abbas, U; Adami, C; Bottini, D; Cappi, A; Cucciati, O; Davidzon, I; Fritz, A; Franzetti, P; Krywult, J; Brun, V Le; Fevre, O Le; Maccagni, D; Małek, K; Marulli, F; Polletta, M; Pollo, A; Tasca, L A M; Tojeiro, R; Vergani, D; Zanichelli, A; Arnouts, S; Bel, J; Branchini, E; De Lucia, G; Ilbert, O; Moscardini, L; Percival, W J

    2016-01-01

    We identified voids in the completed VIMOS Public Extragalactic Redshift Survey (VIPERS), using an algorithm based on searching for empty spheres. We measured the cross-correlation between the centres of voids and the complete galaxy catalogue. The cross-correlation function exhibits a clear anisotropy in both VIPERS fields (W1 and W4), which is characteristic of linear redshift space distortions. By measuring the projected cross-correlation and then deprojecting it we are able to estimate the undistorted cross-correlation function. We propose that given a sufficiently well measured cross-correlation function one should be able to measure the linear growth rate of structure by applying a simple linear Gaussian streaming model for the redshift space distortions (RSD). Our study of voids in 306 mock galaxy catalogues mimicking the VIPERS fields would suggest that VIPERS is capable of measuring $\beta$ with an error of around $25\%$. Applying our method to the VIPERS data, we find a value for the redshift space ...

  10. Making high-accuracy null depth measurements for the LBTI exozodi survey

    Science.gov (United States)

    Mennesson, Bertrand; Defrère, Denis; Nowak, Matthias; Hinz, Philip; Millan-Gabet, Rafael; Absil, Olivier; Bailey, Vanessa; Bryden, Geoffrey; Danchi, William; Kennedy, Grant M.; Marion, Lindsay; Roberge, Aki; Serabyn, Eugene; Skemer, Andy J.; Stapelfeldt, Karl; Weinberger, Alycia J.; Wyatt, Mark

    2016-08-01

    The characterization of exozodiacal light emission is important both for understanding the evolution of planetary systems and for preparing future space missions aiming to characterize low mass planets in the habitable zone of nearby main sequence stars. The Large Binocular Telescope Interferometer (LBTI) exozodi survey aims at providing a ten-fold improvement over the current state of the art, measuring dust emission levels down to a typical accuracy of 12 zodis per star, for a representative ensemble of 30+ high priority targets. Such measurements promise to yield a final accuracy of about 2 zodis on the median exozodi level of the target sample. Reaching a 1σ measurement uncertainty of 12 zodis per star corresponds to measuring interferometric cancellation ("null") levels, i.e., visibilities, at the few 100 ppm uncertainty level. We discuss here the challenges posed by making such high accuracy mid-infrared visibility measurements from the ground and present the methodology we developed for achieving current best levels of 500 ppm or so. We also discuss current limitations and plans for enhanced exozodi observations over the next few years at LBTI.

  11. Improved characterisation of measurement errors in electrical resistivity tomography (ERT) surveys

    Science.gov (United States)

    Tso, C. H. M.; Binley, A. M.; Kuras, O.; Graham, J.

    2016-12-01

    Measurement errors can play a pivotal role in geophysical inversion. Most inverse models require users to prescribe a statistical model of data errors before inversion. Wrongly prescribed error levels can lead to over- or under-fitting of data, yet commonly used models of measurement error are relatively simplistic. With the heightened interest in uncertainty estimation across hydrogeophysics, better characterisation and treatment of measurement errors is needed to provide more reliable estimates of uncertainty. We have analysed two time-lapse electrical resistivity tomography (ERT) datasets; one contains 96 sets of direct and reciprocal data collected from a surface ERT line within a 24 h timeframe, while the other is a year-long cross-borehole survey at a UK nuclear site with over 50,000 daily measurements. Our study included the characterisation of the spatial and temporal behaviour of measurement errors using autocorrelation and covariance analysis. We find that, in addition to well-known proportionality effects, ERT measurements can also be sensitive to the combination of electrodes used. This agrees with reported speculation in previous literature that ERT errors could be somewhat correlated. Based on these findings, we develop a new error model that allows grouping based on electrode number in addition to fitting a linear model to transfer resistance. The new model fits the observed measurement errors better and shows superior inversion and uncertainty estimates in synthetic examples. It is robust, because it groups errors together based on the number of the four electrodes used to make each measurement. The new model can be readily applied to the diagonal data weighting matrix commonly used in classical inversion methods, as well as to the data covariance matrix in the Bayesian inversion framework. We demonstrate its application using extensive ERT monitoring datasets from the two aforementioned sites.
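
    The idea of fitting a linear error model to transfer resistance separately for each group of measurements can be illustrated with a deliberately simplified sketch. It is not the authors' model: the grouping key, the record layout, the function names and the synthetic numbers are all assumptions made for illustration, and an ordinary least-squares line per group stands in for whatever estimator the paper actually uses.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("need at least two distinct x values per group")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_grouped_error_model(records):
    """Fit |error| = a + b * |R| separately for each group.

    records: iterable of (group, transfer_resistance, error) tuples, where
    `group` is a hypothetical label such as an electrode number.
    Returns {group: (a, b)}.
    """
    grouped = defaultdict(list)
    for group, resistance, error in records:
        grouped[group].append((abs(resistance), abs(error)))
    return {
        group: fit_line([r for r, _ in pts], [e for _, e in pts])
        for group, pts in grouped.items()
    }


# Synthetic, noise-free records: group "A" follows 0.01 + 0.02|R|,
# group "B" follows 0.05 + 0.10|R|.
records = [("A", r, 0.01 + 0.02 * abs(r)) for r in (1.0, 2.0, -4.0, 8.0)]
records += [("B", r, 0.05 + 0.10 * abs(r)) for r in (1.0, 2.0, 4.0, -8.0)]
model = fit_grouped_error_model(records)
for group, (a, b) in sorted(model.items()):
    print(group, round(a, 4), round(b, 4))
```

    In this sketch, the fitted coefficients for a measurement's group would supply its predicted error, which could then populate a diagonal data weighting matrix of the kind the abstract mentions.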

  12. Making Meaningful Measurement in Survey Research: A Demonstration of the Utility of the Rasch Model. IR Applications. Volume 28

    Science.gov (United States)

    Royal, Kenneth D.

    2010-01-01

    Quality measurement is essential in every form of research, including institutional research and assessment. This paper addresses the erroneous assumptions institutional researchers often make with regard to survey research and provides an alternative method to producing more valid and reliable measures. Rasch measurement models are discussed and…

  13. A Mixed Effects Randomized Item Response Model

    Science.gov (United States)

    Fox, J.-P.; Wyrick, Cheryl

    2008-01-01

    The randomized response technique ensures that individual item responses, denoted as true item responses, are randomized before observing them and so-called randomized item responses are observed. A relationship is specified between randomized item response data and true item response data. True item response data are modeled with a (non)linear…

  14. Computerized adaptive testing with item cloning

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; van der Linden, Willem J.

    2003-01-01

    To increase the number of items available for adaptive testing and reduce the cost of item writing, the use of techniques of item cloning has been proposed. An important consequence of item cloning is possible variability between the item parameters. To deal with this variability, a multilevel item

  15. Item Veto: Dangerous Constitutional Tinkering.

    Science.gov (United States)

    Bellamy, Calvin

    1989-01-01

    In theory, the item veto would empower the President to remove wasteful and unnecessary projects from legislation. Yet, despite its history at the state level, the item veto is a loosely defined concept that may not work well at the federal level. Much more worrisome is the impact on the balance of power. (Author/CH)

  16. Using Rasch modeling to measure acculturation in youth.

    Science.gov (United States)

    Davis, Melinda F; Adam, Mary; Carvajal, Scott; Sechrest, Lee; Reyna, Valerie F

    2011-01-01

    Ethnic differences in health outcomes are assumed to reflect levels of acculturation, among other factors. Health surveys frequently include language and social interaction items taken from existing acculturation instruments. This study evaluated the dimensionality of responses to typical bilinear items in Latino youth using Rasch modeling. Two seven-item scales measuring Anglo-Hispanic orientation were adapted from Marin and Gamba (1996) and Cuellar, Arnold, and Maldonado (1995). Most of the items fit the Rasch model. However, there were gaps in both the Hispanic and Anglo scales. The Anglo items were not well targeted for the sample because most students reported they always spoke English. The lack of variability found in a heterogeneous sample of Latino youth has negative implications for the common practice of relying on language as a measure of acculturation. Acculturation instruments for youth probably need more sensitive items to discriminate linguistic differences, or to measure other factors.

  17. Item banking to improve, shorten and computerize self-reported fatigue: an illustration of steps to create a core item bank from the FACIT-Fatigue Scale.

    Science.gov (United States)

    Lai, Jin-shei; Cella, David; Chang, Chih-Hung; Bode, Rita K; Heinemann, Allen W

    2003-08-01

    Fatigue is a common symptom among cancer patients and the general population. Due to its subjective nature, fatigue has been difficult to assess effectively and efficiently. Modern computerized adaptive testing (CAT) can enable precise assessment of fatigue using a small number of items from a fatigue item bank. CAT enables brief assessment by selecting questions from an item bank that provide the maximum amount of information given a person's previous responses. This article illustrates steps to prepare such an item bank, using 13 items from the Functional Assessment of Chronic Illness Therapy Fatigue Subscale (FACIT-F) as the basis. Samples included 1022 cancer patients and 1010 people from the general population. An Item Response Theory (IRT)-based rating scale model, a polytomous extension of the Rasch dichotomous model, was used. Nine items demonstrating acceptable psychometric properties were selected and positioned on the fatigue continuum. The fatigue levels measured by these nine items along with their response categories covered 66.8% of the general population and 82.6% of the cancer patients. Although the operational CAT algorithms to handle polytomously scored items are still in progress, we illustrated how CAT may work by using nine core items to measure level of fatigue. Using this illustration, a fatigue measure comparable to its full-length 13-item scale administration was obtained using four items. The resulting item bank can serve as a core to which will be added a psychometrically sound and operational item bank covering the entire fatigue continuum.
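
    The maximum-information selection rule described in this abstract can be sketched for the simpler dichotomous Rasch case. This is an illustrative simplification, not the polytomous rating scale model used in the article; the item bank difficulties, the function names and the provisional trait estimate are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a dichotomous Rasch item: p * (1 - p)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)


def select_next_item(theta, difficulties, administered):
    """Return the index of the unused item with maximum information at theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        return None  # item bank exhausted
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))


# Hypothetical bank of five item difficulties on the trait continuum.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
first = select_next_item(0.3, bank, administered=set())
second = select_next_item(0.3, bank, administered={first})
print(first, second)
```

    Under the Rasch model, information peaks where item difficulty matches the current trait estimate, so the rule amounts to picking the unused item whose difficulty is closest to that estimate. In a full CAT the estimate would be updated after each response before the next selection.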

  18. Continuous Online Item Calibration: Parameter Recovery and Item Utilization.

    Science.gov (United States)

    Ren, Hao; van der Linden, Wim J; Diao, Qi

    2017-06-01

    Parameter recovery and item utilization were investigated for different designs for online test item calibration. The design was adaptive in a double sense: it assumed both adaptive testing of examinees from an operational pool of previously calibrated items and adaptive assignment of field-test items to the examinees. Four criteria of optimality for the assignment of the field-test items were used, each of them based on the information in the posterior distributions of the examinee's ability parameter during adaptive testing as well as the sequentially updated posterior distributions of the field-test item parameters. In addition, different stopping rules based on target values for the posterior standard deviations of the field-test parameters and the size of the calibration sample were used. The impact of each of the criteria and stopping rules on the statistical efficiency of the estimates of the field-test parameters and on the time spent by the items in the calibration procedure was investigated. Recommendations as to the practical use of the designs are given.

  19. Spectral Classification and Redshift Measurement for the SDSS-III Baryon Oscillation Spectroscopic Survey

    Science.gov (United States)

    Bolton, Adam S.; Schlegel, David J.; Aubourg, Éric; Bailey, Stephen; Bhardwaj, Vaishali; Brownstein, Joel R.; Burles, Scott; Chen, Yan-Mei; Dawson, Kyle; Eisenstein, Daniel J.; Gunn, James E.; Knapp, G. R.; Loomis, Craig P.; Lupton, Robert H.; Maraston, Claudia; Muna, Demitri; Myers, Adam D.; Olmstead, Matthew D.; Padmanabhan, Nikhil; Pâris, Isabelle; Percival, Will J.; Petitjean, Patrick; Rockosi, Constance M.; Ross, Nicholas P.; Schneider, Donald P.; Shu, Yiping; Strauss, Michael A.; Thomas, Daniel; Tremonti, Christy A.; Wake, David A.; Weaver, Benjamin A.; Wood-Vasey, W. Michael

    2012-11-01

    We describe the automated spectral classification, redshift determination, and parameter measurement pipeline in use for the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey III (SDSS-III) as of the survey's ninth data release (DR9), encompassing 831,000 moderate-resolution optical spectra. We give a review of the algorithms employed, and describe the changes to the pipeline that have been implemented for BOSS relative to previous SDSS-I/II versions, including new sets of stellar, galaxy, and quasar redshift templates. For the color-selected "CMASS" sample of massive galaxies at redshift $0.4 \lesssim z \lesssim 0.8$ targeted by BOSS for the purposes of large-scale cosmological measurements, the pipeline achieves an automated classification success rate of 98.7% and confirms 95.4% of unique CMASS targets as galaxies (with the balance being mostly M stars). Based on visual inspections of a subset of BOSS galaxies, we find that approximately 0.2% of confidently reported CMASS sample classifications and redshifts are incorrect, and about 0.4% of all CMASS spectra are objects unclassified by the current algorithm which are potentially recoverable. The BOSS pipeline confirms that ~51.5% of the quasar targets have quasar spectra, with the balance mainly consisting of stars and low signal-to-noise spectra. Statistical (as opposed to systematic) redshift errors propagated from photon noise are typically a few tens of km s$^{-1}$ for both galaxies and quasars, with a significant tail to a few hundreds of km s$^{-1}$ for quasars. We test the accuracy of these statistical redshift error estimates using repeat observations, finding them underestimated by a factor of 1.19-1.34 for galaxies and by a factor of two for quasars. We assess the impact of sky-subtraction quality, signal-to-noise ratio, and other factors on galaxy redshift success. Finally, we document known issues with the BOSS DR9 spectroscopic data set and describe directions of ongoing development.

  20. Newfound compassion after prostate cancer: a psychometric evaluation of additional items in the Posttraumatic Growth Inventory.

    Science.gov (United States)

    Morris, Bronwyn A; Wilson, Bridget; Chambers, Suzanne K

    2013-12-01

    The most widely used measure of posttraumatic growth (PTG) is the Posttraumatic Growth Inventory (PTGI). Qualitative research indicates the importance of increased compassion as a result of struggling with challenges presented by cancer and treatments. However, current PTG measures may not adequately assess compassion. A cross-sectional survey of 514 prostate cancer survivors assessed the PTGI and Dispositional Positive Emotional Scale (DPES). Five additional PTG items were derived from previous qualitative research to assess increased compassion. After removing eight items with complex loadings, a principal components analysis with oblimin rotation revealed a six-component structure. A clear delineation was seen between components relating to compassion, new possibilities, relating to others, personal strength, appreciation of life and spiritual change. Compassion accounted for 48.9% of variance in data, with the overall model accounting for 79.9% of variance. Strong factorability was demonstrated through Kaiser-Meyer-Olkin (0.92) and Bartlett's test of sphericity (approximate χ² = 5,791.85, df = 153, p […]). Item-to-total correlations and inter-item correlations exceeded accepted thresholds of 0.50 and 0.30, respectively. Convergent validity was acceptable between the PTGI compassion subscale and DPES (r = 0.50). Compassion is a highly salient PTG domain after prostate cancer. Further studies can explore this construct with more heterogeneous samples of cancer types and gender.

  1. CLUSTERING OF SLOAN DIGITAL SKY SURVEY III PHOTOMETRIC LUMINOUS GALAXIES: THE MEASUREMENT, SYSTEMATICS, AND COSMOLOGICAL IMPLICATIONS

    Energy Technology Data Exchange (ETDEWEB)

    Ho, Shirley; White, Martin; Schlegel, David J.; Seljak, Uros; Reid, Beth [Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, MS 50R-5045, Berkeley, CA 94720 (United States); Cuesta, Antonio; Padmanabhan, Nikhil [Yale Center for Astronomy and Astrophysics, Yale University, New Haven, CT 06511 (United States); Seo, Hee-Jong [Berkeley Center for Cosmological Physics, LBL and Department of Physics, University of California, Berkeley, CA 94720 (United States); De Putter, Roland [ICC, University of Barcelona (IEEC-UB), Marti i Franques 1, E-08028 Barcelona (Spain); Ross, Ashley J.; Percival, Will J. [Institute of Cosmology and Gravitation, Dennis Sciama Building, University of Portsmouth, Portsmouth PO1 3FX (United Kingdom); Saito, Shun [Department of Astronomy, University of California Berkeley, CA (United States); Schlafly, Eddie [Department of Astronomy, Harvard University, 60 Garden St. MS 20, Cambridge, MA 02138 (United States); Hernandez-Monteagudo, Carlos [Centro de Estudios de Fisica del Cosmos de Aragon (CEFCA), Plaza de San Juan 1, planta 2, E-44001 Teruel (Spain); Sanchez, Ariel G. [Max-Planck-Institut fuer Extraterrestrische Physik, Giessenbachstrasse 1, D-85748 Garching (Germany); Blanton, Michael [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States); Skibba, Ramin [Steward Observatory, University of Arizona, 933 N. Cherry Avenue, Tucson, AZ 85721 (United States); Schneider, Don [Department of Astronomy and Astrophysics, The Pennsylvania State University, University Park, PA 16802 (United States); Mena, Olga [Instituto de Fisica Corpuscular, Universidad de Valencia-CSIC (Spain); Viel, Matteo, E-mail: cwho@lbl.gov [INAF-Osservatorio Astronomico di Trieste, Via G. B. Tiepolo 11, I-34131 Trieste (Italy); and others

    2012-12-10

    The Sloan Digital Sky Survey (SDSS) surveyed 14,555 deg², and delivered over a trillion pixels of imaging data. We present a study of galaxy clustering using 900,000 luminous galaxies with photometric redshifts, spanning between z = 0.45 and z = 0.65, constructed from the SDSS using methods described in Ross et al. This data set spans 11,000 deg² and probes a volume of 3 h⁻³ Gpc³, making it the largest volume ever used for galaxy clustering measurements. We describe in detail the construction of the survey window function and various systematics affecting our measurement. With such a large volume, high-precision cosmological constraints can be obtained given careful control and understanding of the observational systematics. We present a novel treatment of the observational systematics and its applications to the clustering signals from the data set. In this paper, we measure the angular clustering using an optimal quadratic estimator at four redshift slices with an accuracy of ~15%, with a bin size of δ_l = 10 on scales of the baryon acoustic oscillations (BAOs; at l ~ 40-400). We also apply corrections to the power spectra due to systematics and derive cosmological constraints using the full shape of the power spectra. For a flat ΛCDM model, when combined with cosmic microwave background Wilkinson Microwave Anisotropy Probe 7 (WMAP7) and H₀ constraints from using 600 Cepheids observed by Wide Field Camera 3 (WFC3; HST), we find Ω_Λ = 0.73 ± 0.019 and H₀ to be 70.5 ± 1.6 km s⁻¹ Mpc⁻¹. For an open ΛCDM model, when combined with WMAP7 + HST, we find Ω_K = 0.0035 ± 0.0054, improved over WMAP7+HST alone by 40%. For a wCDM model, when combined with WMAP7+HST+SN, we find w = -1.071 ± 0.078, and H₀ to be 71.3 ± 1.7 km s⁻¹ Mpc⁻¹, which is competitive with the latest large-scale structure constraints from large spectroscopic

  2. The basics of item response theory using R

    CERN Document Server

    Baker, Frank B

    2017-01-01

    This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item re...

  3. An item factor analysis and item response theory-based revision of the Everyday Discrimination Scale.

    Science.gov (United States)

    Stucky, Brian D; Gottfredson, Nisha C; Panter, A T; Daye, Charles E; Allen, Walter R; Wightman, Linda F

    2011-04-01

    The Everyday Discrimination Scale (EDS), a widely used measure of daily perceived discrimination, is purported to be unidimensional, to function well among African Americans, and to have adequate construct validity. Two separate studies and data sources were used to examine and cross-validate the psychometric properties of the EDS. In Study 1, an exploratory factor analysis was conducted on a sample of African American law students (N = 589), providing strong evidence of local dependence, or nuisance multidimensionality within the EDS. In Study 2, a separate nationally representative community sample (N = 3,527) was used to model the identified local dependence in an item factor analysis (i.e., bifactor model). Next, item response theory (IRT) calibrations were conducted to obtain item parameters. A five-item, revised-EDS was then tested for gender differential item functioning (in an IRT framework). Based on these analyses, a summed score to IRT-scaled score translation table is provided for the revised-EDS. Our results indicate that the revised-EDS is unidimensional, with minimal differential item functioning, and retains predictive validity consistent with the original scale.

  4. A Survey of Channel Measurements and Models for Current and Future Railway Communication Systems

    Directory of Open Access Journals (Sweden)

    Paul Unterhuber

    2016-01-01

    Modern society demands cheaper, more efficient, and safer public transport. These enhancements, especially an increase in efficiency and safety, are accompanied by huge amounts of data traffic that need to be handled by wireless communication systems. Hence, wireless communications inside and outside trains are key technologies to achieve these efficiency and safety goals for railway operators in a cost-efficient manner. This paper briefly describes the wireless technologies currently used in the railway domain and points out possible directions for future wireless systems. Channel measurements and models for wireless propagation are surveyed and their suitability in railway environments is investigated. Identified gaps are pointed out and solutions to fill those gaps for wireless communication links in railway environments are proposed.

  5. Selected items from the Charcot-Marie-Tooth (CMT) Neuropathy Score and secondary clinical outcome measures serve as sensitive clinical markers of disease severity in CMT1A patients.

    Science.gov (United States)

    Mannil, Manoj; Solari, Alessandra; Leha, Andreas; Pelayo-Negro, Ana L; Berciano, José; Schlotter-Weigel, Beate; Walter, Maggie C; Rautenstrauss, Bernd; Schnizer, Tuuli J; Schenone, Angelo; Seeman, Pavel; Kadian, Chandini; Schreiber, Olivia; Angarita, Natalia G; Fabrizi, Gian Maria; Gemignani, Franco; Padua, Luca; Santoro, Lucio; Quattrone, Aldo; Vita, Giuseppe; Calabrese, Daniela; Young, Peter; Laurà, Matilde; Haberlová, Jana; Mazanec, Radim; Paulus, Walter; Beissbarth, Tim; Shy, Michael E; Reilly, Mary M; Pareyson, Davide; Sereda, Michael W

    2014-11-01

    This study evaluates primary and secondary clinical outcome measures in Charcot-Marie-Tooth disease type 1A (CMT1A) with regard to their contribution towards discrimination of disease severity. The nine components of the composite Charcot-Marie-Tooth disease Neuropathy Score (CMTNS) and six additional secondary clinical outcome measures were assessed in 479 adult patients with genetically proven CMT1A and 126 healthy controls. Using hierarchical clustering, we identified four significant clusters of patients according to clinical severity. We then tested the impact of each of the CMTNS components and of the secondary clinical parameters with regard to their power to differentiate these four clusters. The CMTNS components ulnar sensory nerve action potential (SNAP), pin sensibility, vibration and strength of arms did not increase the discriminant value of the remaining five CMTNS components (ulnar compound motor action potential [CMAP], leg motor symptoms, arm motor symptoms, leg strength and sensory symptoms). However, three of the six additional clinical outcome measures - the 10m-timed walking test (T10MW), 9 hole-peg test (9HPT), and foot dorsal flexion dynamometry - further improved discrimination between severely and mildly affected patients. From these findings, we identified three different composite measures as score hypotheses and compared their discriminant power with that of the CMTNS. A composite of eight components (CMAP, motor symptoms legs, motor symptoms arms, strength of legs, sensory symptoms) displayed the strongest power to discriminate between the clusters. In conclusion, five items from the CMTNS and three secondary clinical outcome measures improve the clinical assessment of patients with CMT1A significantly and are beneficial for upcoming clinical and therapeutic trials. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  7. Self-Report Measures of the Home Learning Environment in Large Scale Research: Measurement Properties and Associations with Key Developmental Outcomes

    Science.gov (United States)

    Niklas, Frank; Nguyen, Cuc; Cloney, Daniel S.; Tayler, Collette; Adams, Raymond

    2016-01-01

    Favourable home learning environments (HLEs) support children's literacy, numeracy and social development. In large-scale research, HLE is typically measured by self-report survey, but there is little consistency between studies and many different items and latent constructs are observed. Little is known about the stability of these items and…

  8. Mathematical-programming approaches to test item pool design

    NARCIS (Netherlands)

    Veldkamp, Bernard P.; van der Linden, Willem J.; Ariel, A.

    2002-01-01

    This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming

  9. Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

    Science.gov (United States)

    Yang, Ji Seung; Hansen, Mark; Cai, Li

    2012-01-01

    Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…

  10. Multidimensional Linking for Tests with Mixed Item Types

    Science.gov (United States)

    Yao, Lihua; Boughton, Keith

    2009-01-01

    Numerous assessments contain a mixture of multiple choice (MC) and constructed response (CR) item types and many have been found to measure more than one trait. Thus, there is a need for multidimensional dichotomous and polytomous item response theory (IRT) modeling solutions, including multidimensional linking software. For example,…

  11. Measuring the mental health care system responsiveness: results of an outpatient survey in Tehran

    Directory of Open Access Journals (Sweden)

    Setareh Forouzan

    2016-01-01

    As explained by the World Health Organisation (WHO) in 2000, the concept of health system responsiveness is one of the core goals of health systems. Since 2000, further efforts have been made to measure health system responsiveness and the factors affecting responsiveness, yet few studies have applied responsiveness concepts to the evaluation of mental health systems. The present study aims to measure responsiveness and its related domains in the mental health care system of Tehran. Utilising the same method used by the WHO for its responsiveness survey, responsiveness for outpatient mental health care was evaluated using a validated Farsi questionnaire. A sample of 500 public mental health service users in Tehran participated and subsequently completed the questionnaire. On average, 47% of participants reported experiencing poor responsiveness. Among responsiveness domains, confidentiality and dignity were the best performing factors while autonomy, access to care and quality of basic amenities were the worst performing. Respondents who reported their social status as low were more likely to experience poor responsiveness overall. Autonomy, quality of basic amenities and clear communication were responsiveness dimensions that performed poorly but were considered to be important by study participants. In summary, the study suggests that measuring responsiveness could provide guidance for further development of mental health care systems to become more patient orientated and provide patients with more respect.

  12. Measurements of Greenhouse Gases around the Sacramento Area: The Airborne Greenhouse Emissions Survey (AGES) Campaign

    Science.gov (United States)

    Karion, A.; Fischer, M. L.; Turnbull, J. C.; Sweeney, C.; Faloona, I. C.; Zagorac, N.; Guilderson, T. P.; Saripalli, S.; Sherwood, T.

    2009-12-01

    The state of California is leading the United States by enacting legislation (AB-32) to reduce greenhouse gas emissions to 1990 levels by 2020. The success of reduction efforts can be gauged with accurate emissions inventories and potentially verified with atmospheric measurements of greenhouse gases (GHGs) over time. Measurements of multiple GHGs and associated trace gas species in a specific region also provide information on emissions ratios for source apportionment. We conducted the Airborne Greenhouse Emissions Survey (AGES) campaign to determine emissions signature ratios for the sources that exist in the San Francisco Bay and Sacramento Valley areas. Specifically, we attempt to determine the emissions signatures of sources that influence ongoing measurements made at a tall-tower measurement site near Walnut Grove, CA. For two weeks in February and March of 2009, a Cessna 210 was flown throughout the Sacramento region, making continuous measurements of CO2, CH4, and CO while also sampling discrete flasks for a variety of additional tracers, including SF6, N2O, and 14C in CO2 (Δ14CO2). Flight paths were planned using wind predictions for each day to maximize sampling of sources whose emissions would also be sampled contemporaneously by the instrumentation at the Walnut Grove tower (WGC), part of the ongoing California Greenhouse Gas Emissions Measurement (CALGEM) project between NOAA/ESRL’s Carbon Cycle group and Lawrence Berkeley National Laboratory (LBNL). Flights were performed in two distinct patterns: 1) flying across a plume upwind and downwind of the Sacramento urban area, and 2) flying across the Sacramento-San Joaquin Delta from Richmond to Walnut Grove, a region consisting of natural wetlands as well as several power plants and refineries. Results show a variety of well-correlated mixing ratio signals downwind of Sacramento, documenting the urban signature emission ratios, while emissions ratios in the Delta region were more variable, likely due

  13. The Long-Term Conditions Questionnaire: conceptual framework and item development

    Directory of Open Access Journals (Sweden)

    Peters M

    2016-08-01

    initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey. Keywords: long-term conditions, conceptual framework, qualitative interviews, patient-reported outcome measure

  14. A Measurement of the Correlation of Galaxy Surveys with CMB Lensing Convergence Maps from the South Pole Telescope

    CERN Document Server

    Bleem, L E; Holder, G P; Aird, K A; Armstrong, R; Ashby, M L N; Becker, M R; Benson, B A; Biesiadzinski, T; Brodwin, M; Busha, M T; Carlstrom, J E; Chang, C L; Cho, H M; Crawford, T M; Crites, A T; de Haan, T; Desai, S; Dobbs, M A; Doré, O; Dudley, J; Geach, J E; George, E M; Gladders, M D; Gonzalez, A H; Halverson, N W; Harrington, N; High, F W; Holden, B P; Holzapfel, W L; Hoover, S; Hrubes, J D; Joy, M; Keisler, R; Knox, L; Lee, A T; Leitch, E M; Lueker, M; Luong-Van, D; Marrone, D P; Martinez-Manso, J; McMahon, J J; Mehl, J; Meyer, S S; Mohr, J J; Montroy, T E; Natoli, T; Padin, S; Plagge, T; Pryke, C; Reichardt, C L; Rest, A; Ruhl, J E; Saliwanchik, B R; Sayre, J T; Schaffer, K K; Shaw, L; Shirokoff, E; Spieler, H G; Stalder, B; Stanford, S A; Staniszewski, Z; Stark, A A; Stern, D; Story, K; Vallinotto, A; Vanderlinde, K; Vieira, J D; Wechsler, R H; Williamson, R; Zahn, O

    2012-01-01

    We compare cosmic microwave background lensing convergence maps derived from South Pole Telescope (SPT) data with galaxy survey data from the Blanco Cosmology Survey, the Wide-field Infrared Survey Explorer, and a new large Spitzer/IRAC field designed to overlap with the SPT survey. Using optical and infrared catalogs covering between 17 and 68 square degrees of sky, we detect correlation between the SPT convergence maps and each of the galaxy density maps at >4 sigma, with zero cross-correlation robustly ruled out in all cases. The amplitude and shape of the cross-power spectra are in good agreement with theoretical expectations and the measured galaxy bias is consistent with previous work. The detections reported here utilize a small fraction of the full 2500 square degree SPT survey data and serve as both a proof of principle of the technique and an illustration of the potential of this emerging cosmological probe.

  15. How social processes distort measurement: the impact of survey nonresponse on estimates of volunteer work in the United States.

    Science.gov (United States)

    Abraham, Katharine G; Presser, Stanley; Helms, Sara

    2009-01-01

    The authors argue that both the large variability in survey estimates of volunteering and the fact that survey estimates do not show the secular decline common to other social capital measures are caused by the greater propensity of those who do volunteer work to respond to surveys. Analyses of the American Time Use Survey (ATUS)--the sample for which is drawn from the Current Population Survey (CPS)--together with the CPS volunteering supplement show that CPS respondents who become ATUS respondents report much more volunteering in the CPS than those who become ATUS nonrespondents. This difference is replicated within subgroups. Consequently, conventional adjustments for nonresponse cannot correct the bias. Although nonresponse leads to estimates of volunteer activity that are too high, it generally does not affect inferences about the characteristics of volunteers.

  16. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    Science.gov (United States)

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism.
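
    The "discrimination and severity parameters" mentioned above can be illustrated with a minimal sketch. The abstract does not say which polytomous model was fitted; the example assumes Samejima's graded response model, one common choice for ordered response categories. The function name and the parameter values are hypothetical and are not taken from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level of the respondent (e.g. optimism)
    a          -- item discrimination (assumed positive)
    thresholds -- increasing severity/threshold parameters b_1 < ... < b_m
    Returns m + 1 probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = m + 1).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical item with three categories, evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])
```

    A larger `a` makes the category curves steeper, so the item separates nearby trait levels more sharply; thresholds spread across the trait range are what allow a set of items to cover the whole spectrum of the latent trait.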

  17. The Measure of a Nation: The USDA and the Rise of Survey Methodology

    Science.gov (United States)

    Mahoney, Kevin T.; Baker, David B.

    2007-01-01

    Survey research has played a major role in American social science. An outgrowth of efforts by the United States Department of Agriculture in the 1930s, the Division of Program Surveys (DPS) played an important role in the development of survey methodology. The DPS was headed by the ambitious and entrepreneurial Rensis Likert, populated by young…

  18. The academic medical center linear disability score (ALDS) item bank: item response theory analysis in a mixed patient population

    NARCIS (Netherlands)

    Holman, Rebecca; Weisscher, Nadine; Glas, Cornelis A.W.; Dijkgraaf, Marcel G.W.; Vermeulen, Martinus; de Haan, Rob J.; Lindeboom, Robert

    2005-01-01

    Background: Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient-relevant outcomes. However, few item banks have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This

  19. Development of the emergency physician job satisfaction measurement instrument.

    Science.gov (United States)

    Lloyd, S; Streiner, D; Hahn, E; Shannon, S

    1994-01-01

    The objective of this study was to develop a valid and reliable instrument to measure the job satisfaction of physicians practicing emergency medicine. A prospective survey involving four separate stages (an item evaluation and reduction stage, a factor analysis stage, a construct validity stage, and a reliability stage) was distributed in Canada to full-time emergency physicians. Three separate survey instruments were administered (an initial draft instrument with 228 items, a pilot instrument with 142 items, and the final instrument with 79 items). Construct validity of the final instrument was tested by evaluating the correlation between physician scores on the instrument, and scores on two instruments measuring the same construct, and three measuring different but related constructs. A draft instrument with 228 items and six hypothetical domains was tested on 61 physicians. Evaluation for frequency endorsement, redundancy, and homogeneity reduced the item pool to 157. The remaining 157 items were used as a pilot instrument and tested on 223 physicians. Factor analysis eliminated 66 items from the pilot instrument, creating a final instrument with 79 items, 11 factors, and six domains. Cronbach's coefficient alpha for the final instrument domains is 0.81, and all domain-total correlations are greater than 0.4. All correlations between the final instrument and the construct validity instruments were statistically significant (P [...] job satisfaction, which is both internally consistent and stable.

  20. Why we love or hate our cars: A qualitative approach to the development of a quantitative user experience survey.

    Science.gov (United States)

    Tonetto, Leandro Miletto; Desmet, Pieter M A

    2016-09-01

    This paper presents a more ecologically valid way of developing theory-based item questionnaires for measuring user experience. In this novel approach, items were generated using the natural and domain-specific language of the research population, which seems to have made the survey much more sensitive to real experiences than theory-based ones. The approach was applied in a survey that measured car experience. Ten in-depth interviews were conducted with drivers inside their cars. The resulting transcripts were analysed with the aim of capturing their natural utterances for expressing their car experience. This analysis resulted in 71 categories of answers. For each category, one sentence was selected to serve as a survey item. In an online platform, 538 respondents answered the survey. Data reliability, tested with Cronbach's alpha, was 0.94, suggesting a survey with highly reliable results to measure drivers' appraisals of their cars.
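
    For readers unfamiliar with the reliability statistic quoted above, the following is a minimal sketch of how Cronbach's alpha is computed from a respondents-by-items score table. The function name and the toy data are hypothetical; the study's own data and software are not described in the abstract.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    where k is the number of items and sample variances are used throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*scores))  # transpose rows into per-item columns
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: three respondents answering two items (illustrative only).
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

    Alpha approaches 1 when items rise and fall together across respondents, which is why a value such as 0.94 is read as high internal consistency.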