WorldWideScience

Sample records for assessment item format

  1. Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function

    DEFF Research Database (Denmark)

    Liegl, Gregor; Gandek, Barbara; Fischer, H. Felix

    2017-01-01

    precision between the short forms using different item formats. Results: Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side...

  2. Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment

    Directory of Open Access Journals (Sweden)

    KLAUS D. KUBINGER

    2007-12-01

    Multiple choice response formats are problematical as an item is often scored as solved simply because the test-taker is a lucky guesser. Instead of applying pertinent IRT models which take guessing effects into account, a pragmatic approach of re-conceptualizing multiple choice response formats to reduce the chance of lucky guessing is considered. This paper compares the free response format with two different multiple choice formats. A common multiple choice format with a single correct response option and five distractors (“1 of 6”) is used, as well as a multiple choice format with five response options, of which any number of the five is correct and the item is only scored as mastered if all the correct response options and none of the wrong ones are marked (“x of 5”). An experiment was designed, using pairs of items with exactly the same content but different response formats. 173 test-takers were randomly assigned to two test booklets of 150 items altogether. Rasch model analyses adduced a fitting item pool, after the deletion of 39 items. The resulting item difficulty parameters were used for the comparison of the different formats. The multiple choice format “1 of 6” differs significantly from “x of 5”, with a relative effect of 1.63, while the multiple choice format “x of 5” does not significantly differ from the free response format. Therefore, the lower degree of difficulty of items with the “1 of 6” multiple choice format is an indicator of relevant guessing effects. In contrast, the “x of 5” multiple choice format can be seen as an appropriate substitute for free response format.
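
    The gap in guessing chances between the two multiple choice formats can be illustrated with a little arithmetic. The sketch below is not taken from the paper; it assumes a test-taker who guesses blindly and picks every admissible marking pattern with equal probability. The function names and the `allow_none_correct` switch are hypothetical.

```python
from fractions import Fraction


def blind_guess_single(n_options: int) -> Fraction:
    """Chance of hitting the single correct option among n_options."""
    return Fraction(1, n_options)


def blind_guess_pattern(n_options: int, allow_none_correct: bool = True) -> Fraction:
    """Chance of marking exactly the right subset of n_options.

    Assumption: every admissible marking pattern is equally likely.
    """
    patterns = 2 ** n_options  # each option is either marked or not
    if not allow_none_correct:
        patterns -= 1  # drop the pattern in which nothing is marked
    return Fraction(1, patterns)


p_1_of_6 = blind_guess_single(6)    # "1 of 6" format
p_x_of_5 = blind_guess_pattern(5)   # "x of 5" format
print(p_1_of_6, p_x_of_5)
```

    Under these assumptions a blind guess succeeds far less often in the "x of 5" format than in the "1 of 6" format, which is in line with the abstract's reading of the lower "1 of 6" difficulty as a guessing effect.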

  3. Varying the item format improved the range of measurement in patient-reported outcome measures assessing physical function.

    Science.gov (United States)

    Liegl, Gregor; Gandek, Barbara; Fischer, H Felix; Bjorner, Jakob B; Ware, John E; Rose, Matthias; Fries, James F; Nolte, Sandra

    2017-03-21

    Physical function (PF) is a core patient-reported outcome domain in clinical trials in rheumatic diseases. Frequently used PF measures have ceiling effects, leading to large sample size requirements and low sensitivity to change. In most of these instruments, the response category that indicates the highest PF level is the statement that one is able to perform a given physical activity without any limitations or difficulty. This study investigates whether using an item format with an extended response scale, allowing respondents to state that the performance of an activity is easy or very easy, increases the range of precise measurement of self-reported PF. Three five-item PF short forms were constructed from the Patient-Reported Outcomes Measurement Information System (PROMIS®) wave 1 data. All forms included the same physical activities but varied in item stem and response scale: format A ("Are you able to …"; "without any difficulty"/"unable to do"); format B ("Does your health now limit you …"; "not at all"/"cannot do"); format C ("How difficult is it for you to …"; "very easy"/"impossible"). Each short-form item was answered by 2217-2835 subjects. We evaluated unidimensionality and estimated a graded response model for the 15 short-form items and remaining 119 items of the PROMIS PF bank to compare item and test information for the short forms along the PF continuum. We then used simulated data for five groups with different PF levels to illustrate differences in scoring precision between the short forms using different item formats. Sufficient unidimensionality of all short-form items and the original PF item bank was supported. Compared to formats A and B, format C increased the range of reliable measurement by about 0.5 standard deviations on the positive side of the PF continuum of the sample, provided more item information, and was more useful in distinguishing known groups with above-average functioning. Using an item format with an extended

  4. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    Science.gov (United States)

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  5. Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

    Science.gov (United States)

    Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

    2013-07-01

    Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

  6. Item Response Theory for Peer Assessment

    Science.gov (United States)

    Uto, Masaki; Ueno, Maomi

    2016-01-01

    As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…

  7. The Effects of Item Format and Cognitive Domain on Students' Science Performance in TIMSS 2011

    Science.gov (United States)

    Liou, Pey-Yan; Bulut, Okan

    2017-12-01

    The purpose of this study was to examine eighth-grade students' science performance in terms of two test design components, item format, and cognitive domain. The portion of Taiwanese data came from the 2011 administration of the Trends in International Mathematics and Science Study (TIMSS), one of the major international large-scale assessments in science. The item difficulty analysis was initially applied to show the proportion of correct items. A regression-based cumulative link mixed modeling (CLMM) approach was further utilized to estimate the impact of item format, cognitive domain, and their interaction on the students' science scores. The results of the proportion-correct statistics showed that constructed-response items were more difficult than multiple-choice items, and that the reasoning cognitive domain items were more difficult compared to the items in the applying and knowing domains. In terms of the CLMM results, students tended to obtain higher scores when answering constructed-response items as well as items in the applying cognitive domain. When the two predictors and the interaction term were included together, the directions and magnitudes of the predictors on student science performance changed substantially. Plausible explanations for the complex nature of the effects of the two test-design predictors on student science performance are discussed. The results provide practical, empirical-based evidence for test developers, teachers, and stakeholders to be aware of the differential function of item format, cognitive domain, and their interaction in students' science performance.
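
    The proportion-correct statistic mentioned in this abstract is simple to compute. The following is a minimal sketch with made-up 0/1 scores, not the TIMSS data or the authors' code; `proportion_correct` and `scores` are hypothetical names.

```python
def proportion_correct(responses):
    """Return the share of correct answers per item.

    responses: list of rows, one per student, each a list of 0/1 item scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


# Made-up scores: 4 students by 3 items.
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
p_values = proportion_correct(scores)
print(p_values)
```

    A lower proportion correct indicates a harder item, so in this toy data the second item is the most difficult.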

  8. Using Automated Processes to Generate Test Items And Their Associated Solutions and Rationales to Support Formative Feedback

    Directory of Open Access Journals (Sweden)

    Mark Gierl

    2015-08-01

    Automatic item generation is the process of using item models to produce assessment tasks using computer technology. An item model is similar to a template that highlights the elements in the task that must be manipulated to produce new items. The purpose of our study is to describe an innovative method for generating large numbers of diverse and heterogeneous items along with their solutions and associated rationales to support formative feedback. We demonstrate the method by generating items in two diverse content areas, mathematics and nonverbal reasoning.

  9. Assessing difference between classical test theory and item ...

    African Journals Online (AJOL)

    Assessing difference between classical test theory and item response theory methods in scoring primary four multiple choice objective test items. ... All research participants were ranked on the CTT number correct scores and the corresponding IRT item pattern scores from their performance on the PRISMADAT. Wilcoxon ...

  10. Writing, Evaluating and Assessing Data Response Items in Economics.

    Science.gov (United States)

    Trotman-Dickenson, D. I.

    1989-01-01

    Describes some of the problems in writing data response items in economics for use by A Level and General Certificate of Secondary Education (GCSE) students. Examines the experience of two series of workshops on writing items, evaluating them and assessing responses from schools. Offers suggestions for producing packages of data response items as…

  11. An Investigation of Item Type in a Standards-Based Assessment.

    Directory of Open Access Journals (Sweden)

    Liz Hollingworth

    2007-12-01

    Large-scale state assessment programs use both multiple-choice and open-ended items on tests for accountability purposes. Certainly, there is an intuitive belief among some educators and policy makers that open-ended items measure something different than multiple-choice items. This study examined two item formats in custom-built, standards-based tests of achievement in Reading and Mathematics at grades 3-8. In this paper, we raise questions about the value of including open-ended items, given scoring costs, time constraints, and the higher probability of missing data from test-takers.

  12. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    Science.gov (United States)

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  13. Applying Item Response Theory methods to design a learning progression-based science assessment

    Science.gov (United States)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats: Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all
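
    The first design step named in this abstract, defining level boundaries as the means of item thresholds, can be sketched as follows. This is an illustration under assumed inputs, not the study's procedure: the item names and threshold values are invented, and every item is assumed to have the same number of ordered thresholds.

```python
from statistics import mean


def level_boundaries(item_thresholds):
    """Average the k-th threshold across items to get the k-th level boundary.

    item_thresholds: dict mapping item name -> list of thresholds on the IRT
    scale, ordered from the lowest to the highest level (equal lengths assumed).
    """
    n_thresholds = len(next(iter(item_thresholds.values())))
    return [
        mean(thresholds[k] for thresholds in item_thresholds.values())
        for k in range(n_thresholds)
    ]


# Invented thresholds for three hypothetical items.
thresholds = {
    "item_a": [-1.0, 0.0, 1.0],
    "item_b": [-0.5, 0.5, 1.5],
    "item_c": [-1.5, 0.1, 1.1],
}
boundaries = level_boundaries(thresholds)
print(boundaries)
```

    A student's estimated ability could then be compared against these boundaries to assign a learning progression level; how the "good items" are chosen is left open here.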

  14. Item Response Theory Models for Wording Effects in Mixed-Format Scales

    Science.gov (United States)

    Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu

    2015-01-01

    Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…

  15. Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

    Science.gov (United States)

    Wan, Lei; Henly, George A.

    2012-01-01

    Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…

  16. IRT-Estimated Reliability for Tests Containing Mixed Item Formats

    Science.gov (United States)

    Shu, Lianghua; Schwarz, Richard D.

    2014-01-01

    As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…

  17. Goodness-of-Fit Assessment of Item Response Theory Models

    Science.gov (United States)

    Maydeu-Olivares, Alberto

    2013-01-01

    The article provides an overview of goodness-of-fit assessment methods for item response theory (IRT) models. It is now possible to obtain accurate p-values of the overall fit of the model if bivariate information statistics are used. Several alternative approaches are described. As the validity of inferences drawn on the fitted model…

  18. Advanced Marketing Core Curriculum. Test Items and Assessment Techniques.

    Science.gov (United States)

    Smith, Clifton L.; And Others

    This document contains duties and tasks, multiple-choice test items, and other assessment techniques for Missouri's advanced marketing core curriculum. The core curriculum begins with a list of 13 suggested textbook resources. Next, nine duties with their associated tasks are given. Under each task appears one or more citations to appropriate…

  19. The 12-item World Health Organization Disability Assessment Schedule II (WHO-DAS II): a nonparametric item response analysis

    Directory of Open Access Journals (Sweden)

    Fernandez Ana

    2010-05-01

    Background: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. Methods: The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to know whether women had a higher probability than men of endorsing each item. Results: Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. Conclusions: All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.

  20. Assessing Differential Item Functioning on the Test of Relational Reasoning

    Directory of Open Access Journals (Sweden)

    Denis Dumas

    2018-03-01

    The test of relational reasoning (TORR) is designed to assess the ability to identify complex patterns within visuospatial stimuli. The TORR is designed for use in school and university settings, and therefore, its measurement invariance across diverse groups is critical. In this investigation, a large sample, representative of a major university on key demographic variables, was collected, and the resulting data were analyzed using a multi-group, multidimensional item-response theory model-comparison procedure. No significant differential item functioning was found on any of the TORR items across any of the demographic groups of interest. This finding is interpreted as evidence of the cultural fairness of the TORR, and potential test-development choices that may have contributed to that cultural fairness are discussed.

  1. Modeling Composite Assessment Data Using Item Response Theory

    Science.gov (United States)

    Ueckert, Sebastian

    2018-01-01

    Composite assessments aim to combine different aspects of a disease in a single score and are utilized in a variety of therapeutic areas. The data arising from these evaluations are inherently discrete with distinct statistical properties. This tutorial presents the framework of the item response theory (IRT) for the analysis of this data type in a pharmacometric context. The article considers both conceptual (terms and assumptions) and practical questions (modeling software, data requirements, and model building). PMID:29493119

  2. Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

    Science.gov (United States)

    Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

    2016-01-01

    In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…

  3. Item response theory analysis of Working Alliance Inventory, revised response format, and new Brief Alliance Inventory.

    Science.gov (United States)

    Mallinckrodt, Brent; Tekie, Yacob T

    2016-11-01

    The Working Alliance Inventory (WAI) has made great contributions to psychotherapy research. However, studies suggest the 7-point response format and 3-factor structure of the client version may have psychometric problems. This study used Rasch item response theory (IRT) to (a) improve WAI response format, (b) compare two brief 12-item versions (WAI-sr; WAI-s), and (c) develop a new 16-item Brief Alliance Inventory (BAI). Archival data from 1786 counseling center and community clients were analyzed. IRT findings suggested problems with crossed category thresholds. A rescoring scheme that combines neighboring responses to create 5- and 4-point scales sharply reduced these problems. Although subscale variance was reduced by 11-26%, rescoring yielded improved reliability and generally higher correlations with therapy process (session depth and smoothness) and outcome measures (residual gain symptom improvement). The 16-item BAI was designed to maximize "bandwidth" of item difficulty and preserve a broader range of WAI sensitivity than WAI-s or WAI-sr. Comparisons suggest the BAI performed better in several respects than the WAI-s or WAI-sr and equivalent to the full WAI on several performance indicators.

  4. Formative assessment: a student perspective.

    Science.gov (United States)

    Hill, D A; Guinea, A I; McCarthy, W H

    1994-09-01

    An educator's view would be that formative assessment has an important role in the learning process. This study was carried out to obtain a student perspective of the place of formative assessment in the curriculum. Final-year medical students at Royal Prince Alfred Hospital took part in four teaching sessions, each structured to integrate teaching with assessment. Three assessment methods were used; the group objective structured clinical examination (G-OSCE), structured short answer (SSA) questions and a pre/post-test multiple choice questionnaire (MCQ). Teaching sessions were conducted on the subject areas of traumatology, the 'acute abdomen', arterial disorders and cancer. Fifty-five students, representing 83% of those who took part in the programme, responded to a questionnaire where they were asked to rate (on a 5-point Likert scale) their response to general questions about formative assessment and 13 specific questions concerning the comparative value of the three assessment modalities. Eighty-nine per cent of respondents felt that formative assessment should be incorporated into the teaching process. The SSA assessment was regarded as the preferred modality to reinforce previous teaching and test problem-solving skills. The MCQ was the least favoured assessment method. The effect size variable between the total scores for the SSA and MCQ was 0.64. The variable between G-OSCE and SSA/MCQ was 0.26 and 0.33 respectively. Formative assessment is a potentially powerful method to direct learning behaviour. Students should have input into the methods used.

  5. Missouri Assessment Program (MAP), Spring 2000: Secondary Science, Released Items, Grade 10.

    Science.gov (United States)

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This assessment sample provides information on the Missouri Assessment Program (MAP) for grade 10 science. The sample consists of six items taken from the test booklet and scoring guides for the six items. The items assess ecosystems, mechanics, and data analysis. (MM)

  6. Methods for Assessing Item, Step, and Threshold Invariance in Polytomous Items Following the Partial Credit Model

    Science.gov (United States)

    Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W.

    2008-01-01

    Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…

  7. Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)

    Science.gov (United States)

    Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn

    2018-01-01

    The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…

  8. Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

    Science.gov (United States)

    Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

    2012-09-01

    The complexity of health information often exceeds patients' skills to understand and use it. To develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were estimated to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: PLiteracy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.

  9. Development of the Assessment Items of Debris Flow Using the Delphi Method

    Science.gov (United States)

    Byun, Yosep; Seong, Joohyun; Kim, Mingi; Park, Kyunghan; Yoon, Hyungkoo

    2016-04-01

    In recent years in Korea, typhoons and localized extreme rainfall caused by the abnormal climate have increased. Accordingly, debris flow is becoming one of the most dangerous natural disasters. This study aimed to develop assessment items that can be used for conducting damage investigations of debris flow. The Delphi method was applied to classify the realms of assessment items. As a result, 29 assessment items, which can be classified into 6 groups, were determined.

  10. Matrix Sampling of Items in Large-Scale Assessments

    Directory of Open Access Journals (Sweden)

    Ruth A. Childs

    2003-07-01

    Matrix sampling of items, that is, division of a set of items into different versions of a test form, is used by several large-scale testing programs. Like other test designs, matrixed designs have both advantages and disadvantages. For example, testing time per student is less than if each student received all the items, but the comparability of student scores may decrease. Also, curriculum coverage is maintained, but reporting of scores becomes more complex. In this paper, matrixed designs are compared with more traditional designs in nine categories of costs: development costs, materials costs, administration costs, educational costs, scoring costs, reliability costs, comparability costs, validity costs, and reporting costs. In choosing among test designs, a testing program should examine the costs in light of its mandate(s), the content of the tests, and the financial resources available, among other considerations.

  11. Assessing errors related to characteristics of the items measured

    International Nuclear Information System (INIS)

    Liggett, W.

    1980-01-01

    Errors that are related to some intrinsic property of the items measured are often encountered in nuclear material accounting. An example is the error in nondestructive assay measurements caused by uncorrected matrix effects. Nuclear material accounting requires for each materials type one measurement method for which bounds on these errors can be determined. If such a method is available, a second method might be used to reduce costs or to improve precision. If the measurement error for the first method is longer-tailed than Gaussian, then precision might be improved by measuring all items by both methods. 8 refs

  12. Better assessment of physical function: item improvement is neglected but essential.

    Science.gov (United States)

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
    IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models

  13. Comparison of Classical Test Theory and Item Response Theory in Individual Change Assessment

    NARCIS (Netherlands)

    Jabrayilov, Ruslan; Emons, Wilco H. M.; Sijtsma, Klaas

    2016-01-01

    Clinical psychologists are advised to assess clinical and statistical significance when assessing change in individual patients. Individual change assessment can be conducted using either the methodologies of classical test theory (CTT) or item response theory (IRT). Researchers have been optimistic

  14. Generalizability theory and item response theory

    OpenAIRE

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and generalizability theory were integrated to model such assessments. Further, the precision of the esti...

  15. Assessment of the Item Selection and Weighting in the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis

    Science.gov (United States)

    MAHR, ALFRED D.; NEOGI, TUHINA; LAVALLEY, MICHAEL P.; DAVIS, JOHN C.; HOFFMAN, GARY S.; MCCUNE, W. JOSEPH; SPECKS, ULRICH; SPIERA, ROBERT F.; ST.CLAIR, E. WILLIAM; STONE, JOHN H.; MERKEL, PETER A.

    2013-01-01

    Objective: To assess the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis (BVAS/WG) with respect to its selection and weighting of items. Methods: This study used the BVAS/WG data from the Wegener's Granulomatosis Etanercept Trial. The scoring frequencies of the 34 predefined items and any “other” items added by clinicians were calculated. Using linear regression with generalized estimating equations in which the physician global assessment (PGA) of disease activity was the dependent variable, we computed weights for all predefined items. We also created variables for clinical manifestations frequently added as other items, and computed weights for these as well. We searched for the model that included the items and their generated weights yielding an activity score with the highest R² to predict the PGA. Results: We analyzed 2,044 BVAS/WG assessments from 180 patients; 734 assessments were scored during active disease. The highest R² with the PGA was obtained by scoring WG activity based on the following items: the 25 predefined items rated on ≥5 visits, the 2 newly created fatigue and weight loss variables, the remaining minor other and major other items, and a variable that signified whether new or worse items were present at a specific visit. The weights assigned to the items ranged from 1 to 21. Compared with the original BVAS/WG, this modified score correlated significantly more strongly with the PGA. Conclusion: This study suggests possibilities to enhance the item selection and weighting of the BVAS/WG. These changes may increase this instrument's ability to capture the continuum of disease activity in WG. PMID:18512722

  16. A confirmative clinimetric analysis of the 36-item Family Assessment Device.

    Science.gov (United States)

    Timmerby, Nina; Cosci, Fiammetta; Watson, Maggie; Csillag, Claudio; Schmitt, Florence; Steck, Barbara; Bech, Per; Thastum, Mikael

    2018-02-07

    The Family Assessment Device (FAD) is a 60-item questionnaire widely used to evaluate self-reported family functioning. However, the factor structure as well as the number of items has been questioned. A shorter and more user-friendly version of the original FAD-scale, the 36-item FAD, has therefore previously been proposed, based on findings in a nonclinical population of adults. We aimed in this study to evaluate the brief 36-item version of the FAD in a clinical population. Data from a European multinational study, examining factors associated with levels of family functioning in adult cancer patients' families, were used. Both healthy and ill parents completed the 60-item version FAD. The psychometric analyses conducted were Principal Component Analysis and Mokken-analysis. A total of 564 participants were included. Based on the psychometric analysis we confirmed that the 36-item version of the FAD has robust psychometric properties and can be used in clinical populations. The present analysis confirmed that the 36-item version of the FAD (18 items assessing 'well-being' and 18 items assessing 'dysfunctional' family function) is a brief scale where the summed total score is a valid measure of the dimensions of family functioning. This shorter version of the FAD is, in accordance with the concept of 'measurement-based care', an easy to use scale that could be considered when the aim is to evaluate self-reported family functioning.

  17. Factor Structure and Reliability of Test Items for Saudi Teacher Licence Assessment

    Science.gov (United States)

    Alsadaawi, Abdullah Saleh

    2017-01-01

    The Saudi National Assessment Centre administers the Computer Science Teacher Test for teacher certification. The aim of this study is to explore gender differences in candidates' scores, and investigate dimensionality, reliability, and differential item functioning using confirmatory factor analysis and item response theory. The confirmatory…

  18. Assessment of Preference for Edible and Leisure Items in Individuals with Dementia

    Science.gov (United States)

    Ortega, Javier Virues; Iwata, Brian A.; Nogales-Gonzalez, Celia; Frades, Belen

    2012-01-01

    We conducted 2 studies on reinforcer preference in patients with dementia. Results of preference assessments yielded differential selections by 14 participants. Unlike prior studies with individuals with intellectual disabilities, all participants showed a noticeable preference for leisure items over edible items. Results of a subsequent analysis…

  19. Developing an African youth psychosocial assessment: an application of item response theory.

    Science.gov (United States)

    Betancourt, Theresa S; Yang, Frances; Bolton, Paul; Normand, Sharon-Lise

    2014-06-01

    This study aimed to refine a dimensional scale for measuring psychosocial adjustment in African youth using item response theory (IRT). A 60-item scale derived from qualitative data was administered to 667 war-affected adolescents (55% female). Exploratory factor analysis (EFA) determined the dimensionality of items based on goodness-of-fit indices. Items with loadings less than 0.4 were dropped. Confirmatory factor analysis (CFA) was used to confirm the scale's dimensionality found under the EFA. Item discrimination and difficulty were estimated using a graded response model for each subscale using weighted least squares means and variances. Predictive validity was examined through correlations between IRT scores (θ) for each subscale and ratings of functional impairment. All models were assessed using goodness-of-fit and comparative fit indices. Fisher's Information curves examined item precision at different underlying ranges of each trait. Original scale items were optimized and reconfigured into an empirically-robust 41-item scale, the African Youth Psychosocial Assessment (AYPA). Refined subscales assess internalizing and externalizing problems, prosocial attitudes/behaviors and somatic complaints without medical cause. The AYPA is a refined dimensional assessment of emotional and behavioral problems in African youth with good psychometric properties. Validation studies in other cultures are recommended. Copyright © 2014 John Wiley & Sons, Ltd.
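
    The graded response model named in this abstract can be illustrated with a minimal sketch. It assumes Samejima's logistic form with a single discrimination parameter and ordered thresholds per item; the function names and parameter values are hypothetical and are not taken from the study.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a graded response model (illustrative).

    theta: latent trait level; a: item discrimination;
    thresholds: ordered difficulties b_1 < ... < b_m.
    Returns m + 1 probabilities for response categories 0..m.
    """
    # Cumulative probabilities P(X >= k) for k = 1..m, padded with 1 and 0.
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical item: discrimination 1.5, three thresholds (four categories).
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

    With symmetric thresholds and theta = 0, the two extreme categories receive equal probability, which is a quick sanity check on the implementation.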

  20. International Assessment: A Rasch Model and Teachers' Evaluation of TIMSS Science Achievement Items

    Science.gov (United States)

    Glynn, Shawn M.

    2012-01-01

    The Trends in International Mathematics and Science Study (TIMSS) is a comparative assessment of the achievement of students in many countries. In the present study, a rigorous independent evaluation was conducted of a representative sample of TIMSS science test items because item quality influences the validity of the scores used to inform…

  1. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    Science.gov (United States)

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  2. Formative assessment: Enriching teaching and learning with formative assessment

    NARCIS (Netherlands)

    van Diggelen, M.R.; Morgan, C.M.; Funk, M.; Bruns, M.

    2016-01-01

    Formative assessment is a valuable aspect in teaching and learning, and is proven to be an effective learning method. There is evidence that adding formative assessment to your teaching increases students’ learning results (Black and William, 1998), but in practice many of the possibilities are left

  3. Generalizability theory and item response theory

    NARCIS (Netherlands)

    Glas, Cornelis A.W.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a

  4. Comparison of Alternate and Original Items on the Montreal Cognitive Assessment.

    Science.gov (United States)

    Lebedeva, Elena; Huang, Mei; Koski, Lisa

    2016-03-01

    The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. None of the five items from the alternate versions matched the difficulty level of their corresponding original items. This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time.

  5. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

    Science.gov (United States)

    Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

    2014-01-01

    Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
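
    The calibration and adaptive selection ideas described above can be sketched as follows. This assumes a dichotomous two-parameter logistic (2PL) model as a simplification and a maximum-information selection rule; the abstract does not specify either, and the item bank values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Pick the unused item with the highest information at the current estimate."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


# Hypothetical bank of (discrimination, difficulty) pairs.
bank = [(1.0, -2.0), (1.8, 0.1), (1.2, 2.0), (0.6, 0.0)]
next_item = select_next_item(theta=0.0, bank=bank, administered={3})
```

    Here the second item is chosen because it is both highly discriminating and located near the current trait estimate; a full CAT would re-estimate theta after each response and repeat the selection.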

  6. Explanatory item response modelling of an abstract reasoning assessment: A case for modern test design

    OpenAIRE

    Helland, Fredrik

    2016-01-01

    Assessment is an integral part of society and education, and for this reason it is important to know what you measure. This thesis is about explanatory item response modelling of an abstract reasoning assessment, with the objective to create a modern test design framework for automatic generation of valid and precalibrated items of abstract reasoning. Modern test design aims to strengthen the connections between the different components of a test, with a stress on strong theory, systematic it...

  7. Computer-based feedback in formative assessment

    NARCIS (Netherlands)

    van der Kleij, Fabienne

    2013-01-01

    Formative assessment concerns any assessment that provides feedback that is intended to support learning and can be used by teachers and/or students. Computers could offer a solution to overcoming obstacles encountered in implementing formative assessment. For example, computer-based assessments

  8. Recommended core items to assess e-cigarette use in population-based surveys.

    Science.gov (United States)

    Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

    2018-05-01

    A consistent approach using standardised items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behaviour, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid without further item development. Reliable and valid items will strengthen the emerging science and inform knowledge synthesis for policy-making. Building on informal discussions at a series of international meetings of 65 experts from 15 countries, the authors provide recommendations for assessing e-cigarette use behaviour, relative perceived harm, device type, presence of nicotine, flavours and reasons for use. We recommend items assessing eight core constructs: e-cigarette ever use, frequency of use and former daily use; relative perceived harm; device type; primary flavour preference; presence of nicotine; and primary reason for use. These items should be standardised or minimally adapted for the policy context and target population. Researchers should be prepared to update items as e-cigarette device characteristics change. A minimum set of e-cigarette items is proposed to encourage consensus around items to allow for cross-survey and cross-jurisdictional comparisons of e-cigarette use behaviour. These proposed items are a starting point. We recognise room for continued improvement, and welcome input from e-cigarette users and scientific colleagues. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  9. Do people with and without medical conditions respond similarly to the short health anxiety inventory? An assessment of differential item functioning using item response theory.

    Science.gov (United States)

    LeBouthillier, Daniel M; Thibodeau, Michel A; Alberts, Nicole M; Hadjistavropoulos, Heather D; Asmundson, Gordon J G

    2015-04-01

    Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely-used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern of significant bias. Copyright © 2015 Elsevier Inc. All rights reserved.
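
    As a rough illustration of the Mantel-Haenszel approach mentioned above, the sketch below computes the common odds ratio across matched score strata and converts it to the delta-scale effect size often used for DIF. It assumes dichotomised item scores and omits the significance test; the counts are invented for illustration and are not data from the study.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across strata of 2x2 tables (illustrative).

    Each stratum is (a, b, c, d):
    a = reference group endorsing, b = reference group not endorsing,
    c = focal group endorsing,     d = focal group not endorsing.
    Does not guard against empty strata or a zero denominator.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def mh_d_dif(alpha_mh):
    """Delta-scale effect size: -2.35 times the log of the common odds ratio."""
    return -2.35 * math.log(alpha_mh)


# Hypothetical counts for three matched total-score strata.
strata = [(20, 10, 10, 20), (30, 10, 15, 20), (40, 5, 20, 10)]
alpha = mantel_haenszel_odds_ratio(strata)
delta = mh_d_dif(alpha)
```

    An odds ratio of 1 (delta of 0) indicates no DIF; values further from these suggest that matched respondents in the two groups endorse the item at different rates.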

  10. An Examination of Differential Item Functioning on the Vanderbilt Assessment of Leadership in Education

    Science.gov (United States)

    Polikoff, Morgan S.; May, Henry; Porter, Andrew C.; Elliott, Stephen N.; Goldring, Ellen; Murphy, Joseph

    2009-01-01

    The Vanderbilt Assessment of Leadership in Education is a 360-degree assessment of the effectiveness of principals' learning-centered leadership behaviors. In this report, we present results from a differential item functioning (DIF) study of the assessment. Using data from a national field trial, we searched for evidence of DIF on school level,…

  11. Assessing the Validity of Single-item Life Satisfaction Measures: Results from Three Large Samples

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E.

    2014-01-01

    Purpose: The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Methods: Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System (BRFSS) and a representative German sample (N = 1,312) recruited by the Germany Socio-Economic Panel (GSOEP) were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Results: Consistent across the three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62–0.64; disattenuated r = 0.78–0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001–0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015–0.042). Conclusions: Single-item life satisfaction measures performed very similarly to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use. PMID:24890827
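
    The distinction between zero-order and disattenuated correlations reported above can be made concrete with the classical correction for attenuation, r_corrected = r_xy / sqrt(rel_x * rel_y). The sketch below assumes that this standard formula was the one applied; the reliability values are hypothetical and are not those of the study.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical inputs: observed r = 0.60, both reliabilities = 0.80.
corrected = disattenuate(r_xy=0.6, rel_x=0.8, rel_y=0.8)
```

    Because the denominator is at most 1, the corrected value is never smaller in magnitude than the observed correlation, which is why the disattenuated range in the abstract sits above the zero-order range.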

  12. Assessing the validity of single-item life satisfaction measures: results from three large samples.

    Science.gov (United States)

    Cheung, Felix; Lucas, Richard E

    2014-12-01

    The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure. Two large samples from Washington (N = 13,064) and Oregon (N = 2,277) recruited by the Behavioral Risk Factor Surveillance System and a representative German sample (N = 1,312) recruited by the Germany Socio-Economic Panel were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures. Consistent across the three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared to the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042). Single-item life satisfaction measures performed very similarly to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.

  13. Forced-Choice Assessment of Work-Related Maladaptive Personality Traits: Preliminary Evidence From an Application of Thurstonian Item Response Modeling.

    Science.gov (United States)

    Guenole, Nigel; Brown, Anna A; Cooper, Andrew J

    2018-06-01

    This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model's fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.

  14. Psychometric properties of the Global Operative Assessment of Laparoscopic Skills (GOALS) using item response theory.

    Science.gov (United States)

    Watanabe, Yusuke; Madani, Amin; Ito, Yoichi M; Bilgic, Elif; McKendy, Katherine M; Feldman, Liane S; Fried, Gerald M; Vassiliou, Melina C

    2017-02-01

    The extent to which each item assessed using the Global Operative Assessment of Laparoscopic Skills (GOALS) contributes to the total score remains unknown. The purpose of this study was to evaluate the level of difficulty and discriminative ability of each of the 5 GOALS items using item response theory (IRT). A total of 396 GOALS assessments for a variety of laparoscopic procedures over a 12-year time period were included. Threshold parameters of item difficulty and discrimination power were estimated for each item using IRT. The higher slope parameters seen with "bimanual dexterity" and "efficiency" are indicative of greater discriminative ability than "depth perception", "tissue handling", and "autonomy". IRT psychometric analysis indicates that the 5 GOALS items do not demonstrate uniform difficulty and discriminative power, suggesting that they should not be scored equally. "Bimanual dexterity" and "efficiency" seem to have stronger discrimination. Weighted scores based on these findings could improve the accuracy of assessing individual laparoscopic skills. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

    Science.gov (United States)

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

    2017-11-01

    Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.
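
    For readers unfamiliar with the partial credit model fitted in this study, the following minimal sketch computes category probabilities for one polytomous item. It assumes the standard Masters parameterisation with step difficulties; the parameter values used are hypothetical.

```python
import math


def pcm_category_probs(theta, deltas):
    """Category probabilities under a partial credit model (illustrative).

    theta: latent trait level; deltas: step difficulties delta_1..delta_m.
    Returns m + 1 probabilities for categories 0..m.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item with both step difficulties at 0.
equal_probs = pcm_category_probs(theta=0.0, deltas=[0.0, 0.0])
```

    With a single step difficulty the model reduces to the dichotomous Rasch model, which offers a simple check on the code.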

  16. Using Item Analysis to Assess Objectively the Quality of the Calgary-Cambridge OSCE Checklist

    Directory of Open Access Journals (Sweden)

    Tyrone Donnon

    2011-06-01

    Background: The purpose of this study was to investigate the use of item analysis to assess objectively the quality of items on the Calgary-Cambridge Communications OSCE checklist. Methods: A total of 150 first year medical students were provided with extensive teaching on the use of the Calgary-Cambridge Guidelines for interviewing patients and participated in a final year end 20 minute communication OSCE station. Grouped into either the upper half (50%) or lower half (50%) communication skills performance groups, discrimination, difficulty and point biserial values were calculated for each checklist item. Results: The mean score on the 33 item communication checklist was 24.09 (SD = 4.46) and the internal reliability coefficient was α = 0.77. Although most of the items were found to have moderate (k = 12, 36%) or excellent (k = 10, 30%) discrimination values, there were 6 (18%) identified as ‘fair’ and 3 (9%) as ‘poor’. A post-examination review focused on item analysis findings resulted in an increase in checklist reliability (α = 0.80). Conclusions: Item analysis has been used with MCQ exams extensively. In this study, it was also found to be an objective and practical approach to use in evaluating the quality of a standardized OSCE checklist.
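
    The three item statistics named in this abstract can be sketched for binary checklist data as follows. This is a minimal illustration under stated assumptions: items are scored 0/1, groups are formed by an upper/lower half split on total score, and the point-biserial is computed against the uncorrected total (which includes the item itself). The data and function name are hypothetical.

```python
import math


def item_analysis(responses, item):
    """Return (difficulty, discrimination, point_biserial) for one item.

    responses: list of per-examinee lists of 0/1 item scores.
    Does not guard against zero variance in the item or the totals.
    """
    n = len(responses)
    totals = [sum(r) for r in responses]
    scores = [r[item] for r in responses]

    # Difficulty: proportion of examinees credited on the item.
    difficulty = sum(scores) / n

    # Discrimination: proportion credited in the upper half minus the lower half.
    order = sorted(range(n), key=lambda i: totals[i])
    half = n // 2
    lower, upper = order[:half], order[n - half:]
    discrimination = (sum(scores[i] for i in upper)
                      - sum(scores[i] for i in lower)) / half

    # Point-biserial: Pearson correlation between item score and total score.
    mean_s, mean_t = difficulty, sum(totals) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(scores, totals))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_t = sum((t - mean_t) ** 2 for t in totals)
    point_biserial = cov / math.sqrt(var_s * var_t)
    return difficulty, discrimination, point_biserial


# Hypothetical data: four examinees, three checklist items.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
difficulty, discrimination, point_biserial = item_analysis(responses, item=0)
```

    Tied total scores are split by their original order here; a real analysis would need an explicit rule for ties at the median.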

  17. An Anthropologist among the Psychometricians: Assessment Events, Ethnography, and Differential Item Functioning in the Mongolian Gobi

    Science.gov (United States)

    Maddox, Bryan; Zumbo, Bruno D.; Tay-Lim, Brenda; Qu, Demin

    2015-01-01

    This article explores the potential for ethnographic observations to inform the analysis of test item performance. In 2010, a standardized, large-scale adult literacy assessment took place in Mongolia as part of the United Nations Educational, Scientific and Cultural Organization Literacy Assessment and Monitoring Programme (LAMP). In a novel form…

  18. Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models.

    Science.gov (United States)

    Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana

    2015-03-01

    The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (getting around, self-care, getting along with others, life activities and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36-item WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion, the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology. Copyright © 2014 John Wiley & Sons, Ltd.

  19. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  20. Formative assessment (assessment for learning) of educational achievements of students

    Directory of Open Access Journals (Sweden)

    Zemlyanskaya E.N.

    2016-09-01

    Full Text Available We present a definition of the concept of formative assessment and its significance for modern education. We describe the developmental approach in foreign studies, its further development, the associated risks, and the possibility of reducing them. We discuss some of the techniques and examples of formative assessment. We investigate the relationship between formative and final evaluation, including the national curriculum levels.

  1. Teacher perspectives about using formative assessment

    DEFF Research Database (Denmark)

    Evans, Robert Harry; Clesham, Rose; Dolin, Jens

    2018-01-01

    This chapter examines three different classroom teacher perspectives when using ASSIST-ME project formative assessment methods as described in the introductory chapter. The first ‘teacher perspective’ is about changes in teacher self-efficacies while using formative assessment methods as monitored by a pre- and post-teacher questionnaire. Teachers who tried the unfamiliar formative methods of assessment (see introductory book chapter for these methods) as well as their colleagues who did not were surveyed. The second ‘teacher perspective’ examines changes in teachers’ subjective theories while trying project-specific formative assessment methods in the Czech Republic. Analyses are done through case studies and interviews. The final part of the chapter looks at teacher perspectives while using an Internet-based application to facilitate formative assessment. The teacher use of the application...

  2. FORMATIVE ASSESSMENT IN EFL CLASSROOM PRACTICES

    Directory of Open Access Journals (Sweden)

    Ida Ayu Made Sri Widiastuti

    2017-03-01

    Full Text Available This study investigated the challenges and opportunities of formative assessment in EFL classes. It used a qualitative research design, with in-depth interviews to collect the required data. Three teachers and three students were involved as research participants, and they were intensively interviewed to obtain valid and reliable data regarding their understanding of formative assessment and the follow-up actions they took after implementing it. The results showed that the English teachers did not take appropriate follow-up actions because of their low understanding of formative assessment. The teachers’ understanding could influence their ability to decide on these actions. This study indicates that EFL teachers urgently need further intensive training on the appropriate implementation of formative assessment and on how follow-up actions should be integrated into classroom practices.

  3. Applying Item Response Theory Methods to Examine the Impact of Different Response Formats

    Science.gov (United States)

    Hohensinn, Christine; Kubinger, Klaus D.

    2011-01-01

    In aptitude and achievement tests, different response formats are usually used. A fundamental distinction must be made between the class of multiple-choice formats and the constructed response formats. Previous studies have examined the impact of different response formats applying traditional statistical approaches, but these influences can also…

  4. Small group learning: effect on item analysis and accuracy of self-assessment of medical students.

    Science.gov (United States)

    Biswas, Shubho Subrata; Jain, Vaishali; Agrawal, Vandana; Bindra, Maninder

    2015-01-01

    Small group sessions are regarded as a more active and student-centered approach to learning. Item analysis provides objective evidence of whether such sessions improve comprehension and make the topic easier for students, in addition to assessing the relative benefit of the sessions to good versus poor performers. Self-assessment makes students aware of their deficiencies. Small group sessions can also help students develop the ability to self-assess. This study was carried out to assess the effect of small group sessions on item analysis and students' self-assessment. A total of 21 female and 29 male first year medical students participated in a small group session on topics covered by didactic lectures two weeks earlier. It was preceded and followed by two multiple choice question (MCQ) tests, in which students were asked to self-assess their likely score. The MCQs used had been item analyzed in a previous group and were chosen to have matching difficulty and discriminatory indices for the pre- and post-tests. The small group session improved the marks of both genders equally, but female performance was better. The session made the items easier, increasing the difficulty index significantly, but there was no significant alteration in the discriminatory index. There was overestimation in the self-assessment of both genders, but male overestimation was greater. The session improved the self-assessment of students in terms of expected marks and expectation of passing. The small group session improved the ability of students to self-assess their knowledge and increased the difficulty index of items, reflecting students' better performance.
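    The two item statistics named in this abstract can be illustrated with a minimal sketch. It assumes the common classical definitions, which the abstract does not spell out: the difficulty index is the proportion of students answering an item correctly (so a higher value means an easier item), and the discrimination index is the difference in that proportion between the top and bottom scoring groups. The function name `item_analysis` and the 27% group fraction are illustrative assumptions, not taken from the study.

```python
def item_analysis(responses, fraction=0.27):
    """Classical item analysis for dichotomously scored items.

    responses: list of per-student lists of 0/1 item scores.
    fraction:  share of students in the upper and lower groups
               (27% is a common convention, assumed here).
    Returns a list of (difficulty, discrimination) pairs, one per item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(r) for r in responses]
    # Rank students from highest to lowest total score.
    order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    k = max(1, round(fraction * n_students))
    upper, lower = order[:k], order[-k:]
    results = []
    for j in range(n_items):
        # Difficulty index: proportion correct over all students.
        difficulty = sum(r[j] for r in responses) / n_students
        # Discrimination index: proportion correct in upper minus lower group.
        p_upper = sum(responses[i][j] for i in upper) / k
        p_lower = sum(responses[i][j] for i in lower) / k
        results.append((difficulty, p_upper - p_lower))
    return results
```

    Under these definitions, a session that raises the proportion correct on every item raises the difficulty index, which matches the abstract's reading that the items became easier.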

  5. Improving the Reliability of Student Scores from Speeded Assessments: An Illustration of Conditional Item Response Theory Using a Computer-Administered Measure of Vocabulary

    Science.gov (United States)

    Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R.

    2015-01-01

    A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is…

  6. [Impact of passing items above the ceiling on the assessment results of Peabody developmental motor scales].

    Science.gov (United States)

    Zhao, Gai; Bian, Yang; Li, Ming

    2013-12-18

    To analyze the impact of passing items above the ceiling level in the gross motor subtest of the Peabody Developmental Motor Scales (PDMS-2) on its assessment results. The PDMS-2 subtests were administered to 124 children aged 1.2 to 71 months. In addition to the original scoring method, a new scoring method that includes passed items above the ceiling was developed. The standard scores and quotients of the two scoring methods were compared using the independent-samples t test. Only one child could pass items above the ceiling in the stationary subtest, 19 children in the locomotion subtest, and 17 children in the visual-motor integration subtest. When the scores of these passed items were included in the raw scores, the total raw scores increased by 1-12 points, the standard scores by 0-1 points and the motor quotients by 0-3 points. The diagnostic classification changed in only two children. There was no significant difference between the two methods in motor quotients or standard scores in the specific subtests (P>0.05). Passing items above the ceiling of the PDMS-2 is not a rare situation. It usually takes place in the locomotion and visual-motor integration subtests. Including these passed items in the scoring system does not make a significant difference in the standard scores of the subtests or the developmental motor quotients (DMQ), which supports the original setting of a ceiling established by not passing 3 items in a row. However, including the passed items above the ceiling in the raw score will improve tracking of children's developmental trajectory and intervention effects.

  7. Combining item response theory with multiple imputation to equate health assessment questionnaires.

    Science.gov (United States)

    Gu, Chenyang; Gutman, Roee

    2017-09-01

    The assessment of patients' functional status across the continuum of care requires a common patient assessment tool. However, assessment tools that are used in various health care settings differ and cannot be easily contrasted. For example, the Functional Independence Measure (FIM) is used to evaluate the functional status of patients who stay in inpatient rehabilitation facilities, the Minimum Data Set (MDS) is collected for all patients who stay in skilled nursing facilities, and the Outcome and Assessment Information Set (OASIS) is collected if they choose home health care provided by home health agencies. All three instruments or questionnaires include functional status items, but the specific items, rating scales, and instructions for scoring different activities vary between the different settings. We consider equating different health assessment questionnaires as a missing data problem, and propose a variant of predictive mean matching method that relies on Item Response Theory (IRT) models to impute unmeasured item responses. Using real data sets, we simulated missing measurements and compared our proposed approach to existing methods for missing data imputation. We show that, for all of the estimands considered, and in most of the experimental conditions that were examined, the proposed approach provides valid inferences, and generally has better coverages, relatively smaller biases, and shorter interval estimates. The proposed method is further illustrated using a real data set. © 2016, The International Biometric Society.

  8. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    Science.gov (United States)

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction: The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods: We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion: Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  9. Detecting intrajudge inconsistency in standard setting using test items with a selected-response format

    NARCIS (Netherlands)

    van der Linden, Willem J.; Vos, Hendrik J.; Chang, Lei

    2002-01-01

    In judgmental standard setting experiments, it may be difficult to specify subjective probabilities that adequately take the properties of the items into account. As a result, these probabilities are not consistent with each other in the sense that they do not refer to the same borderline level of

  10. Identifying Promising Items: The Use of Crowdsourcing in the Development of Assessment Instruments

    Science.gov (United States)

    Sadler, Philip M.; Sonnert, Gerhard; Coyle, Harold P.; Miller, Kelly A.

    2016-01-01

    The psychometrically sound development of assessment instruments requires pilot testing of candidate items as a first step in gauging their quality, typically a time-consuming and costly effort. Crowdsourcing offers the opportunity for gathering data much more quickly and inexpensively than from most targeted populations. In a simulation of a…

  11. Sensitivity and specificity of the 3-item memory test in the assessment of post traumatic amnesia.

    NARCIS (Netherlands)

    Andriessen, T.M.J.C.; Jong, B. de; Jacobs, B.; Werf, S.P. van der; Vos, P.E.

    2009-01-01

    PRIMARY OBJECTIVE: To investigate how the type of stimulus (pictures or words) and the method of reproduction (free recall or recognition after a short or a long delay) affect the sensitivity and specificity of a 3-item memory test in the assessment of post traumatic amnesia (PTA). METHODS: Daily

  12. Improving the Memory Sections of the Standardized Assessment of Concussion Using Item Analysis

    Science.gov (United States)

    McElhiney, Danielle; Kang, Minsoo; Starkey, Chad; Ragan, Brian

    2014-01-01

    The purpose of the study was to improve the immediate and delayed memory sections of the Standardized Assessment of Concussion (SAC) by identifying a list of more psychometrically sound items (words). A total of 200 participants with no history of concussion in the previous six months (aged 19.60 ± 2.20 years; N = 93 men, N = 107 women)…

  13. Investigation of Science Inquiry Items for Use on an Alternate Assessment Based on Modified Achievement Standards Using Cognitive Lab Methodology

    Science.gov (United States)

    Dickenson, Tammiee S.; Gilmore, Joanna A.; Price, Karen J.; Bennett, Heather L.

    2013-01-01

    This study evaluated the benefits of item enhancements applied to science-inquiry items for incorporation into an alternate assessment based on modified achievement standards for high school students. Six items were included in the cognitive lab sessions involving both students with and without disabilities. The enhancements (e.g., use of visuals,…

  14. Development and validation of a ten-item questionnaire with explanatory illustrations to assess upper extremity disorders: favorable effect of illustrations in the item reduction process.

    Science.gov (United States)

    Kurimoto, Shigeru; Suzuki, Mikako; Yamamoto, Michiro; Okui, Nobuyuki; Imaeda, Toshihiko; Hirata, Hitoshi

    2011-11-01

    The purpose of this study is to develop a short and valid measure for upper extremity disorders and to assess the effect of attached illustrations in item reduction of a self-administered disability questionnaire while retaining psychometric properties. A validated questionnaire used to assess upper extremity disorders, the Hand20, was reduced to ten items using two item-reduction techniques. The psychometric properties of the abbreviated form, the Hand10, were evaluated on an independent sample that was used for the shortening process. Validity, reliability, and responsiveness of the Hand10 were retained in the item reduction process. It was possible that the use of explanatory illustrations attached to the Hand10 helped with its reproducibility. The illustrations for the Hand10 promoted text comprehension and motivation to answer the items. These changes resulted in high acceptability; more than 99.3% of patients, including 98.5% of elderly patients, could complete the Hand10 properly. The illustrations had favorable effects on the item reduction process and made it possible to retain precision of the instrument. The Hand10 is a reliable and valid instrument for individual-level applications with the advantage of being compact and broadly applicable, even in elderly individuals.

  15. What Form of Mathematics Are Assessments Assessing? The Case of Multiplication and Division in Fourth Grade NAEP Items

    Science.gov (United States)

    Kosko, Karl W.; Singh, Rashmi

    2018-01-01

    Multiplicative reasoning is a key concept in elementary school mathematics. Item statistics reported by the National Assessment of Educational Progress (NAEP) assessment provide the best current indicator for how well elementary students across the U.S. understand this, and other concepts. However, beyond expert reviews and statistical analysis,…

  16. Recommended core items to assess e-cigarette use in population-based surveys

    OpenAIRE

    Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

    2017-01-01

    Background: A consistent approach using standardized items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behavior, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid wit...

  17. Promoting proximal formative assessment with relational discourse

    Science.gov (United States)

    Scherr, Rachel E.; Close, Hunter G.; McKagan, Sarah B.

    2012-02-01

    The practice of proximal formative assessment - the continual, responsive attention to students' developing understanding as it is expressed in real time - depends on students' sharing their ideas with instructors and on teachers' attending to them. Rogerian psychology presents an account of the conditions under which proximal formative assessment may be promoted or inhibited: (1) Normal classroom conditions, characterized by evaluation and attention to learning targets, may present threats to students' sense of their own competence and value, causing them to conceal their ideas and reducing the potential for proximal formative assessment. (2) In contrast, discourse patterns characterized by positive anticipation and attention to learner ideas increase the potential for proximal formative assessment and promote self-directed learning. We present an analysis methodology based on these principles and demonstrate its utility for understanding episodes of university physics instruction.

  18. An introduction to Item Response Theory and Rasch Analysis of the Eating Assessment Tool (EAT-10).

    Science.gov (United States)

    Kean, Jacob; Brodke, Darrel S; Biber, Joshua; Gross, Paul

    2018-03-01

    Item response theory has its origins in educational measurement and is now commonly applied in health-related measurement of latent traits, such as function and symptoms. This application is due in large part to gains in the precision of measurement attributable to item response theory and corresponding decreases in response burden, study costs, and study duration. The purpose of this paper is twofold: introduce basic concepts of item response theory and demonstrate this analytic approach in a worked example, a Rasch model (1PL) analysis of the Eating Assessment Tool (EAT-10), a commonly used measure for oropharyngeal dysphagia. The results of the analysis were largely concordant with previous studies of the EAT-10 and illustrate for brain impairment clinicians and researchers how IRT analysis can yield greater precision of measurement.
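    The Rasch (1PL) model mentioned in this abstract has a simple closed form, sketched below for a dichotomous item. This is a generic illustration of the standard model, not the paper's analysis code; the names `rasch_probability`, `theta` (person measure) and `b` (item difficulty) are assumptions chosen for the example.

```python
import math


def rasch_probability(theta, b):
    """Probability of endorsing (or passing) a dichotomous item under the Rasch model.

    theta: person location on the latent trait (logits).
    b:     item difficulty, or severity, on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected raw score over a set of dichotomous items for one person."""
    return sum(rasch_probability(theta, b) for b in difficulties)
```

    When the person measure equals the item difficulty, the probability is exactly one half; the probability then depends only on the difference `theta - b`, which is what places persons and items on a common scale.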

  19. Item reduction and psychometric validation of the Oily Skin Self Assessment Scale (OSSAS) and the Oily Skin Impact Scale (OSIS).

    Science.gov (United States)

    Arbuckle, Robert; Clark, Marci; Harness, Jane; Bonner, Nicola; Scott, Jane; Draelos, Zoe; Rizer, Ronald; Yeh, Yating; Copley-Merriman, Kati

    2009-01-01

    Developed using focus groups, the Oily Skin Self Assessment Scale (OSSAS) and Oily Skin Impact Scale (OSIS) are patient-reported outcome measures of oily facial skin. The aim of this study was to finalize the item-scale structure of the instruments and perform psychometric validation in adults with self-reported oily facial skin. The OSSAS and OSIS were administered to 202 adult subjects with oily facial skin in the United States. A subgroup of 152 subjects returned, 4 to 10 days later, for test–retest reliability evaluation. Of the 202 participants, 72.8% were female; 64.4% had self-reported nonsevere acne. Item reduction resulted in a 14-item OSSAS with Sensation (five items), Tactile (four items) and Visual (four items) domains, a single blotting item, and an overall oiliness item. The OSIS was reduced to two three-item domains assessing Annoyance and Self-Image. Confirmatory factor analysis supported the construct validity of the final item-scale structures. The OSSAS and OSIS scales had acceptable item convergent validity (item-scale correlations >0.40) and floor and ceiling effects (skin severity (P skin (P skin), as assessments of self-reported oily facial skin severity and its emotional impact, respectively.

  20. Formative assessment in mathematics for engineering students

    Science.gov (United States)

    Ní Fhloinn, Eabhnat; Carr, Michael

    2017-07-01

    In this paper, we present a range of formative assessment types for engineering mathematics, including in-class exercises, homework, mock examination questions, table quizzes, presentations, critical analyses of statistical papers, peer-to-peer teaching, online assessments and electronic voting systems. We provide practical tips for the implementation of such assessments, with a particular focus on time or resource constraints and large class sizes, as well as effective methods of feedback. In addition, we consider the benefits of such formative assessments for students and staff.

  1. Examining the Psychometric Quality of Multiple-Choice Assessment Items using Mokken Scale Analysis.

    Science.gov (United States)

    Wind, Stefanie A

    The concept of invariant measurement is typically associated with Rasch measurement theory (Engelhard, 2013). Concerned with the appropriateness of the parametric transformation upon which the Rasch model is based, Mokken (1971) proposed a nonparametric procedure for evaluating the quality of social science measurement that is theoretically and empirically related to the Rasch model. Mokken's nonparametric procedure can be used to evaluate the quality of dichotomous and polytomous items in terms of the requirements for invariant measurement. Despite these potential benefits, the use of Mokken scaling to examine the properties of multiple-choice (MC) items in education has not yet been fully explored. A nonparametric approach to evaluating MC items is promising in that this approach facilitates the evaluation of assessments in terms of invariant measurement without imposing potentially inappropriate transformations. Using Rasch-based indices of measurement quality as a frame of reference, data from an eighth-grade physical science assessment are used to illustrate and explore Mokken-based techniques for evaluating the quality of MC items. Implications for research and practice are discussed.

  2. Item parameters dissociate between expectation formats: A regression analysis of time-frequency decomposed EEG data

    Directory of Open Access Journals (Sweden)

    Irene Fernández Monsalve

    2014-08-01

    Full Text Available During language comprehension, semantic contextual information is used to generate expectations about upcoming items. This has been commonly studied through the N400 event-related potential (ERP), as a measure of facilitated lexical retrieval. However, the associative relationships in multi-word expressions (MWE) may enable the generation of a categorical expectation, leading to lexical retrieval before target word onset. Processing of the target word would thus reflect a target-identification mechanism, possibly indexed by a P3 ERP component. However, given their time overlap (200-500 ms post-stimulus onset), differentiating between N400/P3 ERP responses (averaged over multiple linguistically variable trials) is problematic. In the present study, we analyzed EEG data from a previous experiment, which compared ERP responses to highly expected words that were placed either in a MWE or a regular non-fixed compositional context, and to low predictability controls. We focused on oscillatory dynamics and regression analyses, in order to dissociate between the two contexts by modeling the electrophysiological response as a function of item-level parameters. A significant interaction between word position and condition was found in the regression model for power in a theta range (~7-9 Hz), providing evidence for the presence of qualitative differences between conditions. Power levels within this band were lower for MWE than compositional contexts when the target word appeared later on in the sentence, confirming that in the former lexical retrieval would have taken place before word onset. On the other hand, gamma-power (~50-70 Hz) was also modulated by predictability of the item in all conditions, which is interpreted as an index of a similar 'matching' sub-step for both types of contexts, binding an expected representation and the external input.

  3. Development of an assessment tool to measure students' perceptions of respiratory care education programs: Item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Ghazi Alotaibi

    2013-01-01

    Full Text Available Objectives: Students who perceived their learning environment positively are more likely to develop effective learning strategies, and adopt a deep learning approach. Currently, there is no validated instrument for measuring the educational environment of educational programs on respiratory care (RC). The aim of this study was to develop an instrument to measure students' perception of the RC educational environment. Materials and Methods: Based on the literature review and an assessment of content validity by multiple focus groups of RC educationalists, potential items of the instrument relevant to RC educational environment construct were generated by the research group. The initial 71 item questionnaire was then field-tested on all students from the 3 RC programs in Saudi Arabia and was subjected to multi-trait scaling analysis. Cronbach's alpha was used to assess internal consistency reliabilities. Results: Two hundred and twelve students (100%) completed the survey. The initial instrument of 71 items was reduced to 65 across 5 scales. Convergent and discriminant validity assessment demonstrated that the majority of items correlated more highly with their intended scale than a competing one. Cronbach's alpha exceeded the standard criterion of >0.70 in all scales except one. There was no floor or ceiling effect for scale or overall score. Conclusions: This instrument is the first assessment tool developed to measure the RC educational environment. There was evidence of its good feasibility, validity, and reliability. This first validation of the instrument supports its use by RC students to evaluate educational environment.
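    The internal consistency statistic used here, Cronbach's alpha, can be computed directly from item and total-score variances. The sketch below uses the standard formula and only the Python standard library; the function name `cronbach_alpha` and the row-per-respondent data layout are assumptions for illustration, not the study's own procedure.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for one scale.

    scores: list of respondents, each a list of item scores for that scale.
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance of total score).
    Population variances are used throughout; the ratio is the same with
    sample variances as long as one choice is applied consistently.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

    A scale would meet the criterion cited in the abstract when `cronbach_alpha(scores) > 0.70`. The function needs at least two items and some variation in total scores, otherwise the formula divides by zero.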

  4. Checking Equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments

    Czech Academy of Sciences Publication Activity Database

    Martinková, Patrícia; Drabinová, Adéla; Liaw, Y.L.; Sanders, E.A.; McFarland, J.L.; Price, R.M.

    2017-01-01

    Vol. 16, No. 2 (2017), article no. rm2. ISSN 1931-7913. R&D Projects: GA ČR GJ15-15856Y. Grant - others: NSF (US) DUE-1043443. Institutional support: RVO:67985807. Keywords: differential item functioning; fairness; conceptual assessments; concept inventory; undergraduate education; bias. Subject RIV: AM - Education. OECD field: Education, special (to gifted persons, those with learning disabilities). Impact factor: 3.930, year: 2016

  5. Improved utilization of ADAS-cog assessment data through item response theory based pharmacometric modeling.

    Science.gov (United States)

    Ueckert, Sebastian; Plan, Elodie L; Ito, Kaori; Karlsson, Mats O; Corrigan, Brian; Hooker, Andrew C

    2014-08-01

    This work investigates improved utilization of ADAS-cog data (the primary outcome in Alzheimer's disease (AD) trials of mild and moderate AD) by combining pharmacometric modeling and item response theory (IRT). A baseline IRT model characterizing the ADAS-cog was built based on data from 2,744 individuals. Pharmacometric methods were used to extend the baseline IRT model to describe longitudinal ADAS-cog scores from an 18-month clinical study with 322 patients. Sensitivity of the ADAS-cog items in different patient populations as well as the power to detect a drug effect in relation to total score based methods were assessed with the IRT based model. IRT analysis was able to describe both total and item level baseline ADAS-cog data. Longitudinal data were also well described. Differences in the information content of the item level components could be quantitatively characterized and ranked for mild cognitive impairment and mild AD populations. Based on clinical trial simulations with a theoretical drug effect, the IRT method demonstrated a significantly higher power to detect drug effect compared to the traditional method of analysis. A combined framework of IRT and pharmacometric modeling permits a more effective and precise analysis than total score based methods and therefore increases the value of ADAS-cog data.

  6. Development of Rasch-based item banks for the assessment of work performance in patients with musculoskeletal diseases.

    Science.gov (United States)

    Mueller, Evelyn A; Bengel, Juergen; Wirtz, Markus A

    2013-12-01

    This study aimed to develop a self-description assessment instrument to measure work performance in patients with musculoskeletal diseases. In terms of the International Classification of Functioning, Disability and Health (ICF), work performance is defined as the degree of meeting the work demands (activities) at the actual workplace (environment). To account for the fact that work performance depends on the work demands of the job, we strived to develop item banks that allow a flexible use of item subgroups depending on the specific work demands of the patients' jobs. Item development included the collection of work tasks from literature and content validation through expert surveys and patient interviews. The resulting 122 items were answered by 621 patients with musculoskeletal diseases. Exploratory factor analysis to ascertain dimensionality and Rasch analysis (partial credit model) for each of the resulting dimensions were performed. Exploratory factor analysis resulted in four dimensions, and subsequent Rasch analysis led to the following item banks: 'impaired productivity' (15 items), 'impaired cognitive performance' (18), 'impaired coping with stress' (13) and 'impaired physical performance' (low physical workload 20 items, high physical workload 10 items). The item banks exhibited person separation indices (reliability) between 0.89 and 0.96. The assessment of work performance adds the activities component to the more commonly employed participation component of the ICF-model. The four item banks can be adapted to specific jobs where necessary without losing comparability of person measures, as the item banks are based on Rasch analysis.

  7. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    Science.gov (United States)

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.

  8. The Impact of Varied Discrimination Parameters on Mixed-Format Item Response Theory Model Selection

    Science.gov (United States)

    Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.

    2013-01-01

    Whittaker, Chang, and Dodd compared the performance of model selection criteria when selecting among mixed-format IRT models and found that the criteria did not perform adequately when selecting the more parameterized models. It was suggested by M. S. Johnson that the problems when selecting the more parameterized models may be because of the low…

  9. Identifying the most efficient items from the Mini-Mental State Examination for cognitive function assessment in older Taiwanese patients.

    Science.gov (United States)

    Lou, Meei-Fang; Dai, Yu-Tzu; Huang, Guey-Shiun; Yu, Po-Jui

    2007-03-01

    The purpose of the study was to identify the most efficient items from the Mini-Mental State Examination for assessment of cognitive function. The Mini-Mental State Examination is the most frequently used cognitive screening instrument. However, the Mini-Mental State Examination has been criticized for insensitivity to mild cognitive dysfunction, limited memory assessment and variability in level of difficulty of the individual items. This study used secondary data analysis. Item response theory two-parameter model was used to analyse the data from the admission assessment of mental status by the Mini-Mental State Examination for 801 patients. By using item response analysis, 16 items were selected from the original 30-item Mini-Mental State Examination. The 16 items included mainly the measures of orientation, recall and attention and calculation. The internal consistency of the 16-item Mini-Mental State Examination was 0.84. The proposed new cut-off point for the 16-item Mini-Mental State Examination was 11. The correct classification rate was 0.94, the sensitivity was 100% and the specificity was 97.4%, when compared with the original 30-item Mini-Mental State Examination from the cut-off point of 24. This new cut-off point was determined for the purpose of over-identifying patients at risk so as to ensure early detection of and prevention from the onset of cognitive disturbance. Only a few items are needed to describe the subject's cognitive status. Using item response theory analysis, the study found that the Mini-Mental State Examination could be simplified. Deleting the items with less variation makes this assessment tool not only shorter, easier to administer and less strenuous for respondents, but also enables one to maintain validity as a cognitive function test for clinical setting.
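
    For readers unfamiliar with the two-parameter model named in this abstract, the sketch below shows the standard two-parameter logistic (2PL) item response function and its item information. It is only an illustration: the parameter values are invented, and the abstract does not say that items were ranked by information, so that use is an assumption.

```python
import math


def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model.
    theta: person ability, a: item discrimination, b: item difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical item parameters (a, b); not estimates from the study.
items = {"orientation": (1.8, -1.0), "recall": (1.2, 0.5), "naming": (0.4, -2.5)}
for name, (a, b) in items.items():
    print(name, round(item_information(0.0, a, b), 3))
```

    Information peaks where theta equals b and grows with the square of a, so weakly discriminating items contribute little at any ability level.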

  10. What Do They Understand? Using Technology to Facilitate Formative Assessment

    Science.gov (United States)

    Mitten, Carolyn; Jacobbe, Tim; Jacobbe, Elizabeth

    2017-01-01

    Formative assessment is so important to inform teachers' planning. A discussion of the benefits of using technology to facilitate formative assessment explains how four primary school teachers adopted three different apps to make their formative assessment more meaningful and useful.

  11. Gender differences in National Assessment of Educational Progress science items: What does "I don't know" really mean?

    Science.gov (United States)

    Linn, Marcia C.; de Benedictis, Tina; Delucchi, Kevin; Harris, Abigail; Stage, Elizabeth

    The National Assessment of Educational Progress Science Assessment has consistently revealed small gender differences on science content items but not on science inquiry items. This assessment differs from others in that respondents can choose "I don't know" rather than guessing. This paper examines explanations for the gender differences including (a) differential prior instruction, (b) differential response to uncertainty and use of the "I don't know" response, (c) differential response to figurally presented items, and (d) different attitudes towards science. Of these possible explanations, the first two received support. Females are more likely to use the "I don't know" response, especially for items with physical science content or masculine themes such as football. To ameliorate this situation we need more effective science instruction and more gender-neutral assessment items.

  12. Development of a questionnaire to assess patient satisfaction with allergen-specific immunotherapy in adults: item generation, item reduction, and preliminary validation

    Directory of Open Access Journals (Sweden)

    Justícia JL

    2011-05-01

    Full Text Available Jose Luis Justícia (1), Eva Baró (2), Victoria Cardona (3), Pedro Guardia (4), Pedro Ojeda (5), José Maria Olaguíbel (6), José Maria Vega (7), Carmen Vidal (8). Affiliations: (1) Medical Department, Stallergenes Ibérica, Barcelona, Spain; (2) Health Outcomes Research Department, 3D Health Research, Barcelona, Spain; (3) Hospital Vall d'Hebron, Barcelona, Spain; (4) Hospital Virgen Macarena, Sevilla, Spain; (5) Clínica de Asma y Alergia Dres. Ojeda, Madrid, Spain; (6) Complejo Hospitalario de Navarra, Pamplona, Spain; (7) Hospital Regional Universitario Carlos Haya, Málaga, Spain; (8) Complejo Hospitalario Universitario de Santiago, Santiago de Compostela, Spain. Background: Allergen-specific immunotherapy (SIT) is a treatment capable of modifying the natural course of allergy, so ensuring good adherence to SIT is fundamental. Up until now there has not existed an instrument specifically developed to measure patient satisfaction with SIT, although its assessment could help us to comprehend better and improve treatment adherence and effectiveness. The aim of this study was to develop an instrument to measure adult patient satisfaction with SIT. Methods: Items were generated from a literature review, focus groups with allergic adult patients undergoing SIT, and a meeting with experts. Potential items were administered to allergic patients undergoing SIT in an observational, cross-sectional, multicenter study. Item reduction was based on quantitative and qualitative criteria. A preliminary assessment of feasibility, reliability, and validity of the retained items was performed. Results: An initial pool of 70 items was administered to 257 patients undergoing SIT. Fifty-four items were eliminated resulting in a provisional instrument with 16 items. Factor analysis yielded four factors that were identified as perceived efficacy, activities and environment, cost-benefit balance, and overall satisfaction, explaining 74.8% of variance. Ceiling and floor effects were negligible for overall score. Overall score was

  13. Formative assessment promotes learning in undergraduate clinical ...

    African Journals Online (AJOL)

    Introduction. Clinical clerkships, typically situated in environments lacking educational structure, form the backbone of undergraduate medical training. The imperative to develop strategies that enhance learning in this context is apparent. This study explored the impact of longitudinal bedside formative assessment on ...

  14. Assessment of the Assessment Tool: Analysis of Items in a Non-MCQ Mathematics Exam

    Science.gov (United States)

    Khoshaim, Heba Bakr; Rashid, Saima

    2016-01-01

    Assessment is one of the vital steps in the teaching and learning process. The reported action research examines the effectiveness of an assessment process and inspects the validity of exam questions used for the assessment purpose. The instructors of a college-level mathematics course studied questions used in the final exams during the academic…

  15. Sensitivity and specificity of the 3-item memory test in the assessment of post traumatic amnesia.

    Science.gov (United States)

    Andriessen, Teuntje M J C; de Jong, Ben; Jacobs, Bram; van der Werf, Sieberen P; Vos, Pieter E

    2009-04-01

    To investigate how the type of stimulus (pictures or words) and the method of reproduction (free recall or recognition after a short or a long delay) affect the sensitivity and specificity of a 3-item memory test in the assessment of post traumatic amnesia (PTA). Daily testing was performed in 64 consecutively admitted traumatic brain injured patients, 22 orthopedically injured patients and 26 healthy controls until criteria for resolution of PTA were reached. Subjects were randomly assigned to a test with visual or verbal stimuli. Short delay reproduction was tested after an interval of 3-5 minutes, long delay reproduction was tested after 24 hours. Sensitivity and specificity were calculated over the first 4 test days. The 3-word test showed higher sensitivity than the 3-picture test, while specificity of the two tests was equally high. Free recall was a more effortful task than recognition for both patients and controls. In patients, a longer delay between registration and recall resulted in a significant decrease in the number of items reproduced. Presence of PTA is best assessed with a memory test that incorporates the free recall of words after a long delay.
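
    Sensitivity and specificity, the two outcome measures of this study, follow directly from a 2x2 classification table. The sketch below is a generic illustration with invented labels; the variable names and data are hypothetical and do not reproduce the study's results.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of binary predictions against a reference standard."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: True = condition present (reference) / test positive (predicted).
reference = [True, True, True, False, False, False, False]
test_result = [True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(reference, test_result)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```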

  16. Alzheimer's Disease Assessment: A Review and Illustrations Focusing on Item Response Theory Techniques.

    Science.gov (United States)

    Balsis, Steve; Choudhury, Tabina K; Geraci, Lisa; Benge, Jared F; Patrick, Christopher J

    2018-04-01

    Alzheimer's disease (AD) affects neurological, cognitive, and behavioral processes. Thus, to accurately assess this disease, researchers and clinicians need to combine and incorporate data across these domains. This presents not only distinct methodological and statistical challenges but also unique opportunities for the development and advancement of psychometric techniques. In this article, we describe relatively recent research using item response theory (IRT) that has been used to make progress in assessing the disease across its various symptomatic and pathological manifestations. We focus on applications of IRT to improve scoring, test development (including cross-validation and adaptation), and linking and calibration. We conclude by describing potential future multidimensional applications of IRT techniques that may improve the precision with which AD is measured.

  17. Pedagogy of Science Teaching Tests: Formative assessments of science teaching orientations

    Science.gov (United States)

    Cobern, William W.; Schuster, David; Adams, Betty; Skjold, Brandy Ann; Zeynep Muğaloğlu, Ebru; Bentz, Amy; Sparks, Kelly

    2014-09-01

    A critical aspect of teacher education is gaining pedagogical content knowledge of how to teach science for conceptual understanding. Given the time limitations of college methods courses, it is difficult to touch on more than a fraction of the science topics potentially taught across grades K-8, particularly in the context of relevant pedagogies. This research and development work centers on constructing a formative assessment resource to help expose pre-service teachers to a greater number of science topics within teaching episodes using various modes of instruction. To this end, 100 problem-based, science pedagogy assessment items were developed via expert group discussions and pilot testing. Each item contains a classroom vignette followed by response choices carefully crafted to include four basic pedagogies (didactic direct, active direct, guided inquiry, and open inquiry). The brief but numerous items allow a substantial increase in the number of science topics that pre-service students may consider. The intention is that students and teachers will be able to share and discuss particular responses to individual items, or else record their responses to collections of items and thereby create a snapshot profile of their teaching orientations. Subsets of items were piloted with students in pre-service science methods courses, and the quantitative results of student responses were spread sufficiently to suggest that the items can be effective for their intended purpose.

  18. War Reserve Analysis and Secondary Item Procureability Assessment of the AMCOM Supported Weapon Systems

    National Research Council Canada - National Science Library

    Maddux, Gary

    2000-01-01

    .... IOD evaluates the impacts of nonavailability of secondary items on the life cycle supportability of AMCOM weapon systems and evaluates the producibility of secondary items for war reserve requirements...

  19. Normative data for the 12 item WHO Disability Assessment Schedule 2.0.

    Directory of Open Access Journals (Sweden)

    Gavin Andrews

    Full Text Available BACKGROUND: The World Health Organization Disability Assessment Schedule (WHODAS 2.0) measures disability due to health conditions including diseases, illnesses, injuries, mental or emotional problems, and problems with alcohol or drugs. METHOD: The 12-item WHODAS 2.0 was used in the second Australian Survey of Mental Health and Well-being. We report the overall factor structure and the distribution of scores and normative data (means and SDs) for people with any physical disorder, any mental disorder and for people with neither. FINDINGS: A single second order factor justifies the use of the scale as a measure of global disability. People with mental disorders had high scores (mean 6.3, SD 7.1); people with physical disorders had lower scores (mean 4.3, SD 6.1). People with no disorder covered by the survey had low scores (mean 1.4, SD 3.6). INTERPRETATION: The provision of normative data from a population sample of adults will facilitate use of the WHODAS 2.0 12-item scale in clinical and epidemiological research.

  20. Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire

    NARCIS (Netherlands)

    Petersen, Morten Aa; Groenvold, Mogens; Bjorner, Jakob B.; Aaronson, Neil; Conroy, Thierry; Cull, Ann; Fayers, Peter; Hjermstad, Marianne; Sprangers, Mirjam; Sullivan, Marianne

    2003-01-01

    In cross-national comparisons based on questionnaires, accurate translations are necessary to obtain valid results. Differential item functioning (DIF) analysis can be used to test whether translations of items in multi-item scales are equivalent to the original. In data from 10,815 respondents

  1. MEDICAL STUDENTS’ FEEDBACK ABOUT FORMATIVE ASSESSMENT PATTERN

    Directory of Open Access Journals (Sweden)

    Navajothi

    2016-03-01

    Full Text Available BACKGROUND: Pharmacology is the toughest subject in the II MBBS syllabus. Students have to memorise a great deal about drug names and classifications. We conduct internal assessment exams after completion of each system. The failure rate exceeds 60% in the internal assessments conducted during the first six months of the II MBBS course. AIM: To assess the formative assessment pattern followed in our institution using the students' feedback and to modify the pattern accordingly. SETTINGS & DESIGN: Prospective observational study conducted at the Department of Pharmacology, Government Sivagangai Medical College, Sivagangai, Tamil Nadu. MATERIALS AND METHODS: A questionnaire was prepared and distributed to the 300 students of Government Sivagangai Medical College, and feedback was collected. The data collected were analysed in Microsoft Excel 2007. RESULTS: Feedback was received from 274 students. Most (80%) of the students wanted to attend the tests in all systems. A monthly assessment test was preferred by 47% of the students. Finishing tests before holidays was preferred by 57% of the students. Most (56%) of the students preferred tests lasting 1 hour. The multiple choice question (MCQ) type, which is not a routine question pattern, was preferred by 43%. Only 7% preferred viva. Recall-type questions were preferred by 41% of the students. CONCLUSION: In our institution, internal assessment is largely conducted in line with the students' preferences. As the feedback mostly matches the pattern already followed, we will add MCQs in the forthcoming tests. Application-type questions will carry more marks than recall-type questions.

  2. Communicating Quantitative Literacy: An Examination of Open-Ended Assessment Items in TIMSS, NALS, IALS, and PISA

    Directory of Open Access Journals (Sweden)

    Karl W. Kosko

    2011-07-01

    Full Text Available Quantitative Literacy (QL) has been described as the skill set an individual uses when interacting with the world in a quantitative manner. A necessary component of this interaction is communication. To this end, assessments of QL have included open-ended items as a means of including communicative aspects of QL. The present study sought to examine whether such open-ended items typically measured aspects of quantitative communication, as compared to mathematical communication, or mathematical skills. We focused on public-released items and rubrics from four of the most widely referenced assessments: the Third International Mathematics and Science Study (TIMSS-95); the National Adult Literacy Survey (NALS; now the National Assessment of Adult Literacy, NAAL) in 1985 and 1992; the International Adult Literacy Skills (IALS) beginning in 1994; and the Program for International Student Assessment (PISA) beginning in 2000. We found that open-ended item rubrics in these QL assessments showed a strong tendency to assess answer-only responses. Therefore, while some open-ended items may have required certain levels of quantitative reasoning to find a solution, it is the solution rather than the reasoning that was often assessed.

  3. Item and test analysis to identify quality multiple choice questions (MCQs) from an assessment of medical students of Ahmedabad, Gujarat

    Directory of Open Access Journals (Sweden)

    Sanju Gajjar

    2014-01-01

    Full Text Available Background: Multiple choice questions (MCQs) are frequently used to assess students in different educational streams for their objectivity and wide reach of coverage in less time. However, the MCQs to be used must be of quality, which depends upon their difficulty index (DIF I), discrimination index (DI) and distracter efficiency (DE). Objective: To evaluate MCQs or items and develop a pool of valid items by assessing with DIF I, DI and DE, and also to revise, store or discard items based on the obtained results. Settings: The study was conducted in a medical school of Ahmedabad. Materials and Methods: An internal examination in Community Medicine was conducted after 40 hours of teaching during 1st MBBS, which was attended by 148 out of 150 students. A total of 50 MCQs or items and 150 distractors were analyzed. Statistical Analysis: Data were entered and analyzed in MS Excel 2007; simple proportions, means, standard deviations and coefficients of variation were calculated, and an unpaired t test was applied. Results: Out of 50 items, 24 had "good to excellent" DIF I (31 - 60%) and 15 had "good to excellent" DI (> 0.25). Mean DE was 88.6%, considered ideal/acceptable, and non functional distractors (NFD) were only 11.4%. Mean DI was 0.14. Poor DI (< 0.15), with negative DI in 10 items, indicates poor preparedness of students and some issues with the framing of at least some of the MCQs. An increased proportion of NFDs (incorrect alternatives selected by < 5% of students) in an item decreases DE and makes the item easier. There were 15 items with 17 NFDs, while the remaining items did not have any NFD and had a mean DE of 100%. Conclusion: The study emphasizes the selection of quality MCQs which truly assess knowledge and are able to differentiate students of different abilities in a correct manner.
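
    The three indices in this abstract can be computed for a single item roughly as follows. This is a hedged sketch, not the authors' spreadsheet: the abstract gives only the definition of a non-functional distractor (chosen by fewer than 5% of students), so the difficulty index as overall percent correct, the discrimination index as the difference in proportion correct between the top and bottom 27% of examinees, and all names and data below are assumptions.

```python
from typing import Dict, List, Sequence


def item_analysis(responses: Sequence[str], totals: Sequence[float], key: str,
                  options: Sequence[str], group_frac: float = 0.27,
                  nfd_cutoff: float = 0.05) -> Dict[str, object]:
    """Classical indices for one MCQ item.
    responses: option chosen by each examinee; totals: each examinee's total test score."""
    n = len(responses)
    # Rank examinees by total score, best first, and take the top and bottom groups.
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    g = max(1, round(n * group_frac))
    upper, lower = order[:g], order[-g:]
    p_upper = sum(responses[i] == key for i in upper) / g
    p_lower = sum(responses[i] == key for i in lower) / g
    # Difficulty index: percent of all examinees answering correctly (assumed definition).
    difficulty = 100.0 * sum(r == key for r in responses) / n
    # A distractor is non-functional if fewer than nfd_cutoff of examinees chose it.
    distractors = [o for o in options if o != key]
    nonfunctional: List[str] = [d for d in distractors
                                if sum(r == d for r in responses) / n < nfd_cutoff]
    efficiency = 100.0 * (len(distractors) - len(nonfunctional)) / len(distractors)
    return {"difficulty_pct": difficulty,
            "discrimination": p_upper - p_lower,
            "distractor_efficiency_pct": efficiency,
            "nonfunctional": nonfunctional}


# Invented data: 20 examinees, key "A", nobody chooses distractor "D".
answers = ["A"] * 10 + ["B"] * 6 + ["C"] * 4
total_scores = list(range(20, 0, -1))
print(item_analysis(answers, total_scores, key="A", options=["A", "B", "C", "D"]))
```

    With three distractors per item, as in the 50-item, 150-distractor exam described above, distractor efficiency under this definition can only take the values 0%, 33.3%, 66.7% or 100%.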

  4. Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

    Science.gov (United States)

    Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

    2016-09-01

    The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
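
    Cronbach's alpha, the internal consistency statistic reported here, can be computed from a respondents-by-items score matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below uses made-up data and a hypothetical function name; it is not the software used in the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent and one column per item."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item and of the summed scale score.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses from three people to two items.
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```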

  5. TEDS-M 2008 User Guide for the International Database. Supplement 4: TEDS-M Released Mathematics and Mathematics Pedagogy Knowledge Assessment Items

    Science.gov (United States)

    Brese, Falk, Ed.

    2012-01-01

    The goal for selecting the released set of test items was to have approximately 25% of each of the full item sets for mathematics content knowledge (MCK) and mathematics pedagogical content knowledge (MPCK) that would represent the full range of difficulty, content, and item format used in the TEDS-M study. The initial step in the selection was to…

  6. Formative assessment of GP trainees' clinical skills.

    Science.gov (United States)

    Wiener-Ogilvie, Sharon; Begg, Drummond

    2012-03-01

    Clinical skill assessment (CSA) has been an integral part of the Royal College of General Practitioners' membership examination (MRCGP) since 2008. It is an expensive, high-stakes examination with first time pass rates ranging from 76.4 to 81.3. In this paper we describe the South East Scotland Deanery, NHS Education Scotland, pilot of a formative clinical skills assessment (fCSA) using the principles of formative assessment and OSCE. The purpose of the study was to assess the acceptability of the fCSA and to examine whether trainees, identified during the fCSA as 'at risk of failing the MRCGP CSA exam', are more likely to fail the MRCGP CSA exam later on in the year. Trainees were assessed in four clinical skills stations under exam conditions. After each station they were given verbal feedback and subsequently both trainee and their trainer received written feedback. We assessed the value of the exercise through written feedback from trainees and trainers. Each trainee's performance in fCSA was triangulated with trainer assessment to identify 'flagged trainees'. We compared flagged and non-flagged trainees' performance in MRCGP CSA. Both trainees and trainers highly rated the fCSA. Overall 97% of non-flagged trainees had passed the RCGP CSA exam by May of that year in comparison to 80% of flagged trainees (P = 0.005). Trainers and trainees rated the fCSA as excellent and useful. We were able to demonstrate that the fCSA can be used to identify those trainees likely to fail the RCGP CSA. Contrary to reservations about the potential to demoralise trainees, the fCSA was viewed as a useful and a positive experience by both trainees and trainers. In addition, we suggest that feedback from fCSA was useful in triggering appropriate educational interventions. Early intervention with trainees who are predicted to fail the CSA has the potential to reduce deaneries' overall fail rate. Preventing one trainee failure could save over £30 000.

  7. Development of a self-report physical function instrument for disability assessment: item pool construction and factor analysis.

    Science.gov (United States)

    McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M; Rasch, Elizabeth K

    2013-09-01

    To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. In-person and semistructured interviews and Internet and telephone surveys. Sample of SSA claimants (n=1017) and a normative sample of adults from the U.S. general population (n=999). Not applicable. Model fit statistics. The final item pool consisted of 139 items. Within the claimant sample, 58.7% were white; 31.8% were black; 46.6% were women; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution, which included more items and allowed separate characterization of: (1) changing and maintaining body position, (2) whole body mobility, (3) upper body function, and (4) upper extremity fine motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples, respectively, were: Comparative Fit Index=.93 and .98; Tucker-Lewis Index=.92 and .98; and root mean square error approximation=.05 and .04. The factor structure of the physical function item pool closely resembled the hypothesized content model. The 4 scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  8. Single-item measure for assessing quality of life in children with drug-resistant epilepsy.

    Science.gov (United States)

    Conway, Lauryn; Widjaja, Elysa; Smith, Mary Lou

    2018-03-01

    The current study investigated the psychometric properties of a single-item quality of life (QOL) measure, the Global Quality of Life in Childhood Epilepsy question (G-QOLCE), in children with drug-resistant epilepsy. Data came from the Impact of Pediatric Epilepsy Surgery on Health-Related Quality of Life Study (PESQOL), a multicenter prospective cohort study (n = 118) with observations collected at baseline and at 6 months of follow-up on children aged 4-18 years. QOL was measured with the QOLCE-76 and KIDSCREEN-27. The G-QOLCE was an overall QOL question derived from the QOLCE-76. Construct validity and reliability were assessed with Spearman's correlation and intraclass correlation coefficient (ICC). Responsiveness was examined through distribution-based and anchor-based methods. The G-QOLCE showed moderate (r ≥ 0.30) to strong (r ≥ 0.50) correlations with composite scores, and most subscales of the QOLCE-76 and KIDSCREEN-27 at baseline and 6-month follow-up. The G-QOLCE had moderate test-retest reliability (ICC range: 0.49-0.72) and was able to detect clinically important change in patients' QOL (standardized response mean: 0.38; probability of change: 0.65; Guyatt's responsiveness statistics: 0.62 and 0.78). Caregiver anxiety and family functioning contributed most strongly to G-QOLCE scores over time. Results offer promising preliminary evidence regarding the validity, reliability, and responsiveness of the proposed single-item QOL measure. The G-QOLCE is a potentially useful tool that can be feasibly administered in a busy clinical setting to evaluate clinical status and impact of treatment outcomes in pediatric epilepsy.
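
    One of the distribution-based responsiveness statistics reported here, the standardized response mean (SRM), is the mean change score divided by the standard deviation of the change scores. A minimal sketch with invented scores follows; the function name and data are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """SRM = mean of (follow-up minus baseline) / sample SD of those change scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Invented paired scores for three patients.
print(standardized_response_mean([1, 2, 3], [2, 4, 3]))
```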

  9. Differential participation in formative assessment and achievement in introductory calculus

    OpenAIRE

    Dibbs, Rebecca-Anne

    2015-01-01

    International audience; Prior formative assessment research has shown positive achievement gains when classes using formative assessment are compared to classes that do not. However, little is known about what, if any, benefits of formative assessment occur within a class. The purpose of this study was to investigate the achievement of the students in introductory calculus using formative assessment at the two different participation levels observed in class. Although there was no significant...

  10. Work-related stress assessed by a text message single-item stress question.

    Science.gov (United States)

    Arapovic-Johansson, B; Wåhlin, C; Kwak, L; Björklund, C; Jensen, I

    2017-12-02

    Given the prevalence of work stress-related ill-health in the Western world, it is important to find cost-effective, easy-to-use and valid measures which can be used both in research and in practice. To examine the validity and reliability of the single-item stress question (SISQ), distributed weekly by short message service (SMS) and used for measurement of work-related stress. The convergent validity was assessed through associations between the SISQ and subscales of the Job Demand-Control-Support model, the Effort-Reward Imbalance model and scales measuring depression, exhaustion and sleep. The predictive validity was assessed using SISQ data collected through SMS. The reliability was analysed by the test-retest procedure. Correlations between the SISQ and all the subscales except for job strain and esteem reward were significant, ranging from -0.186 to 0.627. The SISQ could also predict sick leave, depression and exhaustion at 12-month follow-up. The analysis on reliability revealed a satisfactory stability with a weighted kappa between 0.804 and 0.868. The SISQ, administered through SMS, can be used for the screening of stress levels in a working population. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com
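
    The test-retest statistic reported in this abstract, a weighted kappa, gives partial credit when two ordinal ratings disagree by only a small amount. The abstract does not state which weighting scheme was used, so the linear default below is an assumption, and the function name and ratings are hypothetical.

```python
from typing import Hashable, Sequence


def weighted_kappa(r1: Sequence[Hashable], r2: Sequence[Hashable],
                   categories: Sequence[Hashable], weight: str = "linear") -> float:
    """Cohen's weighted kappa for two paired sets of ordinal ratings.
    categories must list every possible rating in its natural order."""
    k, n = len(categories), len(r1)
    idx = {c: i for i, c in enumerate(categories)}
    # Observed joint proportions of (first rating, second rating).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i: int, j: int) -> float:
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d  # any other value means quadratic

    observed = sum(disagreement(i, j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - observed / expected


# Invented test-retest ratings on a 1-5 scale.
week_1 = [1, 2, 3, 4, 5, 3]
week_2 = [1, 2, 4, 4, 5, 2]
print(round(weighted_kappa(week_1, week_2, categories=[1, 2, 3, 4, 5]), 3))
```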

  11. Assessing nicotine dependence in adolescent E-cigarette users: The 4-item Patient-Reported Outcomes Measurement Information System (PROMIS) Nicotine Dependence Item Bank for electronic cigarettes.

    Science.gov (United States)

    Morean, Meghan E; Krishnan-Sarin, Suchitra; S O'Malley, Stephanie

    2018-04-26

    Adolescent e-cigarette use (i.e., "vaping") likely confers risk for developing nicotine dependence. However, there have been no studies assessing e-cigarette nicotine dependence in youth. We evaluated the psychometric properties of the 4-item Patient-Reported Outcomes Measurement Information System Nicotine Dependence Item Bank for E-cigarettes (PROMIS-E) for assessing youth e-cigarette nicotine dependence and examined risk factors for experiencing stronger dependence symptoms. In 2017, 520 adolescent past-month e-cigarette users completed the PROMIS-E during a school-based survey (50.5% female, 84.8% White, 16.22[1.19] years old). Adolescents also reported on sex, grade, race, age at e-cigarette use onset, vaping frequency, nicotine e-liquid use, and past-month cigarette smoking. Analyses included conducting confirmatory factor analysis and examining the internal consistency of the PROMIS-E. Bivariate correlations and independent-samples t-tests were used to examine unadjusted relationships between e-cigarette nicotine dependence and the proposed risk factors. Regression models were run in which all potential risk factors were entered as simultaneous predictors of PROMIS-E scores. The single-factor structure of the PROMIS-E was confirmed and evidenced good internal consistency. Across models, larger PROMIS-E scores were associated with being in a higher grade, initiating e-cigarette use at an earlier age, vaping more frequently, using nicotine e-liquid (and higher nicotine concentrations), and smoking cigarettes. Adolescent e-cigarette users reported experiencing nicotine dependence, which was assessed using the psychometrically sound PROMIS-E. Experiencing stronger nicotine dependence symptoms was associated with characteristics that previously have been shown to confer risk for frequent vaping and tobacco cigarette dependence. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

    Science.gov (United States)

    Yao, Lihua; Schwarz, Richard D.

    2006-01-01

    Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…

  13. Assessing Impact, DIF, and DFF in Accommodated Item Scores: A Comparison of Multilevel Measurement Model Parameterizations

    Science.gov (United States)

    Beretvas, S. Natasha; Cawthon, Stephanie W.; Lockhart, L. Leland; Kaye, Alyssa D.

    2012-01-01

    This pedagogical article is intended to explain the similarities and differences between the parameterizations of two multilevel measurement model (MMM) frameworks. The conventional two-level MMM that includes item indicators and models item scores (Level 1) clustered within examinees (Level 2) and the two-level cross-classified MMM (in which item…

  14. An Application of Cognitive Diagnostic Assessment on TIMMS-2007 8th Grade Mathematics Items

    Science.gov (United States)

    Toker, Turker; Green, Kathy

    2012-01-01

    The least squares distance method (LSDM) was used in a cognitive diagnostic analysis of TIMSS (Trends in International Mathematics and Science Study) items administered to 4,498 8th-grade students from seven geographical regions of Turkey, extending analysis of attributes from content to process and skill attributes. Logit item positions were…

  15. A Faculty Toolkit for Formative Assessment in Pharmacy Education.

    Science.gov (United States)

    DiVall, Margarita V; Alston, Greg L; Bird, Eleanora; Buring, Shauna M; Kelley, Katherine A; Murphy, Nanci L; Schlesselman, Lauren S; Stowe, Cindy D; Szilagyi, Julianna E

    2014-11-15

    This paper aims to increase understanding and appreciation of formative assessment and its role in improving student outcomes and the instructional process, while educating faculty on formative techniques readily adaptable to various educational settings. Included are a definition of formative assessment and the distinction between formative and summative assessment. Various formative assessment strategies to evaluate student learning in classroom, laboratory, experiential, and interprofessional education settings are discussed. The role of reflective writing and portfolios, as well as the role of technology in formative assessment, are described. The paper also offers advice for formative assessment of faculty teaching. In conclusion, the authors emphasize the importance of creating a culture of assessment that embraces the concept of 360-degree assessment in both the development of a student's ability to demonstrate achievement of educational outcomes and a faculty member's ability to become an effective educator.

  16. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention, but very few studies have evaluated the effect of the hierarchical structure of data on DIF detection for polytomously scored items. Methods. Ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data from the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  17. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank

    NARCIS (Netherlands)

    Oude Voshaar, Martijn A.H.; Ten Klooster, Peter M.; Vonkeman, Harald E.; van de Laar, Mart A.F.J.

    2017-01-01

    Objective: Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Study

  18. Assessing the Straightforwardly-Worded Brief Fear of Negative Evaluation Scale for Differential Item Functioning Across Gender and Ethnicity.

    Science.gov (United States)

    Harpole, Jared K; Levinson, Cheri A; Woods, Carol M; Rodebaugh, Thomas L; Weeks, Justin W; Brown, Patrick J; Heimberg, Richard G; Menatti, Andrew R; Blanco, Carlos; Schneier, Franklin; Liebowitz, Michael

    2015-06-01

    The Brief Fear of Negative Evaluation Scale (BFNE; Leary, Personality and Social Psychology Bulletin, 9, 371-375, 1983) assesses fear and worry about receiving negative evaluation from others. Rodebaugh et al. (Psychological Assessment, 16, 169-181, 2004) found that the BFNE is composed of a reverse-worded factor (BFNE-R) and straightforwardly-worded factor (BFNE-S). Further, they found the BFNE-S to have better psychometric properties and provide more information than the BFNE-R. Currently there is a lack of research regarding the measurement invariance of the BFNE-S across gender and ethnicity with respect to item thresholds. The present study uses item response theory (IRT) to test the BFNE-S for differential item functioning (DIF) related to gender and ethnicity (White, Asian, and Black). Six data sets consisting of clinical, community, and undergraduate participants were utilized (N = 2,109). The factor structure of the BFNE-S was confirmed using categorical confirmatory factor analysis, IRT model assumptions were tested, and the BFNE-S was evaluated for DIF. Item nine demonstrated significant non-uniform DIF between White and Black participants. No other items showed significant uniform or non-uniform DIF across gender or ethnicity. Results suggest the BFNE-S can be used reliably with men and women and Asian and White participants. More research is needed to understand the implications of using the BFNE-S with Black participants.

  19. Implication of formative assessment practices among mathematics teacher

    Science.gov (United States)

    Samah, Mas Norbany binti Abu; Tajudin, Nor'ain binti Mohd

    2017-05-01

    Formative assessment within school-based assessment (SBA) is implemented in schools as a move to improve the National Education Assessment System (NEAS). Formative assessment focuses on assessment for learning. Mathematics teachers use various types of formative assessment instruments, namely observation, questioning protocols, worksheets, and quizzes. This study aims to help teachers improve their formative assessment skills during the teaching and learning (t&l) of mathematics. One mathematics teacher was chosen as the study participant. Data were collected using document analysis, observation, and interviews. Data were analyzed narratively, and the findings indicate that formative assessments can help teachers implement PBS. Formative assessment is conducted to improve students' skills in t&l effectively.

  20. Psychometric properties of a single-item scale to assess sleep quality among individuals with fibromyalgia

    Directory of Open Access Journals (Sweden)

    Sadosky Alesia B

    2009-06-01

    Background: Sleep disturbances are a common and bothersome symptom of fibromyalgia (FM). This study reports psychometric properties of a single-item scale to assess sleep quality among individuals with FM. Methods: Analyses were based on data from two randomized, double-blind, placebo-controlled trials of pregabalin (studies 1056 and 1077). In a daily diary, patients reported the quality of their sleep on a numeric rating scale ranging from 0 ("best possible sleep") to 10 ("worst possible sleep"). Test re-test reliability of the Sleep Quality Scale was evaluated by computing intraclass correlation coefficients. Pearson correlation coefficients were computed between baseline Sleep Quality scores and baseline pain diary and Medical Outcomes Study (MOS) Sleep scores. Responsiveness to treatment was evaluated by standardized effect sizes computed as the difference between least squares mean changes in Sleep Quality scores in the pregabalin and placebo groups divided by the standard deviation of Sleep Quality scores across all patients at baseline. Results: Studies 1056 and 1077 included 748 and 745 patients, respectively. Most patients were female (study 1056: 94.4%; study 1077: 94.5%) and white (study 1056: 90.2%; study 1077: 91.0%). Mean ages were 48.8 years (study 1056) and 50.1 years (study 1077). Test re-test reliability coefficients of the Sleep Quality Scale were 0.91 and 0.90 in the 1056 and 1077 studies, respectively. Pearson correlation coefficients between baseline Sleep Quality scores and baseline pain diary scores were 0.64 (p … Conclusion: These results provide evidence of the reproducibility, convergent validity, and responsiveness to treatment of the Sleep Quality Scale and provide a foundation for its further use and evaluation in FM patients.
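
The standardized effect size described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the input values, and the choice of the sample (n - 1) standard deviation are all assumptions, since the abstract does not say which estimator was used.

```python
from statistics import stdev


def standardized_effect_size(ls_change_treatment, ls_change_placebo, baseline_scores):
    """Difference in least-squares mean changes, divided by the baseline SD.

    Assumption: the baseline SD is the sample (n - 1) standard deviation
    of scores pooled across all patients.
    """
    baseline_sd = stdev(baseline_scores)
    return (ls_change_treatment - ls_change_placebo) / baseline_sd


# Hypothetical values for illustration only (not taken from studies 1056 or 1077).
baseline = [4.0, 6.0, 8.0, 6.0, 5.0, 7.0]
effect = standardized_effect_size(-2.0, -1.0, baseline)
print(round(effect, 3))
```

On a scale where higher scores mean worse sleep, a negative value here would indicate greater improvement in the treatment group than in the placebo group.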

  1. Assessment of chromium(VI) release from 848 jewellery items by use of a diphenylcarbazide spot test

    DEFF Research Database (Denmark)

    Bregnbak, David; Johansen, Jeanne D.; Hamann, Dathan

    2016-01-01

    We recently evaluated and validated a diphenylcarbazide(DPC)-based screening spot test that can detect the release of chromium(VI) ions (≥0.5 ppm) from various metallic items and leather goods (1). We then screened a selection of metal screws, leather shoes, and gloves, as well as 50 earrings......, and identified chromium(VI) release from one earring. In the present study, we used the DPC spot test to assess chromium(VI) release in a much larger sample of jewellery items (n=848), 160 (19%) of which had previously been shown to contain chromium when analysed with X-ray fluorescence spectroscopy (2)....

  2. Written Formative Assessment and Silence in the Classroom

    Science.gov (United States)

    Lee Hang, Desmond Mene; Bell, Beverley

    2015-01-01

    In this commentary, we build on Xinying Yin and Gayle Buck's discussion by exploring the cultural practices which are integral to formative assessment, when it is viewed as a sociocultural practice. First we discuss the role of assessment and in particular oral and written formative assessments in both western and Samoan cultures, building on the…

  3. Computer-based formative assessment: variables influencing feedback behaviour

    NARCIS (Netherlands)

    Timmers, C.F.

    2013-01-01

    Assessment can be used to stimulate and direct student learning. This refers to the formative function of assessment. Formative assessments contribute to learning by generating feedback. Here, feedback is conceptualised as information about learners' actual state of performance intended to modify

  4. Are reflective models appropriate for very short scales? Proofs of concept of formative models using the Ten-Item Personality Inventory.

    Science.gov (United States)

    Myszkowski, Nils; Storme, Martin; Tavani, Jean-Louis

    2018-04-27

    Because of their length and objective of broad content coverage, very short scales can show limited internal consistency and structural validity. We argue that it is because their objectives may be better aligned with formative investigations than with reflective measurement methods that capitalize on content overlap. As proofs of concept of formative investigations of short scales, we investigate the Ten-Item Personality Inventory (TIPI). In Study 1, we administered the TIPI and the Big Five Inventory (BFI) to 938 adults, and fitted a formative Multiple Indicator Multiple Causes model, which consisted of the TIPI items forming 5 latent variables, which in turn predicted the 5 BFI scores. These results were replicated in Study 2, on a sample of 759 adults, with, this time, the Revised NEO Personality Inventory (NEO-PI-R) as the external criterion. The models fit the data adequately, and moderate to strong significant effects (.37<|β|<.69, all p<.001) of all 5 latent formative variables on their corresponding BFI and NEO-PI-R scores were observed. This study presents a formative approach that we propose to be more consistent with the aims of scales with broad content and short length like the TIPI. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  5. Quantitative Literacy on the Web of Science, 2 – Mining the Health Numeracy Literature for Assessment Items

    Directory of Open Access Journals (Sweden)

    H.L. Vacher

    2009-01-01

    A topic search of the Web of Science (WoS) database using the term “numeracy” produced a bibliography of 293 articles, reviews and editorial commentaries (Oct 2008). The citation graph of the bibliography clearly identifies five benchmark papers (1995-2001), four of which developed numeracy assessment instruments. Starting with the 80 papers that cite these benchmarks, we identified a set of 25 papers (1995-2008) in which the medical research community reports the development and/or application of health-numeracy assessments. In all we found 10 assessment instruments from which we have compiled a total of 48 assessment items. There are both general and context-specific tests, with the wide range in the latter illustrated by names such as the Diabetes Numeracy Test and the Asthma Numeracy Questionnaire. There is also a Medical Data Interpretation Test and a Subjective Numeracy Scale. Much of this literature discusses the validity and reliability of the test, and many papers include item-by-item results of the tests from when they were applied in the research reported in the papers. The research that used the tests was directed at exploring such subjects as the patients’ ability to evaluate risks and benefits in order to make informed decisions; to understand and carry out instructions in order to self-manage their medical conditions; and, in research settings, to understand what the researchers were asking in their assessments (e.g., quantified quality of life) that require comparison of numerical information. We present the collection of items as a potential resource for educators interested in numeracy assessments in context.

  6. Negative affectivity in cardiovascular disease: Evaluating Type D personality assessment using item response theory

    NARCIS (Netherlands)

    Emons, Wilco H.M.; Meijer, R.R.; Denollet, Johan

    2007-01-01

    Objective: Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI)—referred to as type-D personality—are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The

  7. Calibration of context-specific survey items to assess youth physical activity behaviour.

    Science.gov (United States)

    Saint-Maurice, Pedro F; Welk, Gregory J; Bartee, R Todd; Heelan, Kate

    2017-05-01

    This study tests calibration models to re-scale context-specific physical activity (PA) items to accelerometer-derived PA. A total of 195 children in grades 4-12 wore an Actigraph monitor and completed the Physical Activity Questionnaire (PAQ) one week later. The relative time spent in moderate-to-vigorous PA (MVPA%) obtained from the Actigraph at recess, PE, lunch, after-school, evening and weekend was matched with the respective item score obtained from the PAQ. Item scores from 145 participants were calibrated against objective MVPA% using multiple linear regression with age and sex as additional predictors. Predicted minutes of MVPA for school, out-of-school and total week were tested in the remaining sample (n = 50) using equivalence testing. The results showed that PAQ β-weights ranged from 0.06 (lunch) to 4.94 (PE) MVPA% (P … PAQ and accelerometer MVPA at school and out-of-school ranged from -15.6 to +3.8 min and the PAQ was within 10-15% of accelerometer measured activity. This study demonstrated that context-specific items can be calibrated to predict minutes of MVPA in groups of youth during in- and out-of-school periods.
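
The calibration step in this abstract (regressing accelerometer MVPA% on an item score plus age and sex, then predicting for held-out participants) can be sketched with ordinary least squares. This is only an illustrative sketch under stated assumptions: the helper names `fit_ols` and `predict`, the data, and the coefficients are hypothetical, and the study's actual model specification and equivalence testing are not reproduced.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations, with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Apply fitted intercept and weights to one (item score, age, sex) row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical calibration sample: (item score, age in years, sex coded 0/1).
calibration_rows = [(1, 10, 0), (2, 12, 1), (3, 11, 0),
                    (4, 15, 1), (5, 13, 0), (2, 16, 1)]
# Hypothetical noise-free MVPA% values so the recovered weights are easy to check.
calibration_mvpa = [2 + 3 * s + 0.5 * a - 1 * x for s, a, x in calibration_rows]

coefs = fit_ols(calibration_rows, calibration_mvpa)
predicted = predict(coefs, (3, 14, 1))  # one held-out participant
```

In practice the predictions for the held-out sample would then be compared with the accelerometer values, which the study did using equivalence testing.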

  8. Re-Examining Test Item Issues in the TIMSS Mathematics and Science Assessments

    Science.gov (United States)

    Wang, Jianjun

    2011-01-01

    As the largest international study ever taken in history, the Trends in International Mathematics and Science Study (TIMSS) has been held as a benchmark to measure U.S. student performance in the global context. In-depth analyses of the TIMSS project are conducted in this study to examine key issues of the comparative investigation: (1) item flaws in mathematics…

  9. A psychometric comparison of three scales and a single-item measure to assess sexual satisfaction.

    Science.gov (United States)

    Mark, Kristen P; Herbenick, Debby; Fortenberry, J Dennis; Sanders, Stephanie; Reece, Michael

    2014-01-01

    This study was designed to systematically compare and contrast the psychometric properties of three scales developed to measure sexual satisfaction and a single-item measure of sexual satisfaction. The Index of Sexual Satisfaction (ISS), Global Measure of Sexual Satisfaction (GMSEX), and the New Sexual Satisfaction Scale-Short (NSSS-S) were compared to one another and to a single-item measure of sexual satisfaction. Conceptualization of the constructs, distribution of scores, internal consistency, convergent validity, test-retest reliability, and factor structure were compared between the measures. A total of 211 men and 214 women completed the scales and a measure of relationship satisfaction, with 33% (n = 139) of the sample reassessed two months later. All scales demonstrated appropriate distribution of scores and adequate internal consistency. The GMSEX, NSSS-S, and the single-item measure demonstrated convergent validity. Test-retest reliability was demonstrated by the ISS, GMSEX, and NSSS-S, but not the single-item measure. Taken together, the GMSEX received the strongest psychometric support in this sample for a unidimensional measure of sexual satisfaction and the NSSS-S received the strongest psychometric support in this sample for a bidimensional measure of sexual satisfaction.

  10. Formative Assessment Probes: Mountaintop Fossil: A Puzzling Phenomenon

    Science.gov (United States)

    Keeley, Page

    2015-01-01

    This column focuses on promoting learning through assessment. This month's issue describes using formative assessment probes to uncover several ways of thinking about the puzzling discovery of a marine fossil on top of a mountain.

  11. Assessing the specificity of posttraumatic stress disorder's dysphoric items within the dysphoria model.

    Science.gov (United States)

    Armour, Cherie; Shevlin, Mark

    2013-10-01

    The factor structure of posttraumatic stress disorder (PTSD) currently used by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), has received limited support. A four-factor dysphoria model is widely supported. However, the dysphoria factor of this model has been hailed as a nonspecific factor of PTSD. The present study investigated the specificity of the dysphoria factor within the dysphoria model by conducting a confirmatory factor analysis while statistically controlling for the variance attributable to depression. The sample consisted of 429 individuals who met the diagnostic criteria for PTSD in the National Comorbidity Survey. The results concluded that there was no significant attenuation in any of the PTSD items. This finding is pertinent given several proposals for the removal of dysphoric items from the diagnostic criteria set of PTSD in the upcoming DSM-5.

  12. Formative Assessment Jump-Starts a Middle Grades Differentiation Initiative

    Science.gov (United States)

    Doubet, Kristina J.

    2012-01-01

    A rural middle level school had stalled in its third year of a district-wide differentiation initiative. This article describes the way teachers and the leadership team engaged in collaborative practices to put a spotlight on formative assessment. Teachers learned to systematically gather formative assessment data from their students and to use…

  13. Leading Formative Assessment Change: A 3-Phase Approach

    Science.gov (United States)

    Northwest Evaluation Association, 2016

    2016-01-01

    If you are seeking greater student engagement and growth, you need to integrate high-impact formative assessment practices into daily instruction. Read the final article in our five-part series to find advice aimed at leaders determined to bring classroom formative assessment practices district wide. Learn: (1) what you MUST consider when…

  14. An Argument for Formative Assessment with Science Learning Progressions

    Science.gov (United States)

    Alonzo, Alicia C.

    2018-01-01

    Learning progressions--particularly as defined and operationalized in science education--have significant potential to inform teachers' formative assessment practices. In this overview article, I lay out an argument for this potential, starting from definitions for "formative assessment practices" and "learning progressions"…

  15. Evolution of a Test Item

    Science.gov (United States)

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  16. Negative affectivity and social inhibition in cardiovascular disease: evaluating type-D personality and its assessment using item response theory.

    Science.gov (United States)

    Emons, Wilco H M; Meijer, Rob R; Denollet, Johan

    2007-07-01

    Individuals with increased levels of both negative affectivity (NA) and social inhibition (SI), referred to as type-D personality, are at increased risk of adverse cardiac events. We used item response theory (IRT) to evaluate NA, SI, and type-D personality as measured by the DS14. The objectives of this study were (a) to evaluate the relative contribution of individual items to the measurement precision at the cutoff to distinguish type-D from non-type-D personality and (b) to investigate the comparability of NA, SI, and type-D constructs across the general population and clinical populations. Data from representative samples including 1316 respondents from the general population, 427 respondents diagnosed with coronary heart disease, and 732 persons suffering from hypertension were analyzed using the graded response IRT model. In Study 1, the information functions obtained in the IRT analysis showed that (a) all items had highest measurement precision around the cutoff and (b) items are most informative at the higher end of the scale. In Study 2, the IRT analysis showed that measurements were fairly comparable across the general population and clinical populations. The DS14 adequately measures NA and SI, with highest reliability in the trait range around the cutoff. The DS14 is a valid instrument to assess and compare type-D personality across clinical groups.

  17. Standard Errors for National Trends in International Large-Scale Assessments in the Case of Cross-National Differential Item Functioning

    Science.gov (United States)

    Sachse, Karoline A.; Haag, Nicole

    2017-01-01

    Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…

  18. Integrating data-based decision making, Assessment for Learning and diagnostic testing in formative assessment

    NARCIS (Netherlands)

    van der Kleij, Fabienne; Vermeulen, Jorine; Schildkamp, Kim; Eggen, Theodorus Johannes Hendrikus Maria

    2015-01-01

    Recent research has highlighted the lack of a uniform definition of formative assessment, although its effectiveness is widely acknowledged. This paper addresses the theoretical differences and similarities amongst three approaches to formative assessment that are currently most frequently discussed

  19. Formative peer assessment in a CSCL environment: A case study.

    NARCIS (Netherlands)

    Prins, Frans; Sluijsmans, Dominique; Kirschner, Paul A.; Strijbos, Jan Willem

    2007-01-01

    In this case study our aim was to gain more insight into the possibilities of qualitative formative peer assessment in a computer supported collaborative learning (CSCL) environment. An approach was chosen in which peer assessment was operationalized in assessment assignments and assessment tools that

  20. The influence of item order on intentional response distortion in the assessment of high potentials: assessing pilot applicants.

    Science.gov (United States)

    Khorramdel, Lale; Kubinger, Klaus D; Uitz, Alexander

    2014-04-01

    An experiment was conducted to investigate the effects of item order and questionnaire content on faking good or intentional response distortion. It was hypothesized that intentional response distortion would either increase towards the end of a long questionnaire, as learning effects might make it easier to adjust responses to a faking good schema, or decrease because applicants' will to distort responses is reduced if the questionnaire lasts long enough. Furthermore, it was hypothesized that certain types of questionnaire content are especially vulnerable to response distortion. Eighty-four pre-selected pilot applicants filled out a questionnaire consisting of 516 items including items from the NEO five factor inventory (NEO FFI), NEO personality inventory revised (NEO PI-R) and business-focused inventory of personality (BIP). The positions of the items were varied within the applicant sample to test if responses are affected by item order, and applicants' response behaviour was additionally compared to that of volunteers. Applicants reported significantly higher mean scores than volunteers, and results provide some evidence of decreased faking tendencies towards the end of the questionnaire. Furthermore, it could be demonstrated that lower variances or standard deviations in combination with appropriate (often higher) mean scores can serve as an indicator for faking tendencies in group comparisons, even if effects are not significant. © 2013 International Union of Psychological Science.

  1. FORMATIVE ASSESSMENT MODEL OF LEARNING SUCCESS ACHIEVEMENTS

    Directory of Open Access Journals (Sweden)

    Mikhailova Elena Konstantinovna

    2013-05-01

    The paper is devoted to the problem of assessing school students’ learning achievements. The problem is investigated from the viewpoint of assessing students’ learning outcomes, with the aim of providing teachers and students with the means and conditions to improve the educational process and results.

  2. A Multiple-Item Scale for Assessing E-Government Service Quality

    Science.gov (United States)

    Papadomichelaki, Xenia; Mentzas, Gregoris

    A critical element in the evolution of e-governmental services is the development of sites that better serve the citizens’ needs. To deliver superior service quality, we must first understand how citizens perceive and evaluate online citizen service. This involves defining what e-government service quality is, identifying its underlying dimensions, and determining how it can be conceptualized and measured. In this article we conceptualise an e-government service quality model (e-GovQual) and then we develop, refine, validate, confirm and test a multiple-item scale for measuring e-government service quality for public administration sites where citizens seek either information or services.

  3. The Formation of an Assessment Culture

    OpenAIRE

    Lundahl, Christian

    2004-01-01

    This article deals with the relation between students’ knowledge and the national policies concerning assessment of students’ knowledge. It is argued, with an historical perspective, that the way Sweden in the 1940s came to assess students’ knowledge produced a doxa of normalization as rationalization. This doxa seems badly reflected upon due to the way political, bureaucratic and scientific knowledge production on students’ knowledge is and has been organized.

  4. Differential Item Functioning in the Assessment of ADHD Symptoms Based on Gender and Rating Format

    OpenAIRE

    Arias Martínez, Benito; Universidad de Valladolid; Arias González, Víctor B.; Facultad de Psicología Universidad de Talca Chile; Gómez Sánchez, Laura Elísabet; Universidad de Oviedo; Calleja González, María Angélica Inmaculada; Universidad de Valladolid

    2012-01-01

    The aim of this study was to test the invariance of attention-deficit/hyperactivity disorder (ADHD) symptomatology across gender in a sample of 634 children. First, the fit of five factor models was tested using confirmatory factor analysis, and ordinal logistic regression was used as the method for estimating differential item functioning (DIF), both uniform and non-uniform. The results...

  5. Assessing the discriminating power of item and test scores in the linear factor-analysis model

    Directory of Open Access Journals (Sweden)

    Pere J. Ferrando

    2012-01-01

    Rigorous proposals based on a psychometric model for studying the imprecise concept of "discriminating power" are scarce and generally limited to nonlinear models for binary items. This article proposes a general framework for assessing the discriminating power of item and test scores that are calibrated with the common one-factor model. The proposal is organized around three criteria: (a) type of score, (b) range of discrimination, and (c) the specific aspect that is assessed. Within the proposed framework, (a) the relationships among 16 measures, 6 of which appear to be new, are discussed, and (b) the relationships between them are studied. The usefulness of the proposal in psychometric applications that use the factor model is illustrated with an empirical example.

  6. TWO-PARAMETER IRT MODEL APPLICATION TO ASSESS PROBABILISTIC CHARACTERISTICS OF PROHIBITED ITEMS DETECTION BY AVIATION SECURITY SCREENERS

    Directory of Open Access Journals (Sweden)

    Alexander K. Volkov

    2017-01-01

    Full Text Available The modern approaches to the aviation security screeners' efficiency have been analyzed and certain drawbacks have been considered. The main drawback is the complexity of implementing the ICAO recommendations on taking shadow x-ray image complexity factors into account during preparation and evaluation of prohibited items detection efficiency by aviation security screeners. X-ray image based factors are the specific properties of the x-ray image that influence the ability of aviation security screeners to detect prohibited items. The most important complexity factors are: geometric characteristics of a prohibited item; view difficulty of prohibited items; superposition of prohibited items by other objects in the bag; bag content complexity; the color similarity of prohibited and usual items in the luggage. The one-dimensional two-parameter IRT model and the related criterion of aviation security screeners' qualification have been suggested. Within the suggested model the probabilistic detection characteristics of aviation security screeners are considered as functions of such parameters as the difference between level of qualification and level of x-ray image complexity, and also between the aviation security screeners' responsibility and structure of their professional knowledge. On the basis of the given model it is possible to consider two characteristic functions: first, the characteristic function of qualification level, which describes the multi-complexity level of x-ray image interpretation competency of the aviation security screener; second, the characteristic function of the x-ray image complexity, which describes the range of x-ray image interpretation competency of aviation security screeners having various training levels to interpret the x-ray image of a certain level of complexity. The suggested complex criterion to assess the level of the aviation security screener qualification allows to evaluate his or
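
    As a minimal sketch of the kind of model this abstract names, the standard two-parameter logistic (2PL) item response function can be written as below. The abstract does not give its exact parameterization, so the function name and the symbols theta (screener qualification), b (image complexity) and a (discrimination) are assumptions taken from the textbook 2PL form, not from the paper.

```python
import math


def detection_probability(theta: float, b: float, a: float = 1.0) -> float:
    """Textbook 2PL item response function (assumed form, not the paper's).

    theta: latent qualification level of the screener (hypothetical scale)
    b:     complexity (difficulty) of the x-ray image on the same scale
    a:     discrimination, i.e. how sharply probability rises with theta - b
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Illustrative values only: one screener, three images of rising complexity.
    for complexity in (-1.0, 0.0, 1.0):
        p = detection_probability(theta=0.0, b=complexity, a=1.5)
        print(f"complexity={complexity:+.1f} -> P(detect)={p:.3f}")
```

    Under this form the detection probability depends only on the difference theta - b, which matches the abstract's description of detection as a function of the gap between qualification and image complexity.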

  7. Development of a simple 12-item theory-based instrument to assess the impact of continuing professional development on clinical behavioral intentions.

    Directory of Open Access Journals (Sweden)

    France Légaré

    Full Text Available Decision-makers in organizations providing continuing professional development (CPD) have identified the need for routine assessment of its impact on practice. We sought to develop a theory-based instrument for evaluating the impact of CPD activities on health professionals' clinical behavioral intentions. Our multipronged study had four phases. 1) We systematically reviewed the literature for instruments that used socio-cognitive theories to assess healthcare professionals' clinically-oriented behavioral intentions and/or behaviors; we extracted items relating to the theoretical constructs of an integrated model of healthcare professionals' behaviors and removed duplicates. 2) A committee of researchers and CPD decision-makers selected a pool of items relevant to CPD. 3) An international group of experts (n = 70) reached consensus on the most relevant items using electronic Delphi surveys. 4) We created a preliminary instrument with the items found most relevant and assessed its factorial validity, internal consistency and reliability (weighted kappa) over a two-week period among 138 physicians attending a CPD activity. Out of 72 potentially relevant instruments, 47 were analyzed. Of the 1218 items extracted from these, 16% were discarded as improperly phrased and 70% discarded as duplicates. Mapping the remaining items onto the constructs of the integrated model of healthcare professionals' behaviors yielded a minimum of 18 and a maximum of 275 items per construct. The partnership committee retained 61 items covering all seven constructs. Two iterations of the Delphi process produced consensus on a provisional 40-item questionnaire. Exploratory factorial analysis following test-retest resulted in a 12-item questionnaire. Cronbach's coefficients for the constructs varied from 0.77 to 0.85. A 12-item theory-based instrument for assessing the impact of CPD activities on health professionals' clinical behavioral intentions showed adequate validity and

  8. Guideline appraisal with AGREE II: online survey of the potential influence of AGREE II items on overall assessment of guideline quality and recommendation for use.

    Science.gov (United States)

    Hoffmann-Eßer, Wiebke; Siering, Ulrich; Neugebauer, Edmund A M; Brockhaus, Anne Catharina; McGauran, Natalie; Eikermann, Michaela

    2018-02-27

    The AGREE II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within six domains. AGREE II also includes two overall assessments (overall guideline quality, recommendation for use). Our aim was to investigate how strongly the 23 AGREE II items influence the two overall assessments. An online survey of authors of publications on guideline appraisals with AGREE II and guideline users from a German scientific network was conducted between 10th February 2015 and 30th March 2015. Participants were asked to rate the influence of the AGREE II items on a Likert scale (0 = no influence to 5 = very strong influence). The frequencies of responses and their dispersion were presented descriptively. Fifty-eight of the 376 persons contacted (15.4%) participated in the survey and the data of the 51 respondents with prior knowledge of AGREE II were analysed. Items 7-12 of Domain 3 (rigour of development) and both items of Domain 6 (editorial independence) had the strongest influence on the two overall assessments. In addition, Items 15-17 (clarity of presentation) had a strong influence on the recommendation for use. Great variations were shown for the other items. The main limitation of the survey is the low response rate. In guideline appraisals using AGREE II, items representing rigour of guideline development and editorial independence seem to have the strongest influence on the two overall assessments. In order to ensure a transparent approach to reaching the overall assessments, we suggest the inclusion of a recommendation in the AGREE II user manual on how to consider item and domain scores. For instance, the manual could include an a-priori weighting of those items and domains that should have the strongest influence on the two overall assessments. The relevance of these assessments within AGREE II could thereby be further specified.

  9. Assessing reprogramming by chimera formation and tetraploid complementation.

    Science.gov (United States)

    Li, Xin; Xia, Bao-long; Li, Wei; Zhou, Qi

    2015-01-01

    Pluripotent stem cells can be evaluated by pluripotent marker expression, embryoid body aggregation, teratoma formation, chimera contribution and, beyond these, tetraploid complementation. Whether iPS cells in general are functionally equivalent to normal ESCs is difficult to establish. Here, we present the detailed procedure for chimera formation and tetraploid complementation, the most stringent criterion, for assessing pluripotency.

  10. Item response modeling: a psychometric assessment of the children's fruit, vegetable, water, and physical activity self-efficacy scales among Chinese children.

    Science.gov (United States)

    Wang, Jing-Jing; Chen, Tzu-An; Baranowski, Tom; Lau, Patrick W C

    2017-09-16

    This study aimed to evaluate the psychometric properties of four self-efficacy scales (i.e., self-efficacy for fruit (FSE), vegetable (VSE), and water (WSE) intakes, and physical activity (PASE)) and to investigate their differences in item functioning across sex, age, and body weight status groups using item response modeling (IRM) and differential item functioning (DIF). Four self-efficacy scales were administered to 763 Hong Kong Chinese children (55.2% boys) aged 8-13 years. Classical test theory (CTT) was used to examine the reliability and factorial validity of scales. IRM was conducted and DIF analyses were performed to assess the characteristics of item parameter estimates on the basis of children's sex, age and body weight status. All self-efficacy scales demonstrated adequate to excellent internal consistency reliability (Cronbach's α: 0.79-0.91). One FSE misfit item and one PASE misfit item were detected. Small DIF was found for all the scale items across children's age groups. Items with medium to large DIF were detected in different sex and body weight status groups, which will require modification. A Wright map revealed that items covered the range of the distribution of participants' self-efficacy for each scale except VSE. Several self-efficacy scales' items functioned differently by children's sex and body weight status. Additional research is required to modify the four self-efficacy scales to minimize these moderating influences for application.
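
    Several records in this list report internal consistency as Cronbach's α. For readers unfamiliar with the statistic, the following is a minimal sketch of the textbook formula, α = k/(k-1) · (1 − Σ item variances / variance of total scores). The function name and the toy response matrix are hypothetical and are not taken from any of the studies listed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: list of respondents, each a list of k item scores (k >= 2).
    Uses sample variances throughout (statistics.variance).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical responses: 4 respondents, 3 items.
    toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 3, 4]]
    print(f"alpha = {cronbach_alpha(toy):.3f}")
```

    Alpha rises toward 1 as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of validity.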

  11. SAUDI TEACHERS’ PRACTICES OF FORMATIVE ASSESSMENT: A QUALITATIVE STUDY

    Directory of Open Access Journals (Sweden)

    Saeed Almuntasheri

    2016-12-01

    Full Text Available Shifting from teacher-centred to student-centred practices requires teachers to understand strategies to interact with students in science classes. Formative assessment strategies are a critical component of classroom interaction, where teachers obtain information about student learning wherever possible. Traditionally, however, teachers ask questions and evaluate student responses but without investigating student contributions to the classroom interaction. This qualitative study aimed at developing teachers' knowledge of formative assessment strategies when teaching science-based inquiry in Saudi Arabia. Twelve teachers were observed when teaching science, and details of one teacher's practices of formative assessment are presented in this study. A formative assessment framework that describes assessment conversations is used and modified to observe teachers' assessment practices. An assessment conversation consists of four-step cycles, where the teacher elicits information from students through questioning, the student responds, the teacher recognizes the student's response, and then uses the information to develop further inquiry. Findings indicate that teachers ask questions and receive responses but rarely allow students to share their own ideas or discuss their own thinking. The study underlines the importance of integrating formative assessment strategies during scientific inquiry teaching for professional development as a way to increase student participation and allow opportunities for students' inquiry in science classes.

  12. Connected Classroom Technology Facilitates Multiple Components of Formative Assessment Practice

    Science.gov (United States)

    Shirley, Melissa L.; Irving, Karen E.

    2015-02-01

    Formative assessment has been demonstrated to result in increased student achievement across a variety of educational contexts. When using formative assessment strategies, teachers engage students in instructional tasks that allow the teacher to uncover levels of student understanding so that the teacher may change instruction accordingly. Tools that support the implementation of formative assessment strategies are therefore likely to enhance student achievement. Connected classroom technologies (CCTs) include a family of devices that show promise in facilitating formative assessment. By promoting the use of interactive student tasks and providing both teachers and students with rapid and accurate data on student learning, CCT can provide teachers with necessary evidence for making instructional decisions about subsequent lessons. In this study, the experiences of four middle and high school science teachers in their first year of implementing the TI-Navigator™ system, a specific type of CCT, are used to characterize the ways in which CCT supports the goals of effective formative assessment. We present excerpts of participant interviews to demonstrate the alignment of CCT with several main phases of the formative assessment process. CCT was found to support implementation of a variety of instructional tasks that generate evidence of student learning for the teacher. The rapid aggregation and display of student learning evidence provided teachers with robust data on which to base subsequent instructional decisions.

  13. Formative assessment in undergraduate medical education: concept, implementation and hurdles.

    Science.gov (United States)

    Rauf, Ayesha; Shamim, Muhammad Shahid; Aly, Syed Moyn; Chundrigar, Tariq; Alam, Shams Nadeem

    2014-01-01

    Formative assessment, described as "the process of appraising, judging or evaluating students' work or performance and using this to shape and improve students' competence", is generally missing from medical schools of Pakistan. Progressive institutions conduct "formative assessment" as a fleeting part of the curriculum by using various methods that may or may not include feedback to learners. The most important factor in the success of formative assessment is the quality of feedback, shown to have the maximum impact on student accomplishment. Inclusion of formative assessment into the curriculum and its implementation will require the following: Enabling Environment, Faculty and student Training, Role of Department of Medical Education (DME). Many issues can be predicted that may jeopardize the effectiveness of formative assessment including faculty resistance, lack of motivation from students and faculty and paucity of commitment from the top administration. For improvement in medical education in Pakistan, we need to develop a system considered worthy by national and international standards. This paper will give an overview of formative assessment, its implications and recommendations for implementation in medical institutes of Pakistan.

  14. Impact of teaching and assessment format on electrocardiogram interpretation skills.

    Science.gov (United States)

    Raupach, Tobias; Hanneforth, Nathalie; Anders, Sven; Pukrop, Tobias; Th J ten Cate, Olle; Harendza, Sigrid

    2010-07-01

    Interpretation of the electrocardiogram (ECG) is a core clinical skill that should be developed in undergraduate medical education. This study assessed whether small-group peer teaching is more effective than lectures in enhancing medical students' ECG interpretation skills. In addition, the impact of assessment format on study outcome was analysed. Two consecutive cohorts of Year 4 medical students (n=335) were randomised to receive either traditional ECG lectures or the same amount of small-group, near-peer teaching during a 6-week cardiorespiratory course. Before and after the course, written assessments of ECG interpretation skills were undertaken. Whereas this final assessment yielded a considerable amount of credit points for students in the first cohort, it was merely formative in nature for the second cohort. An unannounced retention test was applied 8 weeks after the end of the cardiovascular course. A significant advantage of near-peer teaching over lectures (effect size 0.33) was noted only in the second cohort, whereas, in the setting of a summative assessment, both teaching formats appeared to be equally effective. A summative instead of a formative assessment doubled the performance increase (Cohen's d 4.9 versus 2.4), mitigating any difference between teaching formats. Within the second cohort, the significant difference between the two teaching formats was maintained in the retention test (p=0.017). However, in both cohorts, a significant decrease in student performance was detected during the 8 weeks following the cardiovascular course. Assessment format appeared to be more powerful than choice of instructional method in enhancing student learning. The effect observed in the second cohort was masked by an overriding incentive generated by the summative assessment in the first cohort. This masking effect should be considered in studies assessing the effectiveness of different teaching methods.

  15. Utilising a multi-item questionnaire to assess household food security in Australia.

    Science.gov (United States)

    Butcher, Lucy M; O'Sullivan, Therese A; Ryan, Maria M; Lo, Johnny; Devine, Amanda

    2018-03-15

    Currently, two food sufficiency questions are utilised as a proxy measure of national food security status in Australia. These questions do not capture all dimensions of food security and have been attributed to underreporting of the problem. The purpose of this study was to investigate food security using the short form of the US Household Food Security Survey Module (HFSSM) within an Australian context; and explore the relationship between food security status and multiple socio-demographic variables. Two online surveys were completed by 2334 Australian participants from November 2014 to February 2015. Surveys contained the short form of the HFSSM and twelve socio-demographic questions. Cross-tabulations, chi-square tests and a multinomial logistic regression model were employed to analyse the survey data. Food security status of the respondents was classified accordingly: High or Marginal (64%, n = 1495), Low (20%, n = 460) or Very Low (16%, n = 379). Significant independent predictors of food security were age (P ... important issue across Australia and that certain groups, regardless of income, are particularly vulnerable. Government policy and health promotion interventions that specifically target "at risk" groups may assist to more effectively address the problem. Additionally, the use of a multi-item measure is worth considering as a national indicator of food security in Australia. © 2018 Australian Health Promotion Association.

  16. Attitudes and evaluative practices: category vs. item and subjective vs. objective constructions in everyday food assessments.

    Science.gov (United States)

    Wiggins, Sally; Potter, Jonathan

    2003-12-01

    In social psychology, evaluative expressions have traditionally been understood in terms of their relationship to, and as the expression of, underlying 'attitudes'. In contrast, discursive approaches have started to study evaluative expressions as part of varied social practices, considering what such expressions are doing rather than their relationship to attitudinal objects or other putative mental entities. In this study the latter approach will be used to examine the construction of food and drink evaluations in conversation. The data are taken from a corpus of family mealtimes recorded over a period of months. The aim of this study is to highlight two distinctions that are typically obscured in traditional attitude work ('subjective' vs. 'objective' expressions, category vs. item evaluations). A set of extracts is examined to document the presence of these distinctions in talk that evaluates food and the way they are used and rhetorically developed to perform particular activities (accepting/refusing food, complimenting the food provider, persuading someone to eat). The analysis suggests that researchers (a) should be aware of the potential significance of these distinctions; (b) should be cautious when treating evaluative terms as broadly equivalent and (c) should be cautious when blurring categories and instances. This analysis raises the broader question of how far evaluative practices may be specific to particular domains, and what this specificity might consist in. It is concluded that research in this area could benefit from starting to focus on the role of evaluations in practices and charting their association with specific topics and objects.

  17. Exploring Plausible Causes of Differential Item Functioning in the PISA Science Assessment: Language, Curriculum or Culture

    Science.gov (United States)

    Huang, Xiaoting; Wilson, Mark; Wang, Lei

    2016-01-01

    In recent years, large-scale international assessments have been increasingly used to evaluate and compare the quality of education across regions and countries. However, measurement variance between different versions of these assessments often poses threats to the validity of such cross-cultural comparisons. In this study, we investigated the…

  18. Assessing cross-cultural item bias in questionnaires: Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

    OpenAIRE

    Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

    2001-01-01

    A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one, that measure the same theoretical construct. The standardized difference between the score means of the item versions, called the ?e score, gives an indication of the cultural bias of the item. As...

  19. Symptoms of anxiety in depression: assessment of item performance of the Hamilton Anxiety Rating Scale in patients with depression.

    Science.gov (United States)

    Vaccarino, Anthony L; Evans, Kenneth R; Sills, Terrence L; Kalali, Amir H

    2008-01-01

    Although diagnostically dissociable, anxiety is strongly co-morbid with depression. To examine further the clinical symptoms of anxiety in major depressive disorder (MDD), a non-parametric item response analysis on "blinded" data from four pharmaceutical company clinical trials was performed on the Hamilton Anxiety Rating Scale (HAMA) across levels of depressive severity. The severity of depressive symptoms was assessed using the 17-item Hamilton Depression Rating Scale (HAMD). HAMA and HAMD measures were supplied for each patient on each of two post-screen visits (n=1,668 observations). Option characteristic curves were generated for all 14 HAMA items to determine the probability of scoring a particular option on the HAMA in relation to the total HAMD score. Additional analyses were conducted using Pearson's product-moment correlations. Results showed that anxiety-related symptomatology generally increased as a function of overall depressive severity, though there were clear differences between individual anxiety symptoms in their relationship with depressive severity. In particular, anxious mood, tension, insomnia, difficulties in concentration and memory, and depressed mood were found to discriminate over the full range of HAMD scores, increasing continuously with increases in depressive severity. By contrast, many somatic-related symptoms, including muscular, sensory, cardiovascular, respiratory, gastro-intestinal, and genito-urinary were manifested primarily at higher levels of depression and did not discriminate well at lower HAMD scores. These results demonstrate anxiety as a core feature of depression, and the relationship between anxiety-related symptoms and depression should be considered in the assessment of depression and evaluation of treatment strategies and outcome.

  20. The Dimensional Assessment of Personality Psychopathology Basic Questionnaire: shortened versions item analysis.

    Science.gov (United States)

    Aluja, Anton; Blanch, Àngel; Blanco, Eduardo; Martí-Guiu, Maite; Balada, Ferran

    2015-01-13

    This study was designed to evaluate and replicate the psychometric properties of the Dimensional Assessment of Personality Psychopathology-Basic Questionnaire (DAPP-BQ) and the DAPP-BQ short form (DAPP-SF) in a large Spanish general population sample. Additionally, we generated a reduced form called DAPP-90, using a strategy based on a structural equation modeling (SEM) methodology in two independent samples, a calibration and a validation sample. The DAPP-90 scales obtained a more satisfactory fit on SEM adjustment values (average: TLI > .97 and RMSEA ... assessment of patients in hospital consultation or in brief psychological assessments.

  1. An approach for estimating item sensitivity to within-person change over time: An illustration using the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog).

    Science.gov (United States)

    Dowling, N Maritza; Bolt, Daniel M; Deng, Sien

    2016-12-01

    When assessments are primarily used to measure change over time, it is important to evaluate items according to their sensitivity to change, specifically. Items that demonstrate good sensitivity to between-person differences at baseline may not show good sensitivity to change over time, and vice versa. In this study, we applied a longitudinal factor model of change to a widely used cognitive test designed to assess global cognitive status in dementia, and contrasted the relative sensitivity of items to change. Statistically nested models were estimated introducing distinct latent factors related to initial status differences between test-takers and within-person latent change across successive time points of measurement. Models were estimated using all available longitudinal item-level data from the Alzheimer's Disease Assessment Scale-Cognitive subscale, including participants representing the full-spectrum of disease status who were enrolled in the multisite Alzheimer's Disease Neuroimaging Initiative. Five of the 13 Alzheimer's Disease Assessment Scale-Cognitive items demonstrated noticeably higher loadings with respect to sensitivity to change. Attending to performance change on only these 5 items yielded a clearer picture of cognitive decline more consistent with theoretical expectations in comparison to the full 13-item scale. Items that show good psychometric properties in cross-sectional studies are not necessarily the best items at measuring change over time, such as cognitive decline. Applications of the methodological approach described and illustrated in this study can advance our understanding regarding the types of items that best detect fine-grained early pathological changes in cognition. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  2. Implementing Formative Assessment in Engineering Education: The Use of the Online Assessment System Etude

    Science.gov (United States)

    Dopper, Sofia M.; Sjoer, Ellen

    2004-01-01

    This article describes the possibilities offered by the online assessment system Etude to achieve the benefits of formative assessment. In order to find out the way this works in practice, we carried out an experiment with the use of Etude for formative assessment in the course on collaborative report writing. Results show that online formative…

  3. Examination of validity of fall risk assessment items for screening high fall risk elderly among the healthy community-dwelling Japanese population

    OpenAIRE

    DEMURA, Shinichi; SATO, Susumu; YAMAJI, Shunsuke; KASUGA, Kosho; NAGASAWA, Yoshinori

    2010-01-01

    We aimed to examine the validity of fall risk assessment items for the healthy community-dwelling elderly Japanese population. Participants were 1122 healthy elderly individuals aged 60 years and over (380 males and 742 females). The percentage who had experienced a fall was 15.8%. This study used fall experience and 50 fall risk assessment items representing the five risk factors (symptoms of falling, physical function, disease and physical symptom, environment, and behavior and character), ...

  4. Psychometrical Assessment and Item Analysis of the General Health Questionnaire in Victims of Terrorism

    Science.gov (United States)

    Delgado-Gomez, David; Lopez-Castroman, Jorge; de Leon-Martinez, Victoria; Baca-Garcia, Enrique; Cabanas-Arrate, Maria Luisa; Sanchez-Gonzalez, Antonio; Aguado, David

    2013-01-01

    There is a need to assess the psychiatric morbidity that appears as a consequence of terrorist attacks. The General Health Questionnaire (GHQ) has been used to this end, but its psychometric properties have never been evaluated in a population affected by terrorism. A sample of 891 participants included 162 direct victims of terrorist attacks and…

  5. e-GovQual: A Multiple-Item Scale for Assessing e-Government Service Quality

    Science.gov (United States)

    Papadomichelaki, Xenia; Mentzas, Gregoris

    2012-01-01

    A critical element in the evolution of governmental services through the internet is the development of sites that better serve the citizens' needs. To deliver superior service quality, we must first understand how citizens perceive and evaluate online. Citizen assessment is built on defining quality, identifying underlying dimensions, and…

  6. Public policies: right to learn and formative assessment

    Directory of Open Access Journals (Sweden)

    Antonio Chizzotti

    2016-09-01

    Full Text Available This paper deals with the right to learn in school-type education and considers assessment as an assurance of teaching and learning quality. It deals with the current evaluation processes and discriminatory misconceptions of merely summative assessments, which tend to qualify students. This text evaluates the punitive bias of meritocratic grading of learning and argues that only formative assessment can ensure the right to learn.

  7. Students’ Perception on Formative and Shared Assessment: Connecting two Universities through the Blogosphere

    Directory of Open Access Journals (Sweden)

    Daniel Martos-Garcia

    2017-01-01

    Full Text Available The aim of this study was to evaluate differences in physical education students’ perception on an educational innovation based on formative and peer assessment through the blogosphere. The sample was made up of 253 students from two Spanish universities. Data was collected using a self-reported questionnaire and t tests were employed in order to find differences among students’ groups. Results show significant differences in almost all of the items on which the students were questioned. Basque students were more satisfied with the assessment tool used than the Valencian students. Students found the blogosphere more active, meaningful, functional and motivating and that it made for collaborative learning in comparison to other traditional evaluation methods. They also showed disapproval related to the demands on attendance, continuity and the greater effort required. For future occasions, negotiation about assessment criteria with the students should be implemented right at the very start of the course.

  8. Formative Assessment: Exploring Tunisian Cooperative Teachers Practices in Physical Education

    OpenAIRE

    Melki Hasan; S. Bouzid Mohamed; Haweni Aymen; Fadhloun Mourad; Mrayeh Meher; Souissi Nizar

    2017-01-01

    Purpose: This article is based on questions related to the formative assessment of the preparatory traineeship in the professional life of Physical Education teachers. In general, in the first training program, the traineeship represents an integral part of training. In this sense, the traineeship offers a vital opportunity for future teachers to gain practical experience in the real environment, given that formative evaluation is a process of collecting evidence from trainees by cooperative teac...

  9. A multidimensional assessment of the validity and utility of alcohol use disorder severity as determined by item response theory models.

    Science.gov (United States)

    Dawson, Deborah A; Saha, Tulshi D; Grant, Bridget F

    2010-02-01

    The relative severity of the 11 DSM-IV alcohol use disorder (AUD) criteria are represented by their severity threshold scores, an item response theory (IRT) model parameter inversely proportional to their prevalence. These scores can be used to create a continuous severity measure comprising the total number of criteria endorsed, each weighted by its relative severity. This paper assesses the validity of the severity ranking of the 11 criteria and the overall severity score with respect to known AUD correlates, including alcohol consumption, psychological functioning, family history, antisociality, and early initiation of drinking, in a representative population sample of U.S. past-year drinkers (n=26,946). The unadjusted mean values for all validating measures increased steadily with the severity threshold score, except that legal problems, the criterion with the highest score, was associated with lower values than expected. After adjusting for the total number of criteria endorsed, this direct relationship was no longer evident. The overall severity score was no more highly correlated with the validating measures than a simple count of criteria endorsed, nor did the two measures yield different risk curves. This reflects both within-criterion variation in severity and the fact that the number of criteria endorsed and their severity are so highly correlated that severity is essentially redundant. Attempts to formulate a scalar measure of AUD will do as well by relying on simple counts of criteria or symptom items as by using scales weighted by IRT measures of severity. Published by Elsevier Ireland Ltd.
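
    The comparison this abstract describes, a severity-weighted score versus a simple count of endorsed criteria, can be sketched as follows. The criterion names and threshold values here are hypothetical placeholders chosen for illustration; they are not the IRT estimates from the study.

```python
# Hypothetical severity thresholds for three criteria (illustrative only;
# in the study these would come from a fitted IRT model).
THRESHOLDS = {
    "tolerance": 0.8,
    "withdrawal": 1.6,
    "legal_problems": 2.9,
}


def simple_count(endorsed):
    """Number of distinct criteria endorsed."""
    return len(set(endorsed))


def weighted_severity(endorsed, thresholds):
    """Sum of severity thresholds over the distinct criteria endorsed."""
    return sum(thresholds[c] for c in set(endorsed))


if __name__ == "__main__":
    person = ["tolerance", "withdrawal"]
    print("count:", simple_count(person))
    print("weighted:", round(weighted_severity(person, THRESHOLDS), 2))
```

    Because rarer, higher-threshold criteria tend to be endorsed only by people who also endorse many others, the two scores rise together, which is consistent with the abstract's finding that weighting adds little over a simple count.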

  10. Transforming paper-based assessment forms to a digital format

    DEFF Research Database (Denmark)

    Jonasen, Tanja Svarre; Lunn, Tine Bieber; Helle, Tina

    2017-01-01

    Background: The aim of this paper is to provide the reader with an overall impression of the stepwise user-centred design approach including the specific methods used and lessons learned when transforming paper-based assessment forms into a prototype app, taking the Housing Enabler as an example....... Results: The design iterations resulted in the development of a Housing Enabler prototype app. The prototype app has several features and options that are new compared with the original paper-based Housing Enabler assessment form. These new features include a user friendly overview of the assessment form......; easy navigation by swiping back and forth between items; onsite data analysis; and ranking of the accessibility score, photo documentation and a data export facility. Conclusion: Based on the presented stepwise approach, a high-fidelity Housing Enabler prototype app was successfully developed...

  11. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior

    Science.gov (United States)

    Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia

    2016-01-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…

  12. Assessing cross-cultural item bias in questionnaires : Acculturation and the Measurement of Social Support and Family Cohesion for Adolescents

    NARCIS (Netherlands)

    Hemert, Dianne A. van; Baerveldt, Chris; Vermande, Marjolijn

    2001-01-01

    A method is presented for evaluating the presence and size of cross-cultural item biases. The examined items concern parental support and family cohesion in a Likert-type questionnaire for adolescents in The Netherlands. Each evaluated item has two versions, a collectivist and an individualistic one,

  13. OPTIONS FOR THE ASSESSMENT OF ITEMS OF FINANCIAL STATEMENTS AT NATIONAL, EUROPEAN AND INTERNATIONAL LEVEL

    Directory of Open Access Journals (Sweden)

    SILVIA SAMARA

    2010-01-01

    Full Text Available The main purpose of evaluation is to determine the financial position and the outcome of the entity’s activity. With the intensification of the globalization of economies and financial markets and the emergence of phenomena such as inflation, assessment based on current value and, in particular, on fair value has come to be used more often. The users of the financial statements must always be taken into account when selecting a basis of evaluation. Internationally, we can observe a tendency to use particular bases of evaluation so as to respond favourably to the needs of a wide range of users; a balance must be assured between the relevance of the information (its usefulness in decision-making) and its reliability (its objectivity).

  14. The 4-Item Negative Symptom Assessment (NSA-4) Instrument: A Simple Tool for Evaluating Negative Symptoms in Schizophrenia Following Brief Training.

    Science.gov (United States)

    Alphs, Larry; Morlock, Robert; Coon, Cheryl; van Willigenburg, Arjen; Panagides, John

    2010-07-01

    Objective. To assess the ability of mental health professionals to use the 4-item Negative Symptom Assessment instrument, derived from the Negative Symptom Assessment-16, to rapidly determine the severity of negative symptoms of schizophrenia. Design. Open participation. Setting. Medical education conferences. Participants. Attendees at two international psychiatry conferences. Measurements. Participants read a brief set of the 4-item Negative Symptom Assessment instructions and viewed a videotape of a patient with schizophrenia. Using the 1 to 6 4-item Negative Symptom Assessment severity rating scale, they rated four negative symptom items and the overall global negative symptoms. These ratings were compared with a consensus rating determination using frequency distributions and Chi-square tests for the proportion of participant ratings that were within one point of the expert rating. Results. More than 400 medical professionals (293 physicians, 50% with a European practice, and 55% who reported past utilization of schizophrenia ratings scales) participated. Between 82.1 and 91.1 percent of the 4-items and the global rating determinations by the participants were within one rating point of the consensus expert ratings. The differences between the percentage of participant rating scores that were within one point versus the percentage that were greater than one point different from those by the consensus experts was significant (pnegative symptoms using the 4-item Negative Symptom Assessment did not generally differ among the geographic regions of practice, the professional credentialing, or their familiarity with the use of schizophrenia symptom rating instruments. Conclusion. These findings suggest that clinicians from a variety of geographic practices can, after brief training, use the 4-item Negative Symptom Assessment effectively to rapidly assess negative symptoms in patients with schizophrenia.
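
    The agreement measure described here, the share of participant ratings falling within one scale point of the consensus rating, might be computed along these lines. The function name and the example ratings are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
def share_within_one_point(ratings, consensus):
    """Fraction of ratings no more than one scale point from the consensus."""
    if not ratings:
        raise ValueError("at least one rating is required")
    close = sum(1 for r in ratings if abs(r - consensus) <= 1)
    return close / len(ratings)


# Made-up ratings on a 1 to 6 scale, compared with a consensus rating of 4.
example_share = share_within_one_point([3, 4, 5, 6, 2], consensus=4)
```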

  15. Complement or Contamination: A Study of the Validity of Multiple-Choice Items when Assessing Reasoning Skills in Physics

    OpenAIRE

    Anders Jönsson; David Rosenlund; Fredrik Alvén

    2017-01-01

    The purpose of this study is to investigate the validity of using multiple-choice (MC) items as a complement to constructed-response (CR) items when making decisions about student performance on reasoning tasks. CR items from a national test in physics have been reformulated into MC items and students’ reasoning skills have been analyzed in two substudies. In the first study, 12 students answered the MC items and were asked to explain their answers orally. In the second study, 102 students fr...

  16. Integrating Data-Based Decision Making, Assessment for Learning and Diagnostic Testing in Formative Assessment

    Science.gov (United States)

    Van der Kleij, Fabienne M.; Vermeulen, Jorine A.; Schildkamp, Kim; Eggen, Theo J. H. M.

    2015-01-01

    Recent research has highlighted the lack of a uniform definition of formative assessment, although its effectiveness is widely acknowledged. This paper addresses the theoretical differences and similarities amongst three approaches to formative assessment that are currently most frequently discussed in educational research literature: data-based…

  17. Assessing Health Status in Inflammatory Bowel Disease using a Novel Single-Item Numeric Rating Scale

    Science.gov (United States)

    Surti, Bijal; Spiegel, Brennan; Ippoliti, Andrew; Vasiliauskas, Eric; Simpson, Peter; Shih, David; Targan, Stephan; McGovern, Dermot; Melmed, Gil Y.

    2014-01-01

    Background: Current instruments used to measure disease activity and health-related quality of life (HRQOL) in patients with Crohn’s disease (CD) and ulcerative colitis (UC) are often cumbersome, time-consuming, and expensive; although used in clinical trials, they are not convenient for clinical practice. A numeric rating scale (NRS) is a quick, inexpensive, and convenient patient-reported outcome (PRO) that can capture the patient’s overall perception of health. Aims: To assess the validity, reliability, and responsiveness of an NRS and evaluate its use in clinical practice in patients with CD and UC. Methods: We prospectively evaluated patient-reported NRS scores and measured correlations between NRS and a range of severity measures, including physician-reported NRS, Crohn’s disease activity index (CDAI), Harvey-Bradshaw index (HBI), inflammatory bowel disease questionnaire (IBDQ), and C-reactive protein (CRP) in patients with CD. Subsequently, we evaluated the correlation between the NRS and standard measures of health status (HBI or simple colitis clinical activity index [SCCAI]) and laboratory tests (sedimentation rate [ESR], CRP, and fecal calprotectin) in patients with CD and UC. Results: The patient-reported NRS showed excellent correlation with CDAI (R²=0.59, p<0.0001), IBDQ (R²=0.66, p<0.0001), and HBI (R²=0.32, p<0.0001) in patients with CD. The NRS showed poor, but statistically significant correlation with SCCAI (R²=0.25, p<0.0001) in patients with UC. The NRS did not correlate with CRP, ESR, or calprotectin. The NRS was reliable and responsive to change. Conclusions: The NRS is a valid, reliable, and responsive measure that may be useful to evaluate patients with CD and possibly UC. PMID:23250673

  18. Formative Assessment and the Classroom Teacher: Recommendations for School Psychologists

    Science.gov (United States)

    Williams, Stacy A. S.; Stenglein, Katherine

    2016-01-01

    In order for school psychologists to effectively work with teachers, it is important to understand not only the context in which they work, but to understand how educators consider and subsequently use data. Therefore, the purpose of this article is to examine how formative assessments are conceptualized in teacher training and pedagogical…

  19. Formative Assessment Design for PDA Integrated Ecology Observation

    Science.gov (United States)

    Hung, Pi-Hsia; Lin, Yu-Fen; Hwang, Gwo-Jen

    2010-01-01

    Ubiquitous computing and mobile technologies provide a new perspective for designing innovative outdoor learning experiences. The purpose of this study is to propose a formative assessment design for integrating PDAs into ecology observations. Three learning activities were conducted in this study. An action research approach was applied to…

  20. Formative Assessment, Communication Skills and ICT in Initial Teacher Training

    Science.gov (United States)

    Romero-Martín, M. Rosario; Castejón-Oliva, Francisco-Javier; López-Pastor, Víctor-Manuel; Fraile-Aranda, Antonio

    2017-01-01

    The purpose of this study is to analyze the perception of students, graduates, and lecturers in relation to systems of formative and shared assessment and to the acquisition of teaching competences regarding communication and the use of Information and Communications Technology (ICT) in initial teacher education (ITE) on degrees in Primary…

  1. A Step-by-Step Study of Formative Assessment

    Science.gov (United States)

    Pietsch, Laura

    2013-01-01

    This article presents a guide to the development of formative assessments for school librarians participating in professional learning communities (PLC). It describes librarians' reading of assigned books, meeting with their PLCs, and incorporation of learned strategies in their daily instruction. Central library service readers' regular visits to…

  2. Formative assessment in teacher talk during lesson studies

    NARCIS (Netherlands)

    van Halem, Nicolette; Goei, Sui Lin; Akkerman, Sanne F.

    2016-01-01

    Purpose: The purpose of this paper is to evaluate the extent of systematic examination of students’ educational (support) needs by teachers participating in lesson study (LS) meetings within a framework of formative assessment (FA). Design/methodology/approach: The study took place in the context of

  3. Tough Choices in Designing a Formative Assessment System

    Science.gov (United States)

    Sharkey, Nancy S.; Murnane, Richard J.

    2006-01-01

    A growing number of school districts in the United States are introducing formative assessment systems to measure student skills in core subjects throughout the year. The underlying logic is that providing teachers with timely information on student skills will enable them to improve instruction and better prepare students to excel on high-stakes,…

  4. Formative Assessment and the Design of Instructional Systems.

    Science.gov (United States)

    Sadler, D. Royce

    1989-01-01

    Discusses the nature and function of formative assessment in the development of students' expertise for evaluating the quality of their own work. Highlights include the transition from teacher-supplied feedback to learner self-monitoring; qualitative judgments; communicating standards to students; multicriterion judgments; and implications for the…

  5. Developing classroom formative assessment in dutch primary mathematics education

    NARCIS (Netherlands)

    van den Berg, M.; Harskamp, E.G.; Suhre, C.J.M.

    2016-01-01

    In the last two decades Dutch primary school students scored below expectation in international mathematics tests. An explanation for this may be that teachers fail to adequately assess their students’ understanding of learning goals and provide timely feedback. To improve the teachers’ formative

  6. An online formative assessment tool to prepare students for ...

    African Journals Online (AJOL)

    Methods. Our e-learning initiative, eQuip, is a custom-built e-learning platform specifically created to align question types included in the program to be similar to those used in current assessments. We describe our formative e-learning system and present preliminary results after the first year of introduction, reporting on the ...

  7. Proposta de um instrumento de medida para avaliar a satisfação de clientes de bancos utilizando a Teoria da Resposta ao Item / Proposal of tool to assess the satisfaction of bank customers using the Item Response Theory

    Directory of Open Access Journals (Sweden)

    Alceu Balbim Junior

    2011-01-01

    Full Text Available This paper presents a measurement instrument for assessing the satisfaction of bank customers using Item Response Theory (IRT). Satisfying customers is a constant pursuit of organizations seeking to remain competitive in the market. Studies have found a relationship between the quality perceived by customers, satisfaction, and loyalty. Satisfaction can be assessed through the quality perceived by customers, and the construction of assessment tools should address specific features of the activity in question. Based on articles that assess the satisfaction of bank customers, an instrument consisting of 29 items is proposed. The items were administered to 240 customers to assess their satisfaction with the bank with which they had the closest relationship. Using Item Response Theory, the item parameters and the information curve were identified. Analysis of the items' degree of discrimination indicated that all are appropriate. The information curve obtained showed the interval in which the instrument provides the best estimates of satisfaction levels. The study reported the mean satisfaction level of the sample and the concentration of customers at the different satisfaction levels of the scale.

  8. A FORMATIVE ASSESSMENT MODEL OF CRITICAL THINKING IN MATHEMATICS LEARNING IN JUNIOR HIGH SCHOOL

    Directory of Open Access Journals (Sweden)

    R. Rosnawati

    2015-12-01

    Full Text Available This study aims to obtain a valid and reliable formative evaluation model of critical thinking. The research used a research and development method that integrated Borg & Gall's model and Plomp's development model. The ten steps of Borg & Gall’s model were condensed into five stages corresponding to the stages of Plomp's model. The subjects in this study were 1,446 junior high school students in DIY, 14 mathematics teachers, and six experts. Content validity was established by expert judgment, empirical validity and reliability were assessed with factor loadings, item analysis used PCM 1PL, and the relationship between disposition and critical thinking skill was examined with structural equation modeling (SEM). The developed formative evaluation model is a procedural model. There are five aspects of critical thinking skill: mathematical reasoning, interpretation, analysis, evaluation, and inference, which together comprise 42 items. The validity of the critical thinking skill instrument is significant, as indicated by the lowest and highest factor loadings of 0.38 and 0.74, respectively, and the reliability of every aspect is in a good category. The average level of difficulty is 0.00 with a standard deviation of 0.45, which is in a good category. The peer assessment questionnaire of critical thinking disposition consists of seven aspects: truth-seeking, open-minded, analysis, systematic, self-confidence, inquisitiveness, and maturity, with 23 items. The validity of the critical thinking disposition instrument is significant, as indicated by the lowest and highest factor loadings of 0.66 and 0.76, respectively, and the reliability of every aspect is in a good category. Based on the structural equation model analysis, the model fits the data.

  9. Formative Assessment: Exploring Tunisian Cooperative Teachers Practices in Physical Education

    Directory of Open Access Journals (Sweden)

    Melki Hasan

    2017-10-01

    Full Text Available Purpose: This article addresses questions related to the formative assessment of the preparatory traineeship in the professional life of Physical Education teachers. In general, in the initial training program, the traineeship represents an integral part of training. In this sense, the traineeship offers a vital opportunity for future teachers to gain practical experience in a real environment, given that formative evaluation is a process in which cooperative teachers collect evidence from trainees to make decisions about their knowledge and skills, to guide their own instructional activities and to control their behavior. Accordingly, this study set out to explore the practices of Tunisian cooperative teachers in relation to formative assessment. Material: We conducted the research using a questionnaire distributed among 96 cooperative teachers in different educational institutions located in the greater Tunis region. During the 2015-2016 school year, the questionnaire responses were analyzed statistically using frequencies and percentages. Results: The analysis revealed a range of formative assessment practices among cooperative teachers. In particular, each teacher acknowledged the value of guiding and encouraging students’ self-assessment, so that students could assume a share of the evaluative activity. Conclusion: Both theoretical and practical implications of these findings are discussed, and some recommendations are made for future practice.

  10. THE POTENTIAL OF BIOCHEMISTRY EDUCATION APPS IN THE FORMATIVE ASSESSMENT

    Directory of Open Access Journals (Sweden)

    M. L. Oliveira

    2015-08-01

    Full Text Available Introduction and objectives: Apps can be designed to provide usage data, and most of them do. These data are usually used to map users' interests and to deliver more effective ads that are more likely to result in clicks and sales. We applied some of these metrics to understand how they can be used to map students’ behavior and to promote formative assessment using educational software. The purpose of a formative assessment is to monitor student learning to provide ongoing feedback that can be used by instructors and students to improve the teaching and learning process. Thus, this modality aims to help both students and instructors to identify strengths and weaknesses that need to be developed. This study aimed to describe the potential of educational apps in the formative assessment process. Material and Methods: We implemented assessment tools embedded in three apps (ARMET, The Cell and 3D Class) used to teach: 1) metabolic pathways; 2) the scale of cellular structures; and 3) concepts from techniques used in a Biochemistry Lab course. The implemented tools make it possible to verify which issues produced recurring mistakes, the total number of mistakes, which questions students answered correctly most often, how long they took to perform the activity, and other relevant information. Results and conclusion: Educational apps can provide transparent and coherent evaluation metrics that enable instructors to systematize more consistent criteria and indicators, reducing the subjectivity of the formative assessment process and the time spent on preparing, tabulating and analyzing assessment data. This approach allows instructors to understand better where students struggle and to give them more effective feedback. It also helps instructors to plan interventions that help students perform better and achieve the learning objectives.

  11. A Third-Order Item Response Theory Model for Modeling the Effects of Domains and Subdomains in Large-Scale Educational Assessment Surveys

    Science.gov (United States)

    Rijmen, Frank; Jeon, Minjeong; von Davier, Matthias; Rabe-Hesketh, Sophia

    2014-01-01

    Second-order item response theory models have been used for assessments consisting of several domains, such as content areas. We extend the second-order model to a third-order model for assessments that include subdomains nested in domains. Using a graphical model framework, it is shown how the model does not suffer from the curse of…

  12. Short Scales for the Assessment of Personality Traits: Development and Validation of the Portuguese Ten-Item Personality Inventory (TIPI).

    Science.gov (United States)

    Nunes, Andreia; Limpo, Teresa; Lima, César F; Castro, São Luís

    2018-01-01

    The importance of quickly assessing personality traits in many studies prompted the development of brief scales such as the Ten-Item Personality Inventory (TIPI), a measure of five personality traits (extraversion, agreeableness, conscientiousness, emotional stability, and openness). In the current study, we present the Portuguese version of TIPI and examine its psychometric properties, based on a sample of 333 Portuguese adults aged 18 to 65 years. The results revealed reliability coefficients similar to the original version (α = 0.39-0.72), very good 4-week test-retest reliability (n = 81, rs > 0.71), expected factorial structure, high convergent validity with the Big-Five Inventory (rs > 0.60), and correlations with self-esteem, affect, and aggressiveness similar to those found with standard measures of personality traits. Overall, our findings suggest that the Portuguese TIPI is a reliable and valid alternative to longer measures: it offers a promising tool for research contexts in which the available time for personality assessment is highly limited.

  13. Short Scales for the Assessment of Personality Traits: Development and Validation of the Portuguese Ten-Item Personality Inventory (TIPI)

    Science.gov (United States)

    Nunes, Andreia; Limpo, Teresa; Lima, César F.; Castro, São Luís

    2018-01-01

    The importance of quickly assessing personality traits in many studies prompted the development of brief scales such as the Ten-Item Personality Inventory (TIPI), a measure of five personality traits (extraversion, agreeableness, conscientiousness, emotional stability, and openness). In the current study, we present the Portuguese version of TIPI and examine its psychometric properties, based on a sample of 333 Portuguese adults aged 18 to 65 years. The results revealed reliability coefficients similar to the original version (α = 0.39–0.72), very good 4-week test–retest reliability (n = 81, rs > 0.71), expected factorial structure, high convergent validity with the Big-Five Inventory (rs > 0.60), and correlations with self-esteem, affect, and aggressiveness similar to those found with standard measures of personality traits. Overall, our findings suggest that the Portuguese TIPI is a reliable and valid alternative to longer measures: it offers a promising tool for research contexts in which the available time for personality assessment is highly limited. PMID:29674989

  14. Using Procedure Based on Item Response Theory to Evaluate Classification Consistency Indices in the Practice of Large-Scale Assessment

    Directory of Open Access Journals (Sweden)

    Shanshan Zhang

    2017-09-01

    Full Text Available Despite growing interest in methods for evaluating classification consistency (CC) indices, only a few studies have applied these methods in the practice of large-scale educational assessment. In addition, only a few studies have considered the influence of practical factors, for example the examinee ability distribution, the cut score location and the score scale, on the performance of CC indices. Using the newly developed Lee's procedure based on item response theory (IRT), the main purpose of this study is to investigate the performance of CC indices when practical factors are taken into consideration. A simulation study and an empirical study were conducted under comprehensive conditions. Results suggested that with a negatively skewed distribution, the CC indices were larger than with other distributions. Interactions occurred among ability distribution, cut score location, and score scale. Consequently, Lee's IRT procedure can be used reliably in large-scale educational assessment, but the reported indices should be interpreted with caution because testing conditions may vary considerably.

  15. Analysis of Item-Level Bias in the Bayley-III Language Subscales: The Validity and Utility of Standardized Language Assessment in a Multilingual Setting.

    Science.gov (United States)

    Goh, Shaun K Y; Tham, Elaine K H; Magiati, Iliana; Sim, Litwee; Sanmugam, Shamini; Qiu, Anqi; Daniel, Mary L; Broekman, Birit F P; Rifkin-Graboi, Anne

    2017-09-18

    The purpose of this study was to improve standardized language assessments among bilingual toddlers by investigating and removing the effects of bias due to unfamiliarity with cultural norms or a distributed language system. The Expressive and Receptive Bayley-III language scales were adapted for use in a multilingual country (Singapore). Differential item functioning (DIF) was applied to data from 459 two-year-olds without atypical language development. This involved investigating if the probability of success on each item varied according to language exposure while holding latent language ability, gender, and socioeconomic status constant. Associations with language, behavioral, and emotional problems were also examined. Five of 16 items showed DIF, 1 of which may be attributed to cultural bias and another to a distributed language system. The remaining 3 items favored toddlers with higher bilingual exposure. Removal of DIF items reduced associations between language scales and emotional and language problems, but improved the validity of the expressive scale from poor to good. Our findings indicate the importance of considering cultural and distributed language bias in standardized language assessments. We discuss possible mechanisms influencing performance on items favoring bilingual exposure, including the potential role of inhibitory processing.

  16. Exploring Formative Assessment Using Cultural Historical Activity Theory

    Directory of Open Access Journals (Sweden)

    Mandy Asghar

    2013-02-01

    Full Text Available Formative assessment is a pedagogic practice that has been the subject of much research and debate as to how it can be used most effectively to deliver enhanced student learning in the higher education setting. Often described as a complex concept, it embraces activities that range from facilitating students' understanding of assessment standards to providing formative feedback on their work, and from very informal opportunities to engage in conversations to the very formal process of submitting drafts of work. This study aims to show how cultural historical activity theory can be used as a qualitative analysis framework to explore the complexities of formative assessment as it is used in higher education. The original data for the research were collected in 2008 through semi-structured interviews and analysed using a hermeneutic phenomenological approach. For the present paper, three selected transcripts were re-examined using a case study approach that sought to understand and compare the perceptions of five academic staff from three distinct subject areas taught within a UK university. It is proposed that using activity theory can provide insight into the complexity of such experiences, into what teachers do and why, and into the influence of the community in which they are situated. The cases from each subject area were analysed individually using activity theory, exploring how the mediating artefacts of formative assessment were used, the often implicit rules that governed their use, and the roles of teachers and students within the local subject community. The analysis also considered the influence each aspect of the unit of activity had on the others in understanding formative assessment practice. Subsequently the three subject cases were compared and contrasted. The findings illuminate a variety of practices, including how students and staff engage together in formative assessment activities and for some, how dialogue is used as one of the key tools

  17. Assessing Psycho-social Barriers to Rehabilitation in Injured Workers with Chronic Musculoskeletal Pain: Development and Item Properties of the Yellow Flag Questionnaire (YFQ).

    Science.gov (United States)

    Salathé, Cornelia Rolli; Trippolini, Maurizio Alen; Terribilini, Livio Claudio; Oliveri, Michael; Elfering, Achim

    2018-06-01

    Purpose: To develop a multidimensional scale to assess psychosocial beliefs, the Yellow Flag Questionnaire (YFQ), aimed at guiding interventions for workers with chronic musculoskeletal (MSK) pain. Methods: Phase 1 consisted of item selection based on literature search, item development and expert consensus rounds. In phase 2, items were reduced by calculating a quality score per item, using structural equation modeling and confirmatory factor analysis on data from 666 workers. In phase 3, Cronbach's α and Pearson correlation coefficients were computed to compare the YFQ score with disability, anxiety, depression and self-efficacy, based on data from 253 injured workers. Regressions of YFQ total score on disability, anxiety, depression and self-efficacy were calculated. Results: After phase 1, the YFQ included 116 items and 15 domains. Applying the item quality criteria in phase 2 reduced the total to 48 items. Phase factor analysis with structural equation modeling confirmed 32 items in seven domains: activity, work, emotions, harm & blame, diagnosis beliefs, co-morbidity and control. Cronbach's α was 0.91 for the total score and between 0.49 and 0.81 for the seven domain scores. Correlations of the YFQ total score with disability, anxiety, depression and self-efficacy were .58, .66, .73 and -.51, respectively. After controlling for age and gender, the YFQ total score explained between 27% and 53% of the variance (R²) of disability, anxiety, depression and self-efficacy. Conclusions: The YFQ, a multidimensional screening scale, is recommended for assessing psychosocial beliefs of workers with chronic MSK pain. Further evaluation of measurement properties such as test-retest reliability, responsiveness and prognostic validity is warranted.
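
    The internal-consistency statistic reported in this abstract, Cronbach's α, can be computed from a respondents-by-items score table as sketched below. This is a generic textbook formulation using sample variances, not the authors' analysis code, and the example scores are made up.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of numeric scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up scores: four respondents answering three items.
scores = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
alpha = cronbach_alpha(scores)
```

    The function raises `ZeroDivisionError` if every respondent has the same total score, since α is undefined in that case.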

  18. Gender-Based Differential Item Performance in Mathematics Achievement Items.

    Science.gov (United States)

    Doolittle, Allen E.; Cleary, T. Anne

    1987-01-01

    Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)

  19. Encouraging formative assessments of leadership for foundation doctors.

    Science.gov (United States)

    Hadley, Lindsay; Black, David; Welch, Jan; Reynolds, Peter; Penlington, Clare

    2015-08-01

    Clinical leadership is considered essential for maintaining and improving patient care and safety in the UK, and is incorporated in the curriculum for all trainee doctors. Despite the growing focus on the importance of leadership, and the introduction of the Medical Leadership Competency Framework (MLCF) in the UK, leadership education for doctors in training is still in its infancy. Assessment is focused on clinical skills, and trainee doctors receive very little formal feedback on their leadership competencies. In this article we describe the approach taken by Health Education Kent, Sussex and Surrey (HEKSS) to raise the profile of leadership amongst doctors in training in the South Thames Foundation School (STFS). An annual structured formative assessment in leadership for each trainee has been introduced, supported by leadership education for both trainees and their supervisors in HEKSS trusts. We analysed over 500 of these assessments from the academic year 2012/13 for foundation doctors in HEKSS trusts, in order to assess the quality of the feedback. From the analysis, potential indicators of more effective formative assessments were identified. These may be helpful in improving the leadership education programme for future years. There is a wealth of evidence to highlight the importance and value of formative assessments; however, particularly for foundation doctors, these have typically been focused on assessing clinical capabilities. This HEKSS initiative encourages doctors to recognise leadership opportunities at the beginning of their careers, seeks to help them understand the importance of acquiring leadership skills and provides structured feedback to help them improve. Leadership education for doctors in training is still in its infancy. © 2015 John Wiley & Sons Ltd.

  20. FORMATIVE ASSESSMENT IN DISTANCE EDUCATION: ENHANCING LEARNING THROUGH DIARIES

    Directory of Open Access Journals (Sweden)

    Christiane Heemann

    2015-12-01

    Assessment integrates the teaching and learning process and always has room for discussion in educational processes, requiring technical preparation and observation capacity from those involved. According to Perrenoud (2014), assessment for learning is a mediator in the process of curriculum construction and is closely related to the management of learning by the students. Assessment methods occupy a very important space in pedagogical practices, since assessment cannot be an act that expresses only a quantitative and formal concept. In Distance Education (DE), formative assessment also needs to be prioritized, avoiding traditional evaluation performed through multiple-choice tests with self-correction. The use of diaries in Distance Education maintains the focus on the evaluation process and not only on the product, configuring itself as a permanent orientation of learning, both for the teacher and for the student, who jointly assume reciprocal commitments. This article presents an experiment conducted with diaries on an undergraduate course offered by Universidade Aberta do Brasil (UAB) as a means of formative assessment in Distance Education.

  1. Assessing Psychopathy Among Justice Involved Adolescents with the PCL: YV: An Item Response Theory Examination Across Gender

    Science.gov (United States)

    Tsang, Siny; Schmidt, Karen M.; Vincent, Gina M.; Salekin, Randall T.; Moretti, Marlene M.; Odgers, Candice L.

    2014-01-01

    This study used an item response theory (IRT) model and a large adolescent sample of justice involved youth (N = 1,007, 38% female) to examine the item functioning of the Psychopathy Checklist – Youth Version (PCL: YV). Items that were most discriminating (or most sensitive to changes) of the latent trait (thought to be psychopathy) among adolescents included “Glibness/superficial charm”, “Lack of remorse”, and “Need for stimulation”, whereas items that were least discriminating included “Pathological lying”, “Failure to accept responsibility”, and “Lacks goals.” The items “Impulsivity” and “Irresponsibility” were the most likely to be rated high among adolescents, whereas “Parasitic lifestyle”, and “Glibness/superficial charm” were the most likely to be rated low. Evidence of differential item functioning (DIF) on four of the 13 items was found between boys and girls. “Failure to accept responsibility” and “Impulsivity” were endorsed more frequently to describe adolescent girls than boys at similar levels of the latent trait, and vice versa for “Grandiose sense of self-worth” and “Lacks goals.” The DIF findings suggest that four PCL: YV items function differently between boys and girls. PMID:25580672

  2. Investigating the Dynamics of Formative Assessment: Relationships between Teacher Knowledge, Assessment Practice and Learning

    Science.gov (United States)

    Herman, Joan; Osmundson, Ellen; Dai, Yunyun; Ringstaff, Cathy; Timms, Michael

    2015-01-01

    This exploratory study of elementary school science examines questions central to policy, practice and research on formative assessment: What is the quality of teachers' content-pedagogical and assessment knowledge? What is the relationship between teacher knowledge and assessment practice? What is the relationship between teacher knowledge,…

  3. Engineering Faculty Motivation for and Engagement in Formative Assessment

    OpenAIRE

    Stanton, Kenneth C.

    2011-01-01

    The purposes of this study were to conduct an exploratory study of the status quo of engineering faculty motivation for and engagement in formative assessment, and to conduct a preliminary validation of a motivational model, based in self-determination theory, that explains relationships between these variables. To do so, a survey instrument was first developed and validated, in accordance with a process prescribed in the literature, that measured individual engineering faculty members’ mo...

  4. A study of the psychometric properties of 12-item World Health Organization Disability Assessment Schedule 2.0 in a large population of people with chronic musculoskeletal pain.

    Science.gov (United States)

    Saltychev, Mikhail; Bärlund, Esa; Mattie, Ryan; McCormick, Zachary; Paltamaa, Jaana; Laimi, Katri

    2017-02-01

    To assess the validity of the Finnish translation of the 12-item World Health Organization Disability Assessment Schedule (WHODAS 2.0). Cross-sectional cohort survey study. Physical and Rehabilitation Medicine outpatient university clinic. The 501 consecutive patients with chronic musculoskeletal pain. Exploratory factor analysis and a graded response model using item response theory analysis were used to assess the constructs and discrimination ability of WHODAS 2.0. The exploratory factor analysis revealed two retained factors with eigenvalues 5.15 and 1.04. Discrimination ability of all items was high or perfect, varying from 1.2 to 2.5. The difficulty levels of seven out of 12 items were shifted towards the elevated disability level. As a result, the entire test characteristic curve showed a shift towards higher levels of disability, placing it at the point of disability level of +1 (where 0 indicates the average level of disability within the sample). The present data indicate that the Finnish translation of the 12-item WHODAS 2.0 is a valid instrument for measuring restrictions of activity and participation among patients with chronic musculoskeletal pain.
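
The graded response model named in this abstract can be illustrated with a minimal sketch. It assumes Samejima's logistic form, in which the probability of responding in category k or higher is 1 / (1 + exp(-a(θ - b_k))). The function name and the threshold values in the usage note are hypothetical and are not taken from the study; only the discrimination range (1.2 to 2.5) comes from the abstract.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level (here: disability level)
    a          -- item discrimination parameter
    thresholds -- ordered category thresholds b_1 < b_2 < ... (item "difficulty")
    """
    # Cumulative probabilities P(X >= k); by definition P(X >= 0) = 1
    # and the probability of exceeding the top category is 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
```

For instance, `grm_category_probs(0.0, 1.2, [0.5, 1.0, 1.5, 2.0])` describes a hypothetical five-category item whose thresholds all lie above the sample average, the kind of shift toward higher disability levels that the abstract reports for seven of the 12 items.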

  5. Item level diagnostics and model - data fit in item response theory ...

    African Journals Online (AJOL)

    Item response theory (IRT) is a framework for modeling and analyzing item response data. Item-level modeling gives IRT advantages over classical test theory. The fit of an item score pattern to an item response theory (IRT) model is a necessary condition that must be assessed for further use of item and models that best fit ...

  6. Validation of a 4-item Negative Symptom Assessment (NSA-4): a short, practical clinical tool for the assessment of negative symptoms in schizophrenia.

    Science.gov (United States)

    Alphs, Larry; Morlock, Robert; Coon, Cheryl; Cazorla, Pilar; Szegedi, Armin; Panagides, John

    2011-06-01

    The 16-item Negative Symptom Assessment (NSA-16) scale is a validated tool for evaluating negative symptoms of schizophrenia. The psychometric properties and predictive power of a four-item version (NSA-4) were compared with the NSA-16. Baseline data from 561 patients with predominant negative symptoms of schizophrenia who participated in two identically designed clinical trials were evaluated. Ordered logistic regression analysis of ratings using NSA-4 and NSA-16 were compared with ratings using several other standard tools to determine predictive validity and construct validity. Internal consistency and test-retest reliability were also analyzed. NSA-16 and NSA-4 scores were both predictive of scores on the NSA global rating (odds ratio = 0.83-0.86) and the Clinical Global Impressions-Severity scale (odds ratio = 0.91-0.93). NSA-16 and NSA-4 showed high correlation with each other (Pearson r = 0.85), similar high correlation with other measures of negative symptoms (demonstrating convergent validity), and lesser correlations with measures of other forms of psychopathology (demonstrating divergent validity). NSA-16 and NSA-4 both showed acceptable internal consistency (Cronbach α, 0.85 and 0.64, respectively) and test-retest reliability (intraclass correlation coefficient, 0.87 and 0.82). This study demonstrates that NSA-4 offers accuracy comparable to the NSA-16 in rating negative symptoms in patients with schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.
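
Internal consistency in this and several other records is reported as Cronbach's α. As a rough sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores), the function below works on a respondents-by-items table. The function name and input layout are illustrative assumptions, and the abstract does not say which software the authors used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

All else being equal, α tends to fall as items are removed, which is one reason a shorter scale such as the NSA-4 can show a lower value than its 16-item parent.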

  7. Student’s Video Production as Formative Assessment

    Directory of Open Access Journals (Sweden)

    Eduardo Gama

    2017-04-01

    Learning assessments are a subject of discussion in both their theoretical and practical approaches. The process of measuring learning in physics by high school students, either qualitatively or quantitatively, is one in which it should be possible to identify not only the concepts and contents students failed to achieve but also the reasons for the failure. We propose that students’ video production offers a very effective formative assessment tool to teachers: as a formative assessment, it produces information that allows the understanding of where and when the learning process succeeded or failed, of identifying, as a subject or as a group, the deficiencies or misunderstandings related to the theme under analysis and their interpretation by students, and it also provides a different kind of assessment, related to some other life skills, such as the ability to carry a project through to its conclusion and to work cooperatively. In this paper, we describe the use of videos produced by high school students as an assessment resource. The students were asked to prepare a short video, which was then presented to the whole group and discussed. The videos reveal aspects of students’ difficulties that usually do not appear in formal assessments such as tests and questionnaires. After the use of the videos as a component of classroom assessments and the use of the discussions to rethink learning activities in the group, the videos were analysed and classified in various categories. This analysis showed a strong correlation between the technical quality of the video and the content quality of the students’ argumentation. Also, it was shown that the students do not prepare their video based on quick and easy production; they usually choose forms of video production that require careful planning and implementation, and this reflects directly on the overall quality of the video and of the learning process.

  8. Technology-Enhanced Formative Assessment of Plant Identification

    Science.gov (United States)

    Conejo, Ricardo; Garcia-Viñas, Juan Ignacio; Gastón, Aitor; Barros, Beatriz

    2016-04-01

    Developing plant identification skills is an important part of the curriculum of any botany course in higher education. Frequent practice with dried and fresh plants is necessary to recognize the diversity of forms, states, and details that a species can present. We have developed a web-based assessment system for mobile devices that is able to pose appropriate questions according to the location of the student. A student's location can be obtained using the device position or by scanning a QR code attached to a dried plant sheet in a herbarium or to a fresh plant in an arboretum. The assessment questions are complemented with elaborated feedback that, according to the students' responses, provides indications of possible mistakes and correct answers. Three experiments were designed to measure the effectiveness of the formative assessment using dried and fresh plants. Three questionnaires were used to evaluate the system performance from the students' perspective. The results clearly indicate that formative assessment is objectively effective compared to traditional methods and that the students' attitudes towards the system were very positive.

  9. Structured assessment format for evaluating operative reports in general surgery.

    Science.gov (United States)

    Vergis, Ashley; Gillman, Lawrence; Minor, Samuel; Taylor, Mark; Park, Jason

    2008-01-01

    Despite its multifaceted importance, no validated or reliable tools assess the quality of the dictated operative note. This study determined the construct validity, interrater reliability, and internal consistency of a Structured Assessment Format for Evaluating Operative Reports (SAFE-OR) in general surgery. SAFE-OR was developed by using consensus criteria set forth by the Canadian Association of General Surgeons. This instrument includes a structured assessment and a global quality rating scale. Residents divided into novice and experienced groups viewed and dictated a videotaped laparoscopic sigmoid colectomy. Blinded, independent faculty evaluators graded the transcribed reports using SAFE-OR. Twenty-one residents participated in the study. Mean structured assessment scores (out of 44) were significantly lower for novice versus experienced residents (23.3 +/- 5.2 vs 34.1 +/- 6.0, t = .001). Mean global quality scores (out of 45) were similarly lower for novice residents (25.6 +/- 4.7 vs 35.9 +/- 7.6, t = .006). Interclass correlation coefficients were .98 (95% confidence interval, .96-.99) for structured assessment and .93 (95% confidence interval, .83-.97) for global quality scales. Cronbach alpha coefficients for internal consistency were .85 for structured assessment and .96 for global quality assessment scales. SAFE-OR shows significant construct validity, excellent interrater reliability, and high internal consistency. This tool will allow educators to objectively evaluate the quality of trainee operative reports and provide a mechanism for implementing, monitoring, and refining curriculum for dictation skills.

  10. Assessing the Equivalence of Paper, Mobile Phone, and Tablet Survey Responses at a Community Mental Health Center Using Equivalent Halves of a 'Gold-Standard' Depression Item Bank.

    Science.gov (United States)

    Brodey, Benjamin B; Gonzalez, Nicole L; Elkin, Kathryn Ann; Sasiela, W Jordan; Brodey, Inger S

    2017-09-06

    The computerized administration of self-report psychiatric diagnostic and outcomes assessments has risen in popularity. If results are similar enough across different administration modalities, then new administration technologies can be used interchangeably and the choice of technology can be based on other factors, such as convenience in the study design. An assessment based on item response theory (IRT), such as the Patient-Reported Outcomes Measurement Information System (PROMIS) depression item bank, offers new possibilities for assessing the effect of technology choice upon results. To create equivalent halves of the PROMIS depression item bank and to use these halves to compare survey responses and user satisfaction among administration modalities (paper, mobile phone, or tablet) with a community mental health care population. The 28 PROMIS depression items were divided into 2 halves based on content and simulations with an established PROMIS response data set. A total of 129 participants were recruited from an outpatient public sector mental health clinic based in Memphis. All participants took both nonoverlapping halves of the PROMIS IRT-based depression items (Part A and Part B): once using paper and pencil, and once using either a mobile phone or tablet. An 8-cell randomization was done on technology used, order of technologies used, and order of PROMIS Parts A and B. Both Parts A and B were administered as fixed-length assessments and both were scored using published PROMIS IRT parameters and algorithms. All 129 participants received either Part A or B via paper assessment. Participants were also administered the opposite assessment, 63 using a mobile phone and 66 using a tablet. There was no significant difference in item response scores for Part A versus B. All 3 of the technologies yielded essentially identical assessment results and equivalent satisfaction levels. Our findings show that the PROMIS depression assessment can be divided into 2 equivalent…

  11. Psychometric Validation of the World Health Organization Disability Assessment Schedule 2.0-Twelve-Item Version in Persons with Spinal Cord Injuries

    Science.gov (United States)

    Smedema, Susan Miller; Ruiz, Derek; Mohr, Michael J.

    2017-01-01

    Purpose: To evaluate the factorial and concurrent validity and internal consistency reliability of the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) 12-item version in persons with spinal cord injuries. Method: Two hundred forty-seven adults with spinal cord injuries completed an online survey consisting of the WHODAS…

  12. Development and testing of an assessment instrument for the formative peer review of significant event analyses.

    Science.gov (United States)

    McKay, J; Murphy, D J; Bowie, P; Schmuck, M-L; Lough, M; Eva, K W

    2007-04-01

    To establish the content validity and specific aspects of reliability for an assessment instrument designed to provide formative feedback to general practitioners (GPs) on the quality of their written analysis of a significant event. Content validity was quantified by application of a content validity index. Reliability testing involved a nested design, with 5 cells, each containing 4 assessors, rating 20 unique significant event analysis (SEA) reports (10 each from experienced GPs and GPs in training) using the assessment instrument. The variance attributable to each identified variable in the study was established by analysis of variance. Generalisability theory was then used to investigate the instrument's ability to discriminate among SEA reports. Content validity was demonstrated with at least 8 of 10 experts endorsing all 10 items of the assessment instrument. The overall G coefficient for the instrument was moderate to good (G>0.70), indicating that the instrument can provide consistent information on the standard achieved by the SEA report. There was moderate inter-rater reliability (G>0.60) when four raters were used to judge the quality of the SEA. This study provides the first steps towards validating an instrument that can provide educational feedback to GPs on their analysis of significant events. The key area identified to improve instrument reliability is variation among peer assessors in their assessment of SEA reports. Further validity and reliability testing should be carried out to provide GPs, their appraisers and contractual bodies with a validated feedback instrument on this aspect of the general practice quality agenda.
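
The abstract does not state how the content validity index was calculated. A common convention, assumed here, is the item-level index: the proportion of experts who endorse an item as relevant. The sketch below uses hypothetical function names and a hypothetical 0.8 cut-off chosen to mirror the "at least 8 of 10 experts" result.

```python
def item_cvi(endorsements):
    """Item-level content validity index: share of experts endorsing the item.

    endorsements -- one boolean per expert (True = item judged relevant)
    """
    if not endorsements:
        raise ValueError("at least one expert rating is required")
    return sum(1 for e in endorsements if e) / len(endorsements)


def items_meeting_threshold(ratings_by_item, threshold=0.8):
    """Names of items whose index reaches the (assumed) threshold."""
    return [
        name
        for name, endorsements in ratings_by_item.items()
        if item_cvi(endorsements) >= threshold
    ]
```

With ten experts, an item endorsed by eight of them has an index of 0.8 and would pass the assumed cut-off, while one endorsed by seven would not.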

  13. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    Science.gov (United States)

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  14. Translation and cross-cultural adaptation of the Detailed Assessment of Speed of Handwriting 17+ to Brazilian Portuguese: conceptual, item and semantic equivalence.

    Science.gov (United States)

    Cardoso, Monique Herrera; Capellini, Simone Aparecida

    2018-02-19

    Perform a cross-cultural adaptation of the Detailed Assessment of Speed of Handwriting 17+ (DASH 17+) for Brazilians. Evaluation of (1) conceptual, item and (2) semantic equivalence, with assistance of four translators and application of a pilot study to 36 students. (1) The concepts and items are equivalent in the British and Brazilian cultures. (2) Adaptations were made concerning the English language pangram used in copying tasks and selection of the lower-case, cursive handwriting in the alphabet-writing task. Application of the pilot study verified acceptability and understanding of the proposed tasks by the students. The Brazilian Portuguese version of the DASH 17+ was presented after finalization of the conceptual, item and semantic equivalence of the instrument. Further studies on psychometric properties should be conducted with the purpose of measuring the speed of handwriting in youngsters and adults with greater reliability and validity to the procedure.

  15. Modeling human intention formation for human reliability assessment

    International Nuclear Information System (INIS)

    Woods, D.D.; Roth, E.M.; Pople, H. Jr.

    1988-01-01

    This paper describes a dynamic simulation capability for modeling how people form intentions to act in nuclear power plant emergency situations. This modeling tool, Cognitive Environment Simulation or CES, was developed based on techniques from artificial intelligence. It simulates the cognitive processes that determine situation assessment and intention formation. It can be used to investigate analytically what situations and factors lead to intention failures, what actions follow from intention failures (e.g. errors of omission, errors of commission, common mode errors), the ability to recover from errors or additional machine failures, and the effects of changes in the NPP person-machine system. One application of the CES modeling environment is to enhance the measurement of the human contribution to risk in probabilistic risk assessment studies. (author)

  16. Creating a brief rating scale for the assessment of learning disabilities using reliability and true score estimates of the scale's items based on the Rasch model.

    Science.gov (United States)

    Sideridis, Georgios; Padeliadu, Susana

    2013-01-01

    The purpose of the present studies was to provide the means to create brief versions of instruments that can aid the diagnosis and classification of students with learning disabilities and comorbid disorders (e.g., attention-deficit/hyperactivity disorder). A sample of 1,108 students with and without a diagnosis of learning disabilities took part in study 1. Using information from modern theory methods (i.e., the Rasch model), a scale was created that included fewer than one third of the original battery items designed to assess reading skills. This best item synthesis was then evaluated for its predictive and criterion validity with a valid external reading battery (study 2). Using a sample of 232 students with and without learning disabilities, results indicated that the brief version of the scale was equally effective as the original scale in predicting reading achievement. Analysis of the content of the brief scale indicated that the best item synthesis involved items from cognition, motivation, strategy use, and advanced reading skills. It is suggested that multiple psychometric criteria be employed in evaluating the psychometric adequacy of scales used for the assessment and identification of learning disabilities and comorbid disorders.

  17. Study Protocol on Intentional Distortion in Personality Assessment: Relationship with Test Format, Culture, and Cognitive Ability.

    Science.gov (United States)

    Van Geert, Eline; Orhon, Altan; Cioca, Iulia A; Mamede, Rui; Golušin, Slobodan; Hubená, Barbora; Morillo, Daniel

    2016-01-01

    Self-report personality questionnaires, traditionally offered in a graded-scale format, are widely used in high-stakes contexts such as job selection. However, job applicants may intentionally distort their answers when filling in these questionnaires, undermining the validity of the test results. Forced-choice questionnaires are allegedly more resistant to intentional distortion compared to graded-scale questionnaires, but they generate ipsative data. Ipsativity violates the assumptions of classical test theory, distorting the reliability and construct validity of the scales, and producing interdependencies among the scores. This limitation is overcome in the current study by using the recently developed Thurstonian item response theory model. As online testing in job selection contexts is increasing, the focus will be on the impact of intentional distortion on personality questionnaire data collected online. The present study intends to examine the effect of three different variables on intentional distortion: (a) test format (graded-scale versus forced-choice); (b) culture, as data will be collected in three countries differing in their attitudes toward intentional distortion (the United Kingdom, Serbia, and Turkey); and (c) cognitive ability, as a possible predictor of the ability to choose the more desirable responses. Furthermore, we aim to integrate the findings using a comprehensive model of intentional distortion. In the Anticipated Results section, three main aspects are considered: (a) the limitations of the manipulation, theoretical approach, and analyses employed; (b) practical implications for job selection and for personality assessment in a broader sense; and (c) suggestions for further research.

  18. The system of indicators for regional cluster formation assessment

    Directory of Open Access Journals (Sweden)

    A. A. Mantsaeva

    2016-01-01

    The article presents the results of developing a system for assessing cluster formation, in which each indicator reflects a specific property of clusters: cooperation and efficiency. The completeness and depth of the system of indicators are ensured by a systematic approach and by representing both the quantitative and qualitative aspects of the cluster formation process. A feature of the technique is the use of indicators that require special accounting and make it possible to track a particular stage of cluster development. The system of indicators was tested on the example of the tourism industry, chosen, firstly, because of the high development rate of the tourist services sector compared with the branches of material production and, secondly, because of the increased interest of the country's leadership in establishing regional tourism and recreation clusters. Quantitative indicators of the formation and development of tourism and recreation clusters (geographic proximity of cluster member companies, the effectiveness of the sector for the regional economy, innovation activity, and exports of goods and services) are intended for the regions of the Southern and North Caucasian Federal Districts. The universality of the technique is ensured by its empirical base: official data from Rosstat and the Federal Agency for Tourism, as well as the results of mass opinion polls carried out in all regions of the country as part of the annual “Monitoring the quality of public and municipal services” (using material from the Republic of Kalmykia). In general, we believe that applying the developed system of indicators will help intensify and improve the quality of the cluster policy implemented by regional executive bodies and local authorities.

  19. "Straitjacket" or "Springboard for Sustainable Learning"? The Implications of Formative Assessment Practices in Vocational Learning Cultures

    Science.gov (United States)

    Davies, Jenifer; Ecclestone, Kathryn

    2008-01-01

    In contrast to theoretical and empirical insights from research into formative assessment in compulsory schooling, understanding the relationship between formative assessment, motivation and learning in vocational education has been a topic neglected by researchers. The Improving Formative Assessment project (IFA) addresses this gap, using a…

  20. Methodology for assessing thioarsenic formation potential in sulfidic landfill environments.

    Science.gov (United States)

    Zhang, Jianye; Kim, Hwidong; Townsend, Timothy

    2014-07-01

    Arsenic leaching and speciation in landfills, especially those with arsenic bearing waste and drywall disposal (such as construction and demolition (C&D) debris landfills), may be affected by high levels of sulfide through the formation of thioarsenic anions. A methodology using ion chromatography (IC) with a conductivity detector was developed for the assessment of thioarsenic formation potential in sulfidic landfill environments. Monothioarsenate (H2AsSO3(-)) and dithioarsenate (H2AsS2O2(-)) were confirmed in the IC fractions of thioarsenate synthesis mixture, consistent with previous literature results. However, the observation of AsSx(-) (x=5-8) in the supposed trithioarsenate (H2AsS3O(-)) and tetrathioarsenate (H2AsS4(-)) IC fractions suggested the presence of new arsenic polysulfide complexes. All thioarsenate anions, particularly trithioarsenate and tetrathioarsenate, were unstable upon air exposure. The method developed for thioarsenate analysis was validated and successfully used to analyze several landfill leachate samples. Thioarsenate anions were detected in the leachate of all of the C&D debris landfills tested, which accounted for approximately 8.5% of the total aqueous As in the leachate. Compared to arsenite or arsenate, thioarsenates have been reported in literature to have lower adsorption on iron oxide minerals. The presence of thioarsenates in C&D debris landfill leachate poses new concerns when evaluating the impact of arsenic mobilization in such environments. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. The development of formative assessment probes for optics education

    Science.gov (United States)

    Dokter, Erin F. C.; Pompea, Stephen M.; Sparks, Robert T.; Walker, Constance E.

    2010-08-01

    Research exploring students' knowledge of optics from elementary through college has revealed that many concepts can be difficult for students to grasp. This can be the case particularly with fundamental concepts, such as the nature of light, how light interacts with matter, and how light behaves in optical systems. The use of formative assessment probes (low-stakes questions posed to students before instruction or in real-time in the classroom) can inform instructors about student background knowledge, and can also be used as they progress through learning in class. By understanding what students know prior to instruction, and how well they are learning in real-time, instruction can be designed and modified in order to encourage the development of scientifically-accurate knowledge.

  2. Validity and reliability of the TED-QOL: a new three-item questionnaire to assess quality of life in thyroid eye disease.

    Science.gov (United States)

    Fayers, Tessa; Dolman, Peter J

    2011-12-01

    To develop and test a user-friendly questionnaire for rapidly assessing quality of life (QOL) in thyroid eye disease (TED). A three-item questionnaire, the TED-QOL, was designed and compared to the 16-item Graves Ophthalmopathy (GO)-QOL and the nine-item GO-Quality of Life Scale (QLS). 100 patients with TED were administered all three questionnaires on two occasions. Results were compared to clinical severity scores (Vision, Inflammation, Strabismus, Appearance (VISA) classification). Main outcomes were construct and criterion validity, test-retest reliability, duration, comprehension and completion rates. TED-QOL correlated strongly with the other questionnaires for corresponding items (Pearson correlation: appearance 0.71, 0.62; functioning 0.69, 0.66; overall QOL 0.53). Test-retest analysis demonstrated good reliability for all three questionnaires (intraclass correlations: TED-QOL 0.81, 0.74, 0.87; GO-QOL 0.81, 0.82; GO-QLS 0.74, 0.86, 0.67). TED-QOL was significantly faster to complete (1.6 min vs GO-QOL 3.1 min, GO-QLS 2.7 min, p<0.0001) and had a higher completion rate (100% vs GO-QOL 78%, GO-QLS 94%). There was only moderate correlation between items on all three questionnaires and VISA scores. The TED-QOL is rapid and easy to complete and analyse and has similar validity and reliability to longer questionnaires. All questionnaires showed only moderate correlation with disease severity, emphasising the discrepancy between objective and subjective assessments and the importance of measuring both.

  3. Exploratory factor analysis of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale in people newly diagnosed with advanced cancer.

    Science.gov (United States)

    Bai, Mei; Dixon, Jane K

    2014-01-01

    The purpose of this study was to reexamine the factor pattern of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale (FACIT-Sp-12) using exploratory factor analysis in people newly diagnosed with advanced cancer. Principal components analysis (PCA) and 3 common factor analysis methods were used to explore the factor pattern of the FACIT-Sp-12. Factorial validity was assessed in association with quality of life (QOL). Principal factor analysis (PFA), iterative PFA, and maximum likelihood suggested retrieving 3 factors: Peace, Meaning, and Faith. Both Peace and Meaning positively related to QOL, whereas only Peace uniquely contributed to QOL. This study supported the 3-factor model of the FACIT-Sp-12. Suggestions for revision of items and further validation of the identified factor pattern were provided.

  4. Promoting Creativity through Assessment: A Formative Computer-Assisted Assessment Tool for Teachers

    Science.gov (United States)

    Cropley, David; Cropley, Arthur

    2016-01-01

    Computer-assisted assessment (CAA) is problematic when it comes to fostering creativity, because in educational thinking the essence of creativity is not finding the correct answer but generating novelty. The idea of "functional" creativity provides rubrics that can serve as the basis for forms of CAA leading to either formative or…

  5. A study on the establishment of safety assessment guidelines of commercial grade item dedication in digitalized safety systems

    International Nuclear Information System (INIS)

    Hwang, H. S.; Kim, B. R.; Oh, S. H.

    1999-01-01

    Because components used in the safety-related systems of nuclear power plants are becoming obsolete, the number of suppliers qualified under the nuclear QA program is decreasing, and maintenance costs are increasing, utilities have been considering commercial grade digital computers as an alternative. However, commercial digital computers use embedded pre-existing software, including operating system software, that was not developed under a nuclear grade QA program. Thus, it is necessary for utilities to establish processes for dedicating digital commercial grade items. A regulatory body also needs guidance to evaluate digital commercial products properly. This paper surveys the regulations and regulatory guides that establish the requirements for commercial grade item dedication, as well as industry standards and guidance applicable to safety-related systems. It provides some guidelines to be applied in evaluating the safety of digital upgrades and new digital plant protection systems in Korea.

  6. Comparing the Effects of Different Smoothing Algorithms on the Assessment of Dimensionality of Ordered Categorical Items with Parallel Analysis.

    Science.gov (United States)

    Debelak, Rudolf; Tran, Ulrich S

    2016-01-01

    Principal component analysis and exploratory factor analysis of polychoric correlations are well-known approaches to determining the dimensionality of ordered categorical items. However, the application of these approaches has been considered problematic because the polychoric correlation matrix may be indefinite. A possible solution to this problem is the application of smoothing algorithms. This study compared the effects of three smoothing algorithms, based on the Frobenius norm, the adaptation of the eigenvalues and eigenvectors, and minimum-trace factor analysis, on the accuracy of several variants of parallel analysis by means of a simulation study. We simulated different datasets which varied with respect to the size of the respondent sample, the size of the item set, the underlying factor model, the skewness of the response distributions and the number of response categories in each item. We found that a parallel analysis and principal component analysis of smoothed polychoric and Pearson correlations led to the most accurate results in detecting the number of major factors in simulated datasets when compared to the other methods we investigated. Of the methods used for smoothing polychoric correlation matrices, we recommend the algorithm based on minimum trace factor analysis.

  7. The Development of a Formative and a Reflective Scale for the Assessment of On-Line Store Usability

    Directory of Open Access Journals (Sweden)

    Timo Christophersen

    2008-10-01

    In usability research, the difference between formative and reflective measurement models for the assessment of latent variables has largely been ignored. As a consequence, many usability scales are misspecified. This might result in reduced scale validity because important usability facets are eliminated during scale development. The aim of the current study was to develop a questionnaire for the evaluation of on-line store usability (UFOS-V2) that includes both a formative and a reflective scale. 378 subjects participated in a laboratory experimental study. Each participant visited two out of 35 on-line stores. Usability and intention to buy were assessed for both stores. In addition, actual purchase behaviour was observed by combining the subjects' reward with the decision to buy. In a two-construct PLS structural equation model the formative usability scale was used as a predictor for the reflective usability measure. Results indicate that the formative usability scale UFOS-V2f forms a valid set of items for the user-based assessment of online store usability. The reflective usability scale shows high internal consistency. Positive relationships to intention and decision to buy confirm high scale validity.

  8. Teacher coaching supported by formative assessment for improving classroom practices.

    Science.gov (United States)

    Fabiano, Gregory A; Reddy, Linda A; Dudek, Christopher M

    2018-06-01

    The present study is a wait-list controlled, randomized study investigating a teacher coaching approach that emphasizes formative assessment and visual performance feedback to enhance elementary school teachers' classroom practices. The coaching model targeted instructional and behavioral management practices as measured by the Classroom Strategies Assessment System (CSAS) Observer and Teacher Forms. The sample included 89 general education teachers, stratified by grade level, and randomly assigned to 1 of 2 conditions: (a) immediate coaching, or (b) waitlist control. Results indicated that, relative to the waitlist control, teachers in immediate coaching demonstrated significantly greater improvements in observations of behavior management strategy use but not for observations of instructional strategy use. Observer- and teacher-completed ratings of behavioral management strategy use at postassessment were significantly improved by both raters; ratings of instructional strategy use were significantly improved for teacher but not observer ratings. A brief coaching intervention improved teachers' use of observed behavior management strategies and self-reported use of behavior management and instructional strategies. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  9. The Differences among Three-, Four-, and Five-Option-Item Formats in the Context of a High-Stakes English-Language Listening Test

    Science.gov (United States)

    Lee, HyeSun; Winke, Paula

    2013-01-01

    We adapted three practice College Scholastic Ability Tests (CSAT) of English listening, each with five-option items, to create four- and three-option versions by asking 73 Korean speakers or learners of English to eliminate the least plausible options in two rounds. Two hundred and sixty-four Korean high school English-language learners formed…

  10. A Comparison of Item Fit Statistics for Mixed IRT Models

    Science.gov (United States)

    Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B.

    2010-01-01

    In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G[superscript 2], Orlando and Thissen's S-X[superscript 2] and S-G[superscript 2], and Stone's chi[superscript 2*] and G[superscript 2*]. To investigate the…

  11. ITEM LEVEL DIAGNOSTICS AND MODEL - DATA FIT IN ITEM ...

    African Journals Online (AJOL)

    Global Journal

    Item response theory (IRT) is a framework for modeling and analyzing item response ... data. Though, there is an argument that the evaluation of fit in IRT modeling has been ... National Council on Measurement in Education ... model data fit should be based on three types of ... prediction should be assessed through the.

  12. Item Response Data Analysis Using Stata Item Response Theory Package

    Science.gov (United States)

    Yang, Ji Seung; Zheng, Xiaying

    2018-01-01

    The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

  13. Best practices in academic assessment in higher education: A case in formative and shared assessment

    Directory of Open Access Journals (Sweden)

    Victor Manuel López Pastor

    2011-09-01

    The aim of this article is three-fold: (a) to present an example of best practices in formative assessment in university instruction, offering three different methods of learning and assessment to pass a subject; (b) to analyze differences in academic performance depending on the method of learning and assessment chosen; (c) to consider professors' and students' evaluation of these assessment methods, as well as analyze the workload these methods entail for both students and professors. The design is based on a single case study. The study analyzes the results obtained in a third-year course at the University of Valladolid (Spain) that participated in an ECTS pilot program. Data were collected during academic year 2009-10. The total number of registered students was 77. This paper describes the procedure to develop a formative assessment system and collect data, as well as the main techniques to obtain and analyze data. Findings indicate that there are important differences in student academic performance depending on the learning and assessment method employed in an academic course. Courses using formative and ongoing assessment result in significantly higher student academic performance than courses using other learning and assessment methods. Lastly, empirical data suggest that the workload is in line with the European Credit Transfer System (ECTS) and is not excessive for the professor. However, students' subjective perception is that this method involves a heavier workload. These findings may be important, given the current process of convergence towards the new Degrees and ECTS credit system, and the need to adapt these degrees and credits to students’ real workload.

  14. Internal and External Factors Affecting Teachers' Adoption of Formative Assessment to Support Learning

    Science.gov (United States)

    Izci, Kemal

    2016-01-01

    Assessment forms an important part of instruction. Assessment that aims to support learning is known as formative assessment, and it contributes to students' learning gains and motivation. However, teachers rarely use assessment formatively to aid their students' learning. Thus reviewing the factors that limit or support teachers' practices of…

  15. The Role of Content and Context in PISA Interest Scales: A study of the embedded interest items in the PISA 2006 science assessment

    Science.gov (United States)

    Drechsel, Barbara; Carstensen, Claus; Prenzel, Manfred

    2011-01-01

    This paper focuses on interest in science as one of the attitudinal aspects of scientific literacy. Large-scale data from the Programme for International Student Assessment (PISA) 2006 are analysed in order to describe student interest more precisely. So far the analyses have provided a general indicator of interest, aggregated over all contexts and contents in the science test. With its innovative approach PISA embeds interest items within the cognitive test unit and its contents and contexts. The main difference from conventional interest measures is that in most questionnaires, a relatively small number of interest items cover broad fields of contents and contexts. The science units represent a number of systematically differentiated scientific contexts and contents. The units' stimulus texts allow for concrete descriptions of relevant content aspects, applications, and contexts. In the analyses, multidimensional item response models are applied in order to disentangle student interest. The results indicate that multidimensional models fit the data. A two-dimensional model separating interest into two different knowledge of science dimensions described in the PISA science framework is further analysed with respect to gender, performance differences, and country. The findings give a comprehensive description of students' interest in science. The paper deals with methodological problems and describes requirements of the test construction for further assessments. The results are discussed with regard to their significance for science education.

  16. An Arrangement of the Items Influencing Assessment of the Electrotechnical Technology Course / PROEJA, campuses Campos Centro and Itaperuna: The Learners’ View

    Directory of Open Access Journals (Sweden)

    Jorge Luíz Clemente Gomes

    2016-04-01

    This work aims to organize pre-defined items that affect students’ answers when assessing the Electrotechnical Technology Course / PROEJA. The research was carried out from October 2011 to December 2012 with questionnaires administered to 1st to 6th period students. At campus Campos Centro, “Technical Visits” and “Internship” presented high levels of importance and low satisfaction, while “Personal Realization” and “Professional Achievement” presented high levels of relevance and satisfaction. At campus Itaperuna, “Job opportunities” and “Professional Achievement” presented high levels of relevance and satisfaction. The items “Faculty” and “New Technologies” presented high importance but low satisfaction. The research aims at improving the quality of the course.

  17. Instemmingsgeneigdheid en verskillende item- en responsformate in 'n gesommeerde selfbeoordelingskaal

    Directory of Open Access Journals (Sweden)

    Nadene Hanekom

    1998-06-01

    This study examines the degree of acquiescence present when the item and response formats of a summated rating scale are varied. It is often recommended that acquiescence response bias in rating scales may be controlled by using both positively and negatively worded items. Such items are generally worded in the Likert-type format of statements. The purpose of the study was to establish whether items in question format would result in a smaller degree of acquiescence than items worded as statements. The response format was also varied (five- and seven-point options) to determine whether this would influence the reliability and degree of acquiescence in the scales. A twenty-item Locus of Control (LC) questionnaire was used, but each item was complemented by its opposite, resulting in 40 items. The subjects, divided randomly into two groups, were second year students who had to complete four versions of the questionnaire, plus a shortened version of Bass's scale for measuring acquiescence. The LC versions were questions or statements, each combined with a five- or seven-point response format. Partial counterbalancing was introduced by testing on two separate occasions, presenting the tests to the two groups in the opposite order. The degree of acquiescence was assessed by correlating the items with their opposites, and by correlating scores on each version with scores on the acquiescence questionnaire. No major differences were found between the various item and response formats in relation to acquiescence. Summary: This investigation was carried out to determine whether the degree of acquiescence is influenced by the item and response format of a summated self-rating scale. It is often recommended that the use of both positively and negatively worded items in a questionnaire limits acquiescence. Such items are usually formulated as statements in the traditional Likert format. The purpose of the investigation was to determine whether items…
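
    The item-opposite check described in this abstract can be sketched as a Pearson correlation between responses to an item and responses to its reversed counterpart. The response vectors below are hypothetical and purely illustrative; they are not data from the study, and the function name is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical five-point responses of five respondents to one item.
item = [1, 2, 3, 4, 5]
# Content-driven answers to the oppositely worded item: agreement flips.
opposite_consistent = [5, 4, 3, 2, 1]
# Acquiescent answers: respondents tend to agree with both wordings.
opposite_acquiescent = [2, 2, 4, 4, 5]

print(pearson_r(item, opposite_consistent))   # strongly negative
print(pearson_r(item, opposite_acquiescent))  # positive
```

    Under this reading, a strongly negative item-opposite correlation suggests content-driven responding, while a correlation drifting towards positive values suggests agreement regardless of wording.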

  18. STATE POLICY FUNDAMENTALS IN FORMATION OF A NATIONAL STANDARD OF "GREEN CONSTRUCTION" FOR ASSESSMENT OF ITEMS OF REAL PROPERTY

    Directory of Open Access Journals (Sweden)

    Kolchigin Mikhail Aleksandrovich

    2012-12-01

    The authors analyze the problem of implementation of principles of "green construction" in the Russian Federation. Despite the availability of the appropriate legislation in the field of environmental safety of construction, there are no legal, social, or economic incentives that may boost development of "green" technologies. Until recently, fundamentals of the state policy in the field of environmental protection of real estate development have not succeeded in motivating market players to implement advanced green technologies. However, recently, the government has begun motivating the construction industry towards the use of "green" technologies. The first activity is aimed at improving the legislation and updating the international voluntary certification according to BREEAM and LEED standards. The result is the acceptance of the National Green Building Standard for real estate valuation that will open up new opportunities and prospects to the participants of the construction market. However, at the initial phase of implementation of "Fundamentals of the State Policy in the Field of Environmental Development of the Russian Federation", government authorities should provide their support to proponents of green buildings, including financial inflows.

  19. Assessing the factor structures of the 55- and 22-item versions of the conformity to masculine norms inventory.

    Science.gov (United States)

    Owen, Jesse

    2011-03-01

    The current study examined the psychometric properties of the abbreviated versions, 55- and 22-items, of the Conformity to Masculine Norms Inventory (CMNI). The authors tested the factor structure for the 11 subscales of the CMNI-55 and the global masculinity factor for the CMNI-55 and the CMNI-22. In a clinical sample of men and women (n=522), the results supported the 11-factor model. Furthermore, the factor structure was invariant for men and women. The higher order model, which tested the utility of the global masculine score, demonstrated marginal fit. The factor structures for the global masculinity score for the CMNI-22 demonstrated poor fit. Collectively, the results suggest that the CMNI-55 is better represented in a multidimensional construct. The subscales' alpha levels and factor loadings were, generally, within acceptable limits. Gender and ethnic mean level differences are also reported. © The Author(s) 2011

  20. Why sample selection matters in exploratory factor analysis: implications for the 12-item World Health Organization Disability Assessment Schedule 2.0.

    Science.gov (United States)

    Gaskin, Cadeyrn J; Lambert, Sylvie D; Bowe, Steven J; Orellana, Liliana

    2017-03-11

    Sample selection can substantially affect the solutions generated using exploratory factor analysis. Validation studies of the 12-item World Health Organization (WHO) Disability Assessment Schedule 2.0 (WHODAS 2.0) have generally involved samples in which substantial proportions of people had no, or minimal, disability. With the WHODAS 2.0 oriented towards measuring disability across six life domains (cognition, mobility, self-care, getting along, life activities, and participation in society), performing factor analysis with samples of people with disability may be more appropriate. We determined the influence of the sampling strategy on (a) the number of factors extracted and (b) the factor structure of the WHODAS 2.0. Using data from adults aged 50+ from the six countries in Wave 1 of the WHO's longitudinal Study on global AGEing and adult health (SAGE), we repeatedly selected samples (n = 750) using two strategies: (1) simple random sampling that reproduced nationally representative distributions of WHODAS 2.0 summary scores for each country (i.e., positively skewed distributions with many zero scores indicating the absence of disability), and (2) stratified random sampling with weights designed to obtain approximately symmetric distributions of summary scores for each country (i.e. predominantly including people with varying degrees of disability). Samples with skewed distributions typically produced one-factor solutions, except for the two countries with the lowest percentages of zero scores, in which the majority of samples produced two factors. Samples with approximately symmetric distributions, generally produced two- or three-factor solutions. In the two-factor solutions, the getting along domain items loaded on one factor (commonly with a cognition domain item), with remaining items loading on a second factor. In the three-factor solutions, the getting along and self-care domain items loaded separately on two factors and three other domains

  1. Refining Inquiry with Multi-Form Assessment: Formative and summative assessment functions for flexible inquiry

    Science.gov (United States)

    Zuiker, Steven; Reid Whitaker, J.

    2014-04-01

    This paper describes the 5E+I/A inquiry model and reports a case study of one curricular enactment by a US fifth-grade classroom. A literature review establishes the model's conceptual adequacy with respect to longstanding research related to both the 5E inquiry model and multiple, incremental innovations of it. As a collective line of research, the review highlights a common emphasis on formative assessment, at times coupled either with differentiated instruction strategies or with activities that target the generalization of learning. The 5E+I/A model contributes a multi-level assessment strategy that balances formative and summative functions of multiple forms of assessment in order to support classroom participation while still attending to individual achievement. The case report documents the enactment of a weeklong 5E+I/A curricular design as a preliminary account of the model's empirical adequacy. A descriptive and analytical narrative illustrates variable ways that multi-level assessment makes student thinking visible and pedagogical decision-making more powerful. In light of both, it also documents productive adaptations to a flexible curricular design and considers future research to advance this collective line of inquiry.

  2. Investigating Computer-Based Formative Assessments in a Medical Terminology Course

    Science.gov (United States)

    Wilbanks, Jammie T.

    2012-01-01

    Research has been conducted on the effectiveness of formative assessments and on effectively teaching medical terminology; however, research had not been conducted on the use of formative assessments in a medical terminology course. A quantitative study was performed which captured data from a pretest, self-assessment, four module exams, and a…

  3. Formative Assessment as a Vehicle for Changing Classroom Practice in a Specific Cultural Context

    Science.gov (United States)

    Chen, Jingping

    2015-01-01

    In this commentary, I interpret Xinying Yin and Gayle Ann Buck's collaborative action research from a social-cultural perspective. Classroom implementation of formative assessment is viewed as interaction between this assessment method and the local learning culture. I first identify Yin and Buck's definition of the formative assessment, and then…

  4. Interpretations of Formative Assessment in the Teaching of English at Two Chinese Universities: A Sociocultural Perspective

    Science.gov (United States)

    Chen, Qiuxian; Kettle, Margaret; Klenowski, Val; May, Lyn

    2013-01-01

    Formative assessment is increasingly being implemented through policy initiatives in Chinese educational contexts. As an approach to assessment, formative assessment derives many of its key principles from Western contexts, notably through the work of scholars in the UK, the USA and Australia. The question for this paper is the ways that formative…

  5. A compliance assessment of midpoint formative assessments completed by APPE preceptors.

    Science.gov (United States)

    Lea Bonner, C; Staton, April G; Naro, Patricia B; McCullough, Elizabeth; Lynn Stevenson, T; Williamson, Margaret; Sheffield, Melody C; Miller, Mindi; Fetterman, James W; Fan, Shirley; Momary, Kathryn M

    Experiential pharmacy preceptors should provide formative and summative feedback during a learning experience. Preceptors are required to provide colleges and schools of pharmacy with assessments or evaluations of students' performance. Students and experiential programs value on-time completion of midpoint evaluations by preceptors. The objective of this study was to determine the number of on-time electronically documented formative midpoint evaluations completed by preceptors during advanced pharmacy practice experiences (APPEs). Compliance rates of on-time electronically documented formative midpoint evaluations were reviewed by the Office of Experiential Education of a five-member consortium during the two-year study period prior to the adoption of Standards 2016. Pearson chi-square test and generalized linear models were used to determine if statistically significant differences were present. Average midpoint compliance rates for the two-year research period were 40.7% and 41% respectively. No statistical significance was noted comparing compliance rates for year one versus year two. However, statistical significance was present when comparing compliance rates between schools during year two. Feedback from students and preceptors pointed to the need for brief formal midpoint evaluations that require minimal time to complete, user friendly experiential management software, and methods for documenting verbal feedback through student self-reflection. Additional education and training to both affiliate and faculty preceptors on the importance of written formative feedback at midpoint is critical to remaining in compliance with Standards 2016. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Formative assessment practices in Bhutanese secondary schools and its impact on Quality of Education

    DEFF Research Database (Denmark)

    Utha, Karma

    Using a case study approach, the dissertation presents the notions and practices of formative assessment in Bhutanese secondary schools. It includes the teachers’ understanding of the practice of student-centered teaching and learning, which is regarded as a precondition for effective formative...... assessment. It also takes account of those features of formative assessment which are much more favored by students and teachers in the case study schools....

  7. Connecting Lines of Research on Task Model Variables, Automatic Item Generation, and Learning Progressions in Game-Based Assessment

    Science.gov (United States)

    Graf, Edith Aurora

    2014-01-01

    In "How Task Features Impact Evidence from Assessments Embedded in Simulations and Games," Almond, Kim, Velasquez, and Shute have prepared a thought-provoking piece contrasting the roles of task model variables in a traditional assessment of mathematics word problems to their roles in "Newton's Playground," a game designed…

  8. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  9. Reliability and construct validity of the Spanish version of the 6-item CTS symptoms scale for outcomes assessment in carpal tunnel syndrome.

    Science.gov (United States)

    Rosales, Roberto S; Martin-Hidalgo, Yolanda; Reboso-Morales, Luis; Atroshi, Isam

    2016-03-03

    The purpose of this study was to assess the reliability and construct validity of the Spanish version of the 6-item carpal tunnel syndrome (CTS) symptoms scale (CTS-6). In this cross-sectional study, 40 patients diagnosed with CTS based on clinical and neurophysiologic criteria completed the standard Spanish versions of the CTS-6 and the disabilities of the arm, shoulder and hand (QuickDASH) scales on two occasions with a 1-week interval. Internal-consistency reliability was assessed with the Cronbach alpha coefficient and test-retest reliability with the intraclass correlation coefficient, two-way random effects model and absolute agreement definition (ICC2,1). Cross-sectional precision was analyzed with the Standard Error of the Measurement (SEM). Longitudinal precision for the test-retest reliability coefficient was assessed with the Standard Error of the Measurement difference (SEMdiff) and the Minimal Detectable Change at 95 % confidence level (MDC95). For assessing construct validity it was hypothesized that the CTS-6 would have a strong positive correlation with the QuickDASH, analyzed with the Pearson correlation coefficient (r). The standard Spanish version of the CTS-6 presented a Cronbach alpha of 0.81 with a SEM of 0.3. Test-retest reliability showed an ICC of 0.85 with a SEMdiff of 0.36 and a MDC95 of 0.7. The correlation between CTS-6 and the QuickDASH was concordant with the a priori formulated construct hypothesis (r = 0.69). Conclusions: The standard Spanish version of the 6-item CTS symptoms scale showed good internal consistency, test-retest reliability and construct validity for outcomes assessment in CTS. The CTS-6 will be useful to clinicians and researchers in Spanish speaking parts of the world. The use of standardized outcome measures across countries also will facilitate comparison of research results in carpal tunnel syndrome.
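
    The precision indices named in this abstract are commonly related by SEM = SD × sqrt(1 − reliability), SEMdiff = sqrt(2) × SEM, and MDC95 = 1.96 × SEMdiff. The sketch below assumes these conventional formulas; the abstract does not state which formulas the authors used, and the function names are illustrative. Only the last step is checked against the reported values, because the score standard deviation is not given.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)

def sem_diff(sem_value):
    """Standard error of the difference between two measurements."""
    return math.sqrt(2.0) * sem_value

def mdc95(sem_diff_value):
    """Minimal detectable change at the 95 % confidence level."""
    return 1.96 * sem_diff_value

# Reported in the abstract: SEMdiff = 0.36 and MDC95 = 0.7.
reported_sem_diff = 0.36
print(round(mdc95(reported_sem_diff), 1))  # 0.7
```

    Under these assumed formulas, the reported SEMdiff of 0.36 reproduces the reported MDC95 of 0.7 after rounding.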

  10. Diagnostic Value of Subjective Memory Complaints Assessed with a Single Item in Dominantly Inherited Alzheimer’s Disease: Results of the DIAN Study

    Directory of Open Access Journals (Sweden)

    Christoph Laske

    2015-01-01

    Full Text Available Objective. We examined the diagnostic value of subjective memory complaints (SMCs) assessed with a single item in a large cross-sectional cohort consisting of families with autosomal dominant Alzheimer’s disease (ADAD) participating in the Dominantly Inherited Alzheimer Network (DIAN). Methods. The baseline sample of 183 mutation carriers (MCs) and 117 noncarriers (NCs) was divided according to Clinical Dementia Rating (CDR) scale into preclinical (CDR 0; MCs: n=107; NCs: n=109), early symptomatic (CDR 0.5; MCs: n=48; NCs: n=8), and dementia stage (CDR ≥ 1; MCs: n=28; NCs: n=0). These groups were subdivided by the presence or absence of SMCs. Results. At CDR 0, SMCs were present in 12.1% of MCs and 9.2% of NCs (P=0.6). At CDR 0.5, SMCs were present in 66.7% of MCs and 62.5% of NCs (P=1.0). At CDR ≥ 1, SMCs were present in 96.4% of MCs. SMCs in MCs were significantly associated with CDR, logical memory scores, Geriatric Depression Scale, education, and estimated years to onset. Conclusions. The present study shows that SMCs assessed by a single-item scale have no diagnostic value to identify preclinical ADAD in asymptomatic individuals. These results demonstrate the need of further improvement of SMC measures that should be examined in large clinical trials.

  11. Evaluation of a Multiple-Stimulus Presentation Format for Assessing Reinforcer Preferences.

    Science.gov (United States)

    DeLeon, Iser G.; Iwata, Brian A.

    1996-01-01

    A study of seven adults with profound developmental disabilities compared methods for presenting stimuli during reinforcer-preference assessments. It found that a multiple-stimulus format in which selections were made without replacement may share the advantages of a paired-stimulus format and a multiple-stimulus format with replacement, while…

  12. Empirical assessment of loyalty drivers using consumers’ retail format choice

    Directory of Open Access Journals (Sweden)

    Gindi, A.A.

    2017-05-01

    Full Text Available Using the Stimulus–Organism–Response (S-O-R) framework, this study examines Stimulus–Response relationships of fresh vegetable consumers’ behavior in Klang Valley, Malaysia. In particular, the study focused on how loyalty drivers affect retail format choice by fresh vegetable (FV) consumers. The Stimuli that pertain to loyalty drivers include promotional activities, perceived price and social interaction, and the Response is the retail format choice. Three hypotheses were developed and tested with data collected from a survey using a simple random sampling technique. A Structural Equation Model (SEM) was used in analyzing the data. Results of the study revealed that Stimuli (loyalty drivers) influence Response (retail format choice) for the different FV markets in Malaysia. Based on the findings of the research, Malaysian retailers have different marketing strategies to consider with regard to loyalty drivers.

  13. An online formative assessment tool to prepare students for summative assessment in physiology

    Directory of Open Access Journals (Sweden)

    Samantha Kerr

    2016-05-01

    Full Text Available Background. The didactic approach to teaching physiology in our university has traditionally included the delivery of lectures to large groups, illustrating concepts and referencing recommended textbooks. Importantly, at undergraduate level, our assessments demand a level of application of physiological mechanisms to recognised pathophysiological conditions. Objective. To bridge the gap between lectured material and the application of physiological concepts to pathophysiological conditions, we developed a technological tool approach that augments traditional teaching. Methods. Our e-learning initiative, eQuip, is a custom-built e-learning platform specifically created to align question types included in the program to be similar to those used in current assessments. We describe our formative e-learning system and present preliminary results after the first year of introduction, reporting on the performances and perceptions of 2nd-year physiology students. Results. Students who made use of eQuip for at least three of the teaching blocks achieved significantly better results than those who did not use the program (p=0.0032). Questionnaire feedback was positive with regard to the administration processes and usefulness of eQuip. Students reported particularly liking the ease of access to information; however, <60% of them felt that eQuip motivated them to learn. Conclusion. These results are consistent with the literature, which shows that students who made use of an online formative assessment tool performed better in summative assessment tasks. Despite the improved performance of students, the questionnaire results showed that student motives for using online learning tools indicated that they lack self-directed learning skills and seek easy access to information.

  14. A comparative study on assessment procedures and metric properties of two scoring systems of the Coma Recovery Scale-Revised items: standard and modified scores.

    Science.gov (United States)

    Sattin, Davide; Lovaglio, Piergiorgio; Brenna, Greta; Covelli, Venusia; Rossi Sebastiano, Davide; Duran, Dunja; Minati, Ludovico; Giovannetti, Ambra Mara; Rosazza, Cristina; Bersano, Anna; Nigri, Anna; Ferraro, Stefania; Leonardi, Matilde

    2017-09-01

    The study compared the metric characteristics (discriminant capacity and factorial structure) of two different methods for scoring the items of the Coma Recovery Scale-Revised and it analysed scale scores collected using the standard assessment procedure and a new proposed method. Cross sectional design/methodological study. Inpatient, neurological unit. A total of 153 patients with disorders of consciousness were consecutively enrolled between 2011 and 2013. All patients were assessed with the Coma Recovery Scale-Revised using standard (rater 1) and inverted (rater 2) procedures. Coma Recovery Scale-Revised score, number of cognitive and reflex behaviours and diagnosis. Regarding patient assessment, rater 1 using standard and rater 2 using inverted procedures obtained the same best scores for each subscale of the Coma Recovery Scale-Revised for all patients, so no clinical (and statistical) difference was found between the two procedures. In 11 patients (7.7%), rater 2 noted that some Coma Recovery Scale-Revised codified behavioural responses were not found during assessment, although higher response categories were present. A total of 51 (36%) patients presented the same Coma Recovery Scale-Revised scores of 7 or 8 using a standard score, whereas no overlap was found using the modified score. Unidimensionality was confirmed for both score systems. The Coma Recovery Scale Modified Score showed a higher discriminant capacity than the standard score and a monofactorial structure was also supported. The inverted assessment procedure could be a useful evaluation method for the assessment of patients with disorder of consciousness diagnosis.

  15. Computer assisted formative assessment: supporting students to become more reflective learners

    OpenAIRE

    Whitelock, Denise M.

    2007-01-01

    e-Assessment is being advocated in the UK as our way of introducing a more personalised learning agenda throughout the Higher Education sector. This paper discusses the findings from two projects where formative e-assessment has contributed to students taking more control of their own learning. One study set out to provide further insights into the role of electronic formative assessment and to point the way forward to new assessment practices, capitalising on a range of open source tools. Th...

  16. The Technical Quality of Test Items Generated Using a Systematic Approach to Item Writing.

    Science.gov (United States)

    Siskind, Theresa G.; Anderson, Lorin W.

    The study was designed to examine the similarity of response options generated by different item writers using a systematic approach to item writing. The similarity of response options to student responses for the same item stems presented in an open-ended format was also examined. A non-systematic (subject matter expertise) approach and a…

  17. If I Experience Formative Assessment Whilst Studying at University, Will I Put It into Practice Later as a Teacher? Formative and Shared Assessment in Initial Teacher Education (ITE)

    Science.gov (United States)

    Hamodi, Carolina; López-Pastor, Víctor Manuel; López-Pastor, Ana Teresa

    2017-01-01

    The aim of this article is to analyse whether having experience of formative assessment during their initial teacher education courses (ITE) influences graduates' subsequent practice as teachers. That is, if the assessment methods that university students are subject to during their learning process are then actually employed by them during their…

  18. Exploring Different Types of Assessment Items to Measure Linguistically Diverse Students' Understanding of Energy and Matter in Chemistry

    Science.gov (United States)

    Ryoo, Kihyun; Toutkoushian, Emily; Bedell, Kristin

    2018-01-01

    Energy and matter are fundamental, yet challenging concepts in middle school chemistry due to their abstract, unobservable nature. Although it is important for science teachers to elicit a range of students' ideas to design and revise their instruction, capturing such varied ideas using traditional assessments consisting of multiple-choice items…

  19. The Meaning of Goodness-of-Fit Tests: Commentary on "Goodness-of-Fit Assessment of Item Response Theory Models"

    Science.gov (United States)

    Thissen, David

    2013-01-01

    In this commentary, David Thissen states that "Goodness-of-fit assessment for IRT models is maturing; it has come a long way from zero." Thissen then references prior works on "goodness of fit" in the index of Lord and Novick's (1968) classic text; Yen (1984); Drasgow, Levine, Tsien, Williams, and Mead (1995); Chen and…

  20. North Star Ambulatory Assessment, 6-minute walk test and timed items in ambulant boys with Duchenne muscular dystrophy.

    Science.gov (United States)

    Mazzone, Elena; Martinelli, Diego; Berardinelli, Angela; Messina, Sonia; D'Amico, Adele; Vasco, Gessica; Main, Marion; Doglio, Luca; Politano, Luisa; Cavallaro, Filippo; Frosini, Silvia; Bello, Luca; Carlesi, Adelina; Bonetti, Anna Maria; Zucchini, Elisabetta; De Sanctis, Roberto; Scutifero, Marianna; Bianco, Flaviana; Rossi, Francesca; Motta, Maria Chiara; Sacco, Annalisa; Donati, Maria Alice; Mongini, Tiziana; Pini, Antonella; Battini, Roberta; Pegoraro, Elena; Pane, Marika; Pasquini, Elisabetta; Bruno, Claudio; Vita, Giuseppe; de Waure, Chiara; Bertini, Enrico; Mercuri, Eugenio

    2010-11-01

    The North Star Ambulatory Assessment is a functional scale specifically designed for ambulant boys affected by Duchenne muscular dystrophy (DMD). Recently the 6-minute walk test has also been used as an outcome measure in trials in DMD. The aim of our study was to assess a large cohort of ambulant boys affected by DMD using both North Star Assessment and 6-minute walk test. More specifically, we wished to establish the spectrum of findings for each measure and their correlation. This is a prospective multicentric study involving 10 centers. The cohort included 112 ambulant DMD boys of age ranging between 4.10 and 17 years (mean 8.18±2.3 SD). Ninety-one of the 112 were on steroids: 37/91 on intermittent and 54/91 on daily regimen. The scores on the North Star assessment ranged from 6/34 to 34/34. The distance on the 6-minute walk test ranged from 127 to 560.6 m. The time to walk 10 m was between 3 and 15 s. The time to rise from the floor ranged from 1 to 27.5 s. Some patients were unable to rise from the floor. As expected the results changed with age and were overall better in children treated with daily steroids. The North Star assessment had a moderate to good correlation with 6-minute walk test and with timed rising from floor but less with 10 m timed walk/run test. The 6-minute walk test in contrast had better correlation with 10 m timed walk/run test than with timed rising from floor. These findings suggest that a combination of these outcome measures can be effectively used in ambulant DMD boys and will provide information on different aspects of motor function that may not be captured using a single measure. Copyright © 2010. Published by Elsevier B.V.

  1. Implementing Curriculum-Embedded Formative Assessment in Primary School Science Classrooms

    Science.gov (United States)

    Hondrich, Annika Lena; Hertel, Silke; Adl-Amini, Katja; Klieme, Eckhard

    2016-01-01

    The implementation of formative assessment strategies is challenging for teachers. We evaluated teachers' implementation fidelity of a curriculum-embedded formative assessment programme for primary school science education, investigating both material-supported, direct application and subsequent transfer. Furthermore, the relationship between…

  2. Web-Based Quiz-Game-Like Formative Assessment: Development and Evaluation

    Science.gov (United States)

    Wang, Tzu-Hua

    2008-01-01

    This research aims to develop a multiple-choice Web-based quiz-game-like formative assessment system, named GAM-WATA. The unique design of "Ask-Hint Strategy" turns the Web-based formative assessment into an online quiz game. "Ask-Hint Strategy" is composed of "Prune Strategy" and "Call-in Strategy".…

  3. Does Formative Assessment Improve Student Learning and Performance in Soil Science?

    Science.gov (United States)

    Kopittke, Peter M.; Wehr, J. Bernhard; Menzies, Neal W.

    2012-01-01

    Soil science students are required to apply knowledge from a range of disciplines to unfamiliar scenarios to solve complex problems. To encourage deep learning (with student performance an indicator of learning), a formative assessment exercise was introduced to a second-year soil science subject. For the formative assessment exercise, students…

  4. Drawing and Writing in Digital Science Notebooks: Sources of Formative Assessment Data

    Science.gov (United States)

    Shelton, Angi; Smith, Andrew; Wiebe, Eric; Behrle, Courtney; Sirkin, Ruth; Lester, James

    2016-01-01

    Formative assessment strategies are used to direct instruction by establishing where learners' understanding is, how it is developing, informing teachers and students alike as to how they might get to their next set of goals of conceptual understanding. For the science classroom, one rich source of formative assessment data about scientific…

  5. Combination of Formative and Summative Assessment Instruments in Elementary Algebra Classes: A Prescription for Success

    Science.gov (United States)

    Peterson, Euguenia; Siadat, M. Vali

    2009-01-01

    The purpose of this study is to examine the effects of the implementation of formative assessment on student achievement in elementary algebra classes at Richard J. Daley College in Chicago, IL. The formative assessment is defined in this case as frequent, cumulative, time-restricted, multiple-choice quizzes with immediate constructive feedback.…

  6. An Action Research Study of High School English Language Arts, Intensive Reading, and Formative Assessment Principles

    Science.gov (United States)

    Welch, Karen P.

    2017-01-01

    Formative assessment has been identified as an effective pedagogical practice in the field of education, where teachers and students engage daily in an interactive process to gather evidence of the students' proficiency of a specific learning goal. The evidence collected by the teacher and a student during the formative assessment process allows…

  7. Use of Formative Classroom Assessment Techniques in a Project Management Course

    Science.gov (United States)

    Purcell, Bernice M.

    2014-01-01

    Formative assessment is considered to be an evaluation technique that informs the instructor of the level of student learning, giving evidence when it may be necessary for the instructor to make a change in delivery based upon the results. Several theories of formative assessment exist, all which propound the importance of feedback to the student.…

  8. The Effect of Computer Models as Formative Assessment on Student Understanding of the Nature of Models

    Science.gov (United States)

    Park, Mihwa; Liu, Xiufeng; Smith, Erica; Waight, Noemi

    2017-01-01

    This study reports the effect of computer models as formative assessment on high school students' understanding of the nature of models. Nine high school teachers integrated computer models and associated formative assessments into their yearlong high school chemistry course. A pre-test and post-test of students' understanding of the nature of…

  9. Diverse Delivery Methods and Strong Psychological Benefits: A Review of Online Formative Assessment

    Science.gov (United States)

    McLaughlin, T.; Yan, Z.

    2017-01-01

    This article is a review of literature on online formative assessment (OFA). It includes a narrative summary that synthesizes the research on the diverse delivery methods of OFA, as well as the empirical literature regarding the strong psychological benefits and limitations. Online formative assessment can be delivered using many traditional…

  10. Using automatic item generation to create multiple-choice test items.

    Science.gov (United States)

    Gierl, Mark J; Lai, Hollis; Turner, Simon R

    2012-08-01

    Many tests of medical knowledge, from the undergraduate level to the level of certification and licensure, contain multiple-choice items. Although these are efficient in measuring examinees' knowledge and skills across diverse content areas, multiple-choice items are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created the demand for large numbers of multiple-choice items. Our current approaches to item development cannot meet this demand. We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and we illustrate this approach by generating multiple-choice items for a medical licensure test in the content area of surgery. To generate multiple-choice items, our method requires a three-stage process. Firstly, a cognitive model is created by content specialists. Secondly, item models are developed using the content from the cognitive model. Thirdly, items are generated from the item models using computer software. Using this methodology, we generated 1248 multiple-choice items from one item model. Automatic item generation is a process that involves using models to generate items using computer technology. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines the content to generate new test items. By combining these outcomes, items can be generated automatically. © Blackwell Publishing Ltd 2012.
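The third stage described above, where software combines content from an item model into items, can be pictured as filling a stem template with every combination of its variable elements. The sketch below is only an illustration of that idea, not the authors' software: the function name, the template, and the element values are all hypothetical, and it omits the constraints from the cognitive model and the generation of answer options that a real system would need.

```python
from itertools import product


def generate_items(stem_template, elements):
    """Fill a stem template with every combination of the variable elements.

    elements maps a placeholder name to the list of values it may take.
    """
    names = sorted(elements)
    items = []
    for combo in product(*(elements[name] for name in names)):
        items.append(stem_template.format(**dict(zip(names, combo))))
    return items


# Hypothetical item model: one stem with two variable elements.
template = "A {age}-year-old patient presents with {symptom}. What is the next step?"
elements = {
    "age": [25, 60],
    "symptom": ["abdominal pain", "fever", "vomiting"],
}

items = generate_items(template, elements)
print(len(items))
for item in items:
    print(item)
```

Because the number of items is the product of the element list lengths, a single item model with a handful of elements can yield a large item pool, which is consistent with the scale of generation the abstract reports.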

  11. The comparability of English, French and Dutch scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): an assessment of differential item functioning in patients with systemic sclerosis.

    Directory of Open Access Journals (Sweden)

    Linda Kwakkenbos

    Full Text Available The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics.

  12. The Comparability of English, French and Dutch Scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): An Assessment of Differential Item Functioning in Patients with Systemic Sclerosis

    Science.gov (United States)

    Kwakkenbos, Linda; Willems, Linda M.; Baron, Murray; Hudson, Marie; Cella, David; van den Ende, Cornelia H. M.; Thombs, Brett D.

    2014-01-01

    Objective The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. Methods The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. Results A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. Conclusions There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics. PMID:24638101

  13. Can cancer patients assess the influence of pain on functions? A randomised, controlled study of the pain interference items in the Brief Pain Inventory

    Directory of Open Access Journals (Sweden)

    Kaasa Stein

    2007-03-01

    Full Text Available Abstract Background The Brief Pain Inventory (BPI) is recommended as a pain measurement tool by the Expert Working Group of the European Association of Palliative Care. The BPI is designed to assess both pain severity and interference with functions caused by pain. The purpose of this study was to investigate if pain interference items are influenced by other factors than pain. Methods We asked adult cancer patients to complete the original and a revised BPI on two study days. In the original version of the BPI the patients were asked how, during the last 24 hours, pain has interfered with functions. In the revised BPI this question was changed to how, during the last 24 hours, these functions are affected in general. Health-related quality of life was assessed at both study days applying the European Organization for Research and Treatment of Cancer quality of life questionnaire. Results Forty-eight of the 55 included patients completed both assessments. The BPI pain intensity scores and the health-related quality of life scores were similar at the two study days. Except for mood this study observed no significant distinctions between the patients' BPI interference item scores in the original (pain influence on function) and the revised BPI (function in general). Seventeen patients reported higher influence from pain on functions than the total influence on function from all causes. Conclusion We observed similar scores in the original BPI interference scores (pain influence on function) compared with the revised BPI interference scores (decreased function in general). This finding might imply that the BPI interference scale measures are partly responded to as more of a global interference measure.

  14. Concurrent Validation of the Clinical Opiate Withdrawal Scale (COWS) and Single-Item Indices against the Clinical Institute Narcotic Assessment (CINA) Opioid Withdrawal Instrument

    Science.gov (United States)

    Tompkins, D. Andrew; Bigelow, George E.; Harrison, Joseph A.; Johnson, Rolley E.; Fudala, Paul J.; Strain, Eric C.

    2009-01-01

    Introduction The Clinical Opiate Withdrawal Scale (COWS) is an 11-item clinician-administered scale assessing opioid withdrawal. Though commonly used in clinical practice, it has not been systematically validated. The present study validated the COWS in comparison to the validated Clinical Institute Narcotic Assessment (CINA) scale. Method Opioid-dependent volunteers were enrolled in a residential trial and stabilized on morphine 30 mg given subcutaneously four times daily. Subjects then underwent double-blind, randomized challenges of intramuscularly administered placebo and naloxone (0.4 mg) on separate days, during which the COWS, CINA, and visual analog scale (VAS) assessments were concurrently obtained. Subjects completing both challenges were included (N=46). Correlations between mean peak COWS and CINA scores as well as self-report VAS questions were calculated. Results Mean peak COWS and CINA scores of 7.6 and 24.4, respectively, occurred on average 30 minutes post-injection of naloxone. Mean COWS and CINA scores 30 minutes after placebo injection were 1.3 and 18.9, respectively. The Pearson correlation coefficient for peak COWS and CINA scores during the naloxone challenge session was 0.85 (p<0.001). Peak COWS scores also correlated well with peak VAS self-report scores of bad drug effect (r=0.57, p<0.001) and feeling sick (r=0.57, p<0.001), providing additional evidence of concurrent validity. Placebo was not associated with any significant elevation of COWS, CINA, or VAS scores, indicating discriminant validity. Cronbach’s alpha for the COWS was 0.78, indicating good internal consistency (reliability). Discussion COWS, CINA, and certain VAS items are all valid measurement tools for acute opiate withdrawal. PMID:19647958
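The internal-consistency figure quoted above is a Cronbach's alpha. As a reminder of what that coefficient measures, the sketch below computes it from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical; they are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances; requires at least two items and two respondents.
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (made up for illustration): 3 respondents, 2 items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, and it drops as the items become less correlated.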

  15. Assessment of free and cued recall in Alzheimer's disease and vascular and frontotemporal dementia with 24-item Grober and Buschke test.

    Science.gov (United States)

    Cerciello, Milena; Isella, Valeria; Proserpi, Alice; Papagno, Costanza

    2017-01-01

    Alzheimer's disease (AD), vascular dementia (VaD) and frontotemporal dementia (FTD) are the most common forms of dementia. It is well known that memory deficits in AD are different from those in VaD and FTD, especially with respect to cued recall. The aim of this clinical study was to compare the memory performance in 15 AD, 10 VaD and 9 FTD patients and 20 normal controls by means of a 24-item Grober-Buschke test [8]. The patients' groups were comparable in terms of severity of dementia. We considered free and total recall (free plus cued) both in immediate and delayed recall and computed an Index of Sensitivity to Cueing (ISC) [8] for immediate and delayed trials. We assessed whether cued recall predicted the subsequent free recall across our patients' groups. We found that AD patients recalled fewer items from the beginning and were less sensitive to cueing supporting the hypothesis that memory disorders in AD depend on encoding and storage deficit. In immediate recall VaD and FTD showed a similar memory performance and a stronger sensitivity to cueing than AD, suggesting that memory disorders in these patients are due to a difficulty in spontaneously implementing efficient retrieval strategies. However, we found a lower ISC in the delayed recall compared to the immediate trials in VaD than FTD due to a higher forgetting in VaD.
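The abstract does not spell out how the Index of Sensitivity to Cueing is computed. A commonly cited definition is the share of items missed in free recall that are then retrieved with a cue: ISC = (total recall - free recall) / (maximum recall - free recall). The sketch below assumes that definition; the function name is hypothetical, and the maximum score is left as a parameter because it depends on the number of items and trials in the test version used.

```python
def index_of_sensitivity_to_cueing(free_recall, total_recall, max_recall):
    """Proportion of items missed in free recall that were recovered by cueing.

    Returns None when every item was already freely recalled, since no item
    was left to cue and the ratio is undefined.
    """
    if not 0 <= free_recall <= total_recall <= max_recall:
        raise ValueError("expected 0 <= free_recall <= total_recall <= max_recall")
    missed_in_free_recall = max_recall - free_recall
    if missed_in_free_recall == 0:
        return None
    return (total_recall - free_recall) / missed_in_free_recall


# Hypothetical single-trial scores on a 24-item list.
print(index_of_sensitivity_to_cueing(free_recall=0, total_recall=12, max_recall=24))
```

Under this definition a low value means cues rescued few of the missed items, which is the pattern the abstract attributes to the AD group.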

  16. Methodology for the development and calibration of the SCI-QOL item banks.

    Science.gov (United States)

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  17. Assessment of serotonergic system in formation of memory and learning

    Directory of Open Access Journals (Sweden)

    J. C. da Silva

    2017-11-01

    Full Text Available Abstract We evaluated the involvement of the serotonergic system in memory formation and learning processes in healthy adult Wistar rats. Fifty-seven rats in 5 groups had one serotonergic nucleus damaged by an electric current. Electrolytic lesion was carried out using a continuous current of 2 mA for two seconds by stereotactic surgery. Animals were submitted to learning and memory tests. Rats presented different responses in the memory tests depending on the serotonergic nucleus involved. Both explicit and implicit memory may be affected after lesion, although some groups showed a significant difference and others did not. Damage to a serotonergic nucleus was able to impair memory in Wistar rats. The formation of implicit and explicit memory is impaired after injury in some serotonergic nuclei.

  18. Grounding formative assessment in high-school chemistry classrooms: Connections between professional development and teacher practice

    Science.gov (United States)

    Cisterna Alburquerque, Dante Igor

    This study describes and analyzes the experiences of two high-school chemistry teachers who participated in a team-based professional development program to learn about and enact formative assessment in their classrooms. The overall purpose of this study is to explain how participation in this professional development influenced both teachers' classroom enactment of formative assessment practices. This study focuses on 1) teachers' participation in the professional development program, 2) teachers' enactment of formative assessment, and 3) factors that enabled or hindered enactment of formative assessment. Drawing on cultural-historical activity theory (CHAT) and using evidence from teacher lessons, teacher interviews, professional development meetings as data sources, this single embedded case study analyzes how these two teachers who participated in the same learning team and have similar characteristics (i.e., teaching in the same school, teaching the same courses and population of students, and using the same materials) differentially used the professional development learning about formative assessment as mediating tools to improve their classroom instruction. The learning team experience contributed to both teachers' development of a better understanding of formative assessment---especially in recognizing that their current grading and assessment practices were not appropriate to promote student learning---and the co-creation of artifacts to gather evidence of students' ideas. Although both teachers demonstrated understanding about how formative assessment may serve to promote student learning and had a set of tools available to utilize for formative assessment use, they did not enact these tools in the same way. One teacher appropriated formative assessment as mediating tool to verify if the students were following her explanations, and to check if the students were able to provide the correct response. The other teacher used the mediating tool to promote

  19. Online Formative Assessment in Higher Education: Its Pros and Cons

    Science.gov (United States)

    Baleni, Zwelijongile Gaylard

    2015-01-01

    Online and blended learning have become a common educational strategy in higher education. Lecturers have to re-theorise certain basic concerns of teaching, learning and assessment in non-traditional environments. These concerns include perceptions such as cogency and trustworthiness of assessment in online environments in relation to serving the…

  20. Improving Formative Assessment Practice with Educational Information Technology

    Directory of Open Access Journals (Sweden)

    Terry Vendlinski

    2006-12-01

    Full Text Available This paper describes a web-based assessment design tool, the ADDS, that provides teachers both a structure and the resources required to develop and use quality assessments. The tool is applicable across subject domains. The heart of the ADDS is an assessment design workspace that allows teachers to decide the attributes of an assessment, as well as the context and type of responses the students will generate, as part of their assessment design process. While the tool is very flexible and allows the above steps to be done in any order (or skipped entirely), our goal was to streamline and scaffold the process for teachers by organizing all the materials for them in one place and to provide resources they could use or reuse to create assessments for their students. The tool allows teachers to deliver the assessments to their students either online or on paper. Initial results from our first teacher study suggest that teachers who used the tool developed assessments that were more cognitively demanding of students and addressed the "big ideas" rather than disassociated facts of a domain.

  1. Correlation between the pain numeric rating scale and the 12-item WHO Disability Assessment Schedule 2.0 in patients with musculoskeletal pain.

    Science.gov (United States)

    Saltychev, Mikhail; Bärlund, Esa; Laimi, Katri

    2018-03-01

    The aim of this study was to assess the correlation between pain severity measured on a numeric rating scale and restrictions of functioning measured with the WHO Disability Assessment Schedule (WHODAS 2.0). This was a cross-sectional study of 1207 patients with musculoskeletal pain conditions. Correlation was assessed using Spearman's and Pearson's tests. Although all the Spearman's rank correlations between WHODAS 2.0 items and pain severity were statistically significant, they were mostly weak, with only a few moderate associations for 'S2 household responsibilities', 'S8 washing', 'S9 dressing', and 'S12 day-to-day work'. The correlation between the WHODAS 2.0 total score and pain severity was also moderate: 0.41 [95% confidence interval (CI): 0.36-0.45] for average pain and 0.42 (95% CI: 0.37-0.46) for worst pain. The correlation between the WHODAS 2.0 total score and pain level was also assessed using Pearson's product-moment correlation, yielding figures that were similar to Spearman's correlation: 0.42 (P…). The correlation between pain severity measured by numeric rating scale and functioning level measured by WHODAS 2.0 was weak to moderate, with slightly stronger associations in physical domains of functioning.
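    The distinction between the two coefficients named in this record can be illustrated with a minimal sketch. It is not the authors' analysis: the function names and the toy data are hypothetical, and only the Python standard library is assumed. Spearman's coefficient is computed here as Pearson's coefficient applied to average ranks, which is the usual definition when ties are present.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: a monotonic but non-linear relationship.
pain_score = [1, 2, 3, 4, 5]
disability_score = [1, 4, 9, 16, 25]
rho = spearman(pain_score, disability_score)   # rank-based
r = pearson(pain_score, disability_score)      # linear
```

    On monotonic but curved data such as this, the rank-based coefficient reaches its maximum while the linear one does not, which is one reason a study may report both.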

  2. An NCME Instructional Module on Polytomous Item Response Theory Models

    Science.gov (United States)

    Penfield, Randall David

    2014-01-01

    A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

  3. Geochemical parameters of radioelements applied to assess uranium prospects in geological formation

    International Nuclear Information System (INIS)

    Ma Zhongxiang.

    1988-01-01

    Based on geochemical characteristics of radioelements and the theory of facieology, the author describes the characteristics of the distribution of U, Th and K in sedimentary formation and the relationship between their combined parameters MA and MB and uranium mineralization in geological formation. The ranges of MA and MB in uraniferous geological formation used to assess four different levels of uranium mineralization in regional investigation are obtained from the comparison of combined parameters MA and MB in geological formations with different levels of mineralization, and experience is provided for quantitatively assessing uranium prospects in geological formation by a multi-parameter model of radioelements.

  4. An online formative assessment tool to prepare students

    African Journals Online (AJOL)

    feedback to students and promotes student learning, whereas summative assessment is .... Kolb's experiential learning cycle, as it offers students the opportunity for ..... Theory and learning in medical education: How theory can inform practice.

  5. Formative assessment with IMS LD and IMS QTI

    NARCIS (Netherlands)

    Tattersall, Colin; Burgos, Daniel; Vogten, Hubert

    2006-01-01

    The presentation describes work on the integration of LD and QTI, and the realisation of this research via CCSI. Furthermore, the presentation describes ongoing work in TENCompetence in the area of assessment.

  6. The importance of rating scale design in the measurement of patient-reported outcomes using questionnaires or item banks.

    Science.gov (United States)

    Khadka, Jyoti; McAlinden, Colm; Gothwal, Vijaya K; Lamoureux, Ecosse L; Pesudovs, Konrad

    2012-06-26

    To investigate the effect of rating scale designs (question formats and response categories) on item difficulty calibrations and assess the impact that rating scale differences have on overall vision-related activity limitation (VRAL) scores. Sixteen existing patient-reported outcome instruments (PROs) suitable for cataract assessment, with different rating scales, were self-administered by patients on a cataract surgery waiting list. A total of 226 VRAL items from these PROs in their native rating scales were included in an item bank and calibrated using Rasch analysis. Fifteen item/content areas (e.g., reading newspapers) appearing in at least three different PROs were identified. Within each content area, item calibrations were compared and their range calculated. Similarly, five PROs having at least three items in common with the Visual Function (VF-14) were compared in terms of average item measures. A total of 614 patients (mean age ± SD, 74.1 ± 9.4 years) participated. Items with the same content varied in their calibration by as much as two logits; "reading the small print" had the largest range (1.99 logits) followed by "watching TV" (1.60). Compared with the VF-14 (0.00 logits), the rating scale of the Visual Disability Assessment (1.13 logits) produced the most difficult items and the Cataract Symptom Scale (0.24 logits) produced the least difficult items. The VRAL item bank was suboptimally targeted to the ability level of the participants (2.00 logits). Rating scale designs have a significant effect on item calibrations. Therefore, constructing item banks from existing items in their native formats carries risks to face validity and transmission of problems inherent in existing instruments, such as poor targeting.

  7. What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling

    Science.gov (United States)

    Koller, Ingrid; Levenson, Michael R.; Glück, Judith

    2017-01-01

    The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis. PMID:28270777

  8. Deep Impact: How a Job-Embedded Formative Assessment Professional Development Model Affected Teacher Practice

    Directory of Open Access Journals (Sweden)

    Thomas A. Stewart

    2014-02-01

    Full Text Available This study supports the work of Black and Wiliam (1998), who demonstrated that when teachers effectively utilize formative assessment strategies, student learning increases significantly. However, the researchers also found a “poverty of practice” among teachers, in that few fully understood how to implement classroom formative assessment. This qualitative case study examined a series of voluntary workshops offered at one middle school designed to address this poverty of practice. Data were gathered via semi-structured interviews. These research questions framed the study: (1) What role did a professional learning community structure play in shaping workshop participants’ perceived effectiveness of a voluntary formative assessment initiative? (2) How did this initiative affect workshop participants’ perceptions of their knowledge of formative assessment and differentiation strategies? (3) How did it affect workshop participants’ perceptions of their abilities to teach others about formative assessment and differentiated instruction? (4) How did it affect school-wide use of classroom-level strategies? Results indicated that teacher workshop participants experienced a growth in their capacity to use and teach others various formative assessment strategies, and even non-participating teachers reported greater use of formative assessment in their own instruction. Workshop participants and non-participating teachers perceived little growth in the area of differentiation of instruction, which contradicted some administrator perceptions.

  9. Methodological quality of diagnostic accuracy studies on non-invasive coronary CT angiography: influence of QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) items on sensitivity and specificity

    Energy Technology Data Exchange (ETDEWEB)

    Schueler, Sabine; Walther, Stefan; Schuetz, Georg M. [Humboldt-Universitaet zu Berlin, Freie Universitaet Berlin, Charite Medical School, Department of Radiology, Berlin (Germany); Schlattmann, Peter [University Hospital of Friedrich Schiller University Jena, Department of Medical Statistics, Informatics, and Documentation, Jena (Germany); Dewey, Marc [Humboldt-Universitaet zu Berlin, Freie Universitaet Berlin, Charite Medical School, Department of Radiology, Berlin (Germany); Charite, Institut fuer Radiologie, Berlin (Germany)

    2013-06-15

    To evaluate the methodological quality of diagnostic accuracy studies on coronary computed tomography (CT) angiography using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) tool. Each QUADAS item was individually defined to adapt it to the special requirements of studies on coronary CT angiography. Two independent investigators analysed 118 studies using 12 QUADAS items. Meta-regression and pooled analyses were performed to identify possible effects of methodological quality items on estimates of diagnostic accuracy. The overall methodological quality of coronary CT studies was merely moderate. They fulfilled a median of 7.5 out of 12 items. Only 9 of the 118 studies fulfilled more than 75 % of possible QUADAS items. One QUADAS item ("Uninterpretable Results") showed a significant influence (P = 0.02) on estimates of diagnostic accuracy with "no fulfilment" increasing specificity from 86 to 90 %. Furthermore, pooled analysis revealed that each QUADAS item that is not fulfilled has the potential to change estimates of diagnostic accuracy. The methodological quality of studies investigating the diagnostic accuracy of non-invasive coronary CT is only moderate and was found to affect the sensitivity and specificity. An improvement is highly desirable because good methodology is crucial for adequately assessing imaging technologies. (orig.)

  10. Methodological quality of diagnostic accuracy studies on non-invasive coronary CT angiography: influence of QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) items on sensitivity and specificity

    International Nuclear Information System (INIS)

    Schueler, Sabine; Walther, Stefan; Schuetz, Georg M.; Schlattmann, Peter; Dewey, Marc

    2013-01-01

    To evaluate the methodological quality of diagnostic accuracy studies on coronary computed tomography (CT) angiography using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) tool. Each QUADAS item was individually defined to adapt it to the special requirements of studies on coronary CT angiography. Two independent investigators analysed 118 studies using 12 QUADAS items. Meta-regression and pooled analyses were performed to identify possible effects of methodological quality items on estimates of diagnostic accuracy. The overall methodological quality of coronary CT studies was merely moderate. They fulfilled a median of 7.5 out of 12 items. Only 9 of the 118 studies fulfilled more than 75 % of possible QUADAS items. One QUADAS item ("Uninterpretable Results") showed a significant influence (P = 0.02) on estimates of diagnostic accuracy with "no fulfilment" increasing specificity from 86 to 90 %. Furthermore, pooled analysis revealed that each QUADAS item that is not fulfilled has the potential to change estimates of diagnostic accuracy. The methodological quality of studies investigating the diagnostic accuracy of non-invasive coronary CT is only moderate and was found to affect the sensitivity and specificity. An improvement is highly desirable because good methodology is crucial for adequately assessing imaging technologies. (orig.)

  11. Formative assessment framework proposal for transversal competencies: Application to analysis and problem-solving competence

    Directory of Open Access Journals (Sweden)

    Pedro Gómez-Gasquet

    2018-04-01

    Full Text Available Purpose: In recent years, there has been increasing interest in the manner in which transversal competences (TC) are introduced in curricula. Transversal competences are generic and relevant skills that students have to develop through the several stages of their degrees. This paper analyses TCs in the context of the learning process of undergraduate and postgraduate courses. The main aim of this paper is to propose a framework to improve results. The framework facilitates the student's training, and one of its important pieces is undoubtedly that the student receives constant feedback from assessments, which allows learning to improve. The framework is applied to the analysis and problem-solving competence in the context of the Master Degree in Advanced Engineering Production, Logistics and Supply Chain at the UPV. Design/methodology/approach: The work is the result of several years of professional experience in the application of the concept of transversal competence in the UPV with undergraduate and graduate students. As a result of this work and various educational innovation projects, a team of experts has been created, which has been discussing some aspects relevant to the improvement of the teaching-learning process. One of these areas of work has been the integration of various proposals on the application and deployment of transversal competences. From this work, a conceptual proposal was developed and subsequently validated empirically through the analysis of the results of several groups of students in a degree. Findings: The main result offered in the work is a framework that allows identifying the elements that are part of the learning process in the area of transversal competences. Likewise, the different items that are part of the framework are linked to the student's life cycle, and a temporal scope is established for their deployment. Practical implications: One of the most noteworthy

  12. Formative "Use" of Assessment Information: It's a Process, so Let's Say What We Mean

    Directory of Open Access Journals (Sweden)

    Robert Good

    2011-02-01

    Full Text Available The term formative assessment is often used to describe a type of assessment. The purpose of this paper is to challenge the use of this phrase given that formative assessment as a noun phrase ignores the well-established understanding that it is a process more than an object. A model that combines content, context, and strategies is presented as one way to view the process nature of assessing formatively. The alternate phrase formative use of assessment information is suggested as a more appropriate way to describe how content, context, and strategies can be used together in order to close the gap between where a student is performing currently and the intended learning goal.

  13. Hydrocarbon potential assessment of Ngimbang formation, Rihen field of Northeast Java Basin

    Science.gov (United States)

    Pandito, R. H.; Haris, A.; Zainal, R. M.; Riyanto, A.

    2017-07-01

    The assessment of the Ngimbang formation at Rihen field of the Northeast Java Basin has been conducted to identify the hydrocarbon potential by analyzing the response of passive seismic on the proven reservoir zone and proposing a tectonic evolution model. In petroleum exploration in the Northeast Java Basin, the importance of the Ngimbang formation cannot be overemphasized. The East Java Basin is well known as one of the mature basins producing hydrocarbons in Indonesia. This basin is stratigraphically composed of several formations, from oldest to youngest: the basement, Ngimbang, Kujung, Tuban, Ngrayong, Wonocolo, Kawengan and Lidah formations. All of these formations have proven to be hydrocarbon producers. The Ngrayong formation, which is geologically dominated by channels, has become a production formation. The Kujung formation, known for its reef build-up, has produced more than 102 million barrels of oil. The Ngimbang formation so far has not been comprehensively assessed in terms of its role as a source rock and a reservoir. In 2013, one exploratory well was drilled in the Ngimbang formation and showed a gas discovery, indicated by a Drill Stem Test (DST) reading of more than 22 MMSCFD of gas. This discovery opens a new prospect in exploring the Ngimbang formation.

  14. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test

    Science.gov (United States)

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi

    2018-01-01

    Objective: The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods: The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results: The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion: The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of typically developing (TD) children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with
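
    The two stopping rules and the reported item savings in this record can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the conversion from standard error to reliability (1 minus the squared standard error, with the latent trait scaled to unit variance) is a common convention that the abstract does not confirm. The bank size of 56 and the limits of 0.9 and 14 items are taken from the abstract.

```python
def reliability_from_se(se, theta_variance=1.0):
    """Assumed convention: reliability = 1 - SE^2 / Var(theta)."""
    return 1.0 - (se ** 2) / theta_variance

def should_stop(se, items_administered, min_reliability=0.9, max_items=14):
    """Stop when reliability exceeds 0.9 or 14 items have been administered."""
    reliable_enough = reliability_from_se(se) > min_reliability
    length_reached = items_administered >= max_items
    return reliable_enough or length_reached

def item_saving(items_administered, bank_size=56):
    """Fraction of the item bank that did not need to be administered."""
    return 1.0 - items_administered / bank_size

# Worked check of the reported range of item savings:
worst_case = item_saving(14)    # maximum test length: 1 - 14/56 = 0.75
typical_case = item_saving(8.5) # mean test length:    1 - 8.5/56 is about 0.848
```

    Under these assumptions the maximum test length of 14 items corresponds to the 75% saving and the mean length of 8.5 items to roughly the 84% saving quoted in the results.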

  15. Formative assessment as mediation | de Vos | Perspectives in ...

    African Journals Online (AJOL)

    This article examines the inherent tensions resulting from CRA's links to both behaviourism and constructivism and argues that more nuance and interpretation is required if the assessor is to engage his/her students with criterion-based assessment from a constructivist paradigm. One way to negotiate the tensions between ...

  16. Using case method to explicitly teach formative assessment in preservice teacher science education

    Science.gov (United States)

    Bentz, Amy Elizabeth

    The process of formative assessment improves student understanding; however, the topic of formative assessment in preservice education has been severely neglected. Since a major goal of teacher education is to create reflective teaching professionals, preservice teachers should be provided an opportunity to critically reflect on the use of formative assessment in the classroom. Case method is an instructional methodology that allows learners to engage in and reflect on real-world situations. Case based pedagogy can play an important role in enhancing preservice teachers' ability to reflect on teaching and learning by encouraging alternative ways of thinking about assessment. Although the literature on formative assessment and case methodology are extensive, using case method to explore the formative assessment process is, at best, sparse. The purpose of this study is to answer the following research questions: To what extent does the implementation of formative assessment cases in methods instruction influence preservice elementary science teachers' knowledge of formative assessment? What descriptive characteristics change between the preservice teachers' pre-case and post-case written reflection that would demonstrate learning had occurred? To investigate these questions, preservice teachers in an elementary methods course were asked to reflect on and discuss five cases. Pre/post-case data was analyzed. Results indicate that the preservice teachers modified their ideas to reflect the themes that were represented within the cases and modified their reflections to include specific ideas or examples taken directly from the case discussions. Comparing pre- and post-case reflections, the data supports a noted change in how the preservice teachers interpreted the case content. The preservice teachers began to evaluate the case content, question the lack of formative assessment concepts and strategies within the case, and apply formative assessment concepts and

  17. Assessment of bone formation capacity using in vivo transplantation assays: procedure and tissue analysis

    DEFF Research Database (Denmark)

    Abdallah, Basem; Ditzel, Nicholas; Kassem, Moustapha

    2008-01-01

    In vivo assessment of bone formation (osteogenesis) potential by isolated cells is an important method for analysis of cells and factors controlling bone formation. Currently, cell implantation mixed with hydroxyapatite/tricalcium phosphate in an open system (subcutaneous implantation) in immun...

  18. Differential items functioning to assess aggressiveness in college students / Funcionamento diferencial de itens para avaliar a agressividade de universitários

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    Full Text Available This research sought evidence of construct validity by analyzing the differential functioning of items related to aggressiveness. The participants were 445 college students of both genders, attending courses in Engineering, Computing and Psychology. The aggressiveness scale, composed of 81 items, was administered collectively in the classroom to the students who consented to participate in the study. The items of the instrument were studied by means of the Rasch model. Twenty-eight items presented differential item functioning: 15 were characterized as typical for females and 13 for males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of gender.

  19. Evaluating design-based formative assessment practices in outdoor science teaching

    DEFF Research Database (Denmark)

    Hartmeyer, Rikke; Stevenson, Matthew Peter; Bentsen, Peter

    2016-01-01

    Background and purpose: Research in formative assessment often pays close attention to the strategies which can be used by teachers. However, less emphasis in the literature seems to have been paid to study the application of formative assessment designs in practice. In this paper, we argue...... that a formative assessment design that we call Eva-Mapping, which is developed on the principles of design-based research, can be a productive starting point for disseminating and further developing formative assessment practices in outdoor science teaching. Sample, design and methods: We conducted an evaluation...... of the design, based on video-elicited focus group interviews with two groups of experienced science teachers. Both groups consisted of teachers who taught science outside the classroom on a regular basis. These groups watched identical video sequences which were recorded during lessons in which teachers...

  20. Effect of Repeated/Spaced Formative Assessments on Medical School Final Exam Performance

    Directory of Open Access Journals (Sweden)

    Edward K. Chang

    2017-06-01

    Discussion: Performance on weekly formative assessments was predictive of final exam scores. Struggling medical students will benefit from extra cumulative practice exams while students who are excelling do not need extra practice.

  1. Reducing the item number to obtain the same-length self-assessment scales: a systematic approach using result of graphical loglinear rasch models

    DEFF Research Database (Denmark)

    Nielsen, Tine; Kreiner, Svend

    2011-01-01

    The Revised Danish Learning Styles Inventory (R-D-LSI) (Nielsen 2005), which is an adaptation of the Sternberg-Wagner Thinking Styles Inventory (Sternberg, 1997), comprises 14 subscales, each measuring a separate learning style. Of these 14 subscales, 9 are eight items long and 5 are seven items long...... Inventory (D-SA-LSI) comprising 14 subscales each with an item length of seven. The systematic approach to item reduction based on results of GLLRM will be presented and exemplified by its application to the R-D-LSI....

  2. Calibration of Automatically Generated Items Using Bayesian Hierarchical Modeling.

    Science.gov (United States)

    Johnson, Matthew S.; Sinharay, Sandip

    For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…

  3. An Interview with Marcia Tate: Formative Assessment and Brain Based Learning

    OpenAIRE

    Shaughnessy, Michael

    2016-01-01

    In this interview, Dr. Marcia Tate discusses her work and focuses on critical issues in brain based learning, and the need for both formative and summative assessment. Tangential issues such as grade retention and response to intervention are also discussed. It is hoped that this interview will assist teachers in the instructional and learning process and aid in both formative and summative assessment.

  4. Assessing the test-retest repeatability of the Vietnamese version of the National Eye Institute 25-item Visual Function Questionnaire among bilateral cataract patients for a Vietnamese population.

    Science.gov (United States)

    To, Kien Gia; Meuleners, Lynn; Chen, Huei-Yang; Lee, Andy; Do, Dung Van; Duong, Dat Van; Phi, Tien Duy; Tran, Hoang Huy; Nguyen, Nguyen Do

    2014-06-01

    To determine the test-retest repeatability of the National Eye Institute 25-item Visual Function Questionnaire (NEI VFQ-25) for use with older Vietnamese adults with bilateral cataract. The questionnaire was translated into Vietnamese and back-translated into English by two independent translators. Patients with bilateral cataract aged 50 and older completed the questionnaire on two separate occasions, one to two weeks after first administration of the questionnaire. Test-retest repeatability was assessed using the Cronbach's α and intraclass correlation coefficients. The average age of participants was 67 ± 8 years and most participants were female (73%). Internal consistency was acceptable with the α coefficient above 0.7 for all subscales and intraclass correlation coefficients were 0.6 or greater in all subscales. The Vietnamese NEI VFQ-25 is reliable for use in studies assessing vision-related quality of life in older adults with bilateral cataract in Vietnam. We propose some modifications to the NEI-VFQ questions to reflect activities of older people in Vietnam. © 2013 ACOTA.
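    The internal-consistency statistic mentioned in this record, Cronbach's α, follows a standard formula: α = k/(k−1) × (1 − Σ item variances / variance of total scores), where k is the number of items. A minimal sketch, assuming a hypothetical `cronbach_alpha` helper and toy responses (neither is from the study), and using only the Python standard library:

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - item_variances / total_variance)

# Hypothetical toy responses: 4 respondents, 3 items on a 1-5 scale.
toy = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(toy)
```

    Values above roughly 0.7 are conventionally read as acceptable internal consistency, which is the threshold the abstract refers to.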

  5. Employing Computer Assisted Assessment (CAA) to facilitate formative assessment in the State Secondary School: a case study

    Directory of Open Access Journals (Sweden)

    Effimia Karagianni

    2012-02-01

    Drawing on theories of assessment and on the pedagogical and administrative advantages that Computer Assisted Assessment (CAA) offers in foreign language learning, the study presented in this paper examines how computers can facilitate the formative assessment of EFL learners and enhance their sense of responsibility for monitoring their own progress. The subjects of the study were twenty-five 14-year-old students attending the third class of a State Gymnasium in Greece. The instruments were questionnaires on motivation and learning styles, three quizzes designed with the software Hot Potatoes, a self-assessment questionnaire, and an evaluation questionnaire on the subjects' attitudes towards using computers for assessment purposes. After reviewing formative assessment, CAA, and how the two can be combined, the paper describes the three class quizzes used in the study. Information from the questionnaires completed by the students, combined with the quiz results, shows how computers can provide the continuous measurement of students' progress that formative assessment requires. The results are also used to show how students and teachers can benefit from formative CAA and the extent to which this kind of assessment could be applied in Greek state schools.

  6. Perspectives and Practices of Elementary Teachers Using an Internet-Based Formative Assessment Tool: The Case of "Assessing Mathematics Concepts"

    Science.gov (United States)

    Martin, Christie S.; Polly, Drew; Wang, Chuang; Lambert, Richard G.; Pugalee, David K.

    2016-01-01

    This study examined the influence of professional development on elementary school teachers' perceptions and use of an internet-based formative assessment tool focused on students' number sense skills. Data sources include teacher-participants' pre- and post-surveys, open-ended responses on the post-survey, use of the assessment tool and their written…

  7. Safety assessment methodology for waste repositories in deep geological formations

    International Nuclear Information System (INIS)

    Chapuis, A.M.; Lewi, J.; Pradel, J.; Queniart, D.; Raimbault, P.; Assouline, M.

    1986-06-01

    The long-term safety of a nuclear waste repository relies on the evaluation of the doses which could be transferred to man in the future. This requires detailed knowledge of the medium in which the waste will be confined, identification of the basic phenomena that govern the migration of the radionuclides, and investigation of all possible scenarios that may affect the integrity of the barriers between the waste and the biosphere. Within the Institute of Protection and Nuclear Safety of the French Atomic Energy Commission (CEA/IPSN), the Department of Safety Analysis (DAS) is currently developing a methodology for assessing the safety of future geological waste repositories and is in charge of model development, while the Department of Technical Protection (DPT) is in charge of the geological experimental studies. Both aspects of this program are presented. The risk assessment methodology stresses the need for coordination between data acquisition and model development, which should yield an efficient tool for safety evaluation. Progress needs to be made in source and geosphere modelling. Much more sophisticated models than the ones described could be used; however, sensitivity analysis will determine the level of sophistication that is necessary. Participation in international validation programs is also very important for gaining confidence in the approaches that have been chosen.

  8. Perceptions and attitudes of formative assessments in middle-school science classes

    Science.gov (United States)

    Chauncey, Penny Denyse

    No Child Left Behind mandates utilizing summative assessment to measure schools' effectiveness. The problem is that summative assessment measures students' knowledge without depth of understanding. The goal of public education, however, is to prepare students to think critically at higher levels. The purpose of this study was to examine any difference between formative assessment incorporated in instruction as opposed to the usual, more summative methods in terms of attitudes and academic achievement of middle-school science students. Maslow's theory emphasizes that individuals must have basic needs met before they can advance to higher levels. Formative assessment enables students to master one level at a time. The research questions focused on whether statistically significant differences existed between classrooms using these two types of assessments on academic tests and an attitude survey. Using a quantitative quasi-experimental control-group design, data were obtained from a sample of 430 middle-school science students in 6 classes. One control and 2 experimental classes were assigned to each teacher. Results of the independent t tests revealed academic achievement was significantly greater for groups that utilized formative assessment. No significant difference in attitudes was noted. Recommendations include incorporating formative assessment results with the summative results. Findings from this study could contribute to positive social change by prompting educational stakeholders to examine local and state policies on curriculum as well as funding based on summative scores alone. Use of formative assessment can lead to improved academic success.

  9. The feasibility of a multi-format Web-based assessment of physicians' communication skills.

    Science.gov (United States)

    Kim, Sara; Brock, Douglas M; Hess, Brian J; Holmboe, Eric S; Gallagher, Thomas H; Lipner, Rebecca S; Mazor, Kathleen M

    2011-09-01

    Little is known about the best approaches and formats for measuring physicians' communication skills in an online environment. This study examines the reliability and validity of scores from two Web-based communication skill assessment formats. We created two online communication skill assessment formats: (a) MCQ (multiple-choice questions) consisting of video-based multiple-choice questions; (b) multi-format including video-based multiple-choice questions with rationales, Likert-type scales, and free text responses of what physicians would say to a patient. We randomized 100 general internists to each test format. Peer and patient ratings collected via the American Board of Internal Medicine (ABIM) served as validity sources. Seventy-seven internists completed the tests (MCQ: 38; multi-format: 39). The adjusted reliability was 0.74 for both formats. Excellent communicators, as based on their peer and patient ratings, performed slightly better on both tests than adequate communicators, though this difference was not statistically significant. Physicians in both groups rated the test format as innovative (4.2 out of 5.0). The acceptable reliability and participants' overall positive experiences point to the value of ongoing research into rigorous Web-based communication skills assessment. With efficient and reliable scoring, the Web offers an important way to measure and potentially enhance physicians' communication skills. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  10. Use of UV-C radiation to disinfect non-critical patient care items: a laboratory assessment of the Nanoclave Cabinet

    Directory of Open Access Journals (Sweden)

    Moore Ginny

    2012-08-01

    Background: The near-patient environment is often heavily contaminated, yet the decontamination of near-patient surfaces and equipment is often poor. The Nanoclave Cabinet produces large amounts of ultraviolet-C (UV-C) radiation (53 W/m²) and is designed to rapidly disinfect individual items of clinical equipment. Controlled laboratory studies were conducted to assess its ability to eradicate a range of potential pathogens, including Clostridium difficile spores and Adenovirus, from different types of surface. Methods: Each test surface was inoculated with known levels of vegetative bacteria (10⁶ cfu/cm²), C. difficile spores (10²-10⁶ cfu/cm²) or Adenovirus (10⁹ viral genomes), placed in the Nanoclave Cabinet and exposed for up to 6 minutes to the UV-C light source. Survival of bacterial contaminants was determined via conventional cultivation techniques. Degradation of viral DNA was determined via PCR. Results were compared to the number of colonies or level of DNA recovered from non-exposed control surfaces. Experiments were repeated to incorporate organic soils and to compare the efficacy of the Nanoclave Cabinet to that of antimicrobial wipes. Results: After exposing 8 common non-critical patient care items to two 30-second UV-C irradiation cycles, bacterial numbers on 40 of 51 target sites were consistently reduced to below detectable levels (≥ 4.7 log₁₀ reduction). Bacterial load was reduced but still persisted on other sites. Objects that proved difficult to disinfect using the Nanoclave Cabinet (e.g. blood pressure cuff) were also difficult to disinfect using antimicrobial wipes. The efficacy of the Nanoclave Cabinet was not affected by the presence of organic soils. Clostridium difficile spores were more resistant to UV-C irradiation than vegetative bacteria. However, two 60-second irradiation cycles were sufficient to reduce the number of surface-associated spores from 10³ cfu/cm² to below detectable levels. A 3 log₁₀ reduction in

  11. Factoring handedness data: I. Item analysis.

    Science.gov (United States)

    Messinger, H B; Messinger, M I

    1995-12-01

    Recently in this journal, Peters and Murphy challenged the validity of factor analyses done on bimodal handedness data, suggesting instead that right- and left-handers be studied separately. But bimodality may be avoidable if attention is paid to Oldfield's questionnaire format and instructions for the subjects. Two characteristics appear crucial: a two-column LEFT-RIGHT format for the body of the instrument and what we call Oldfield's Admonition: not to indicate strong preference for a handedness item, such as writing, unless "... the preference is so strong that you would never try to use the other hand unless absolutely forced to...". Attaining unimodality of an item distribution would seem to overcome the objections of Peters and Murphy. In a 1984 survey in Boston we used Oldfield's ten-item questionnaire exactly as published. This produced unimodal item distributions. With reflection of the five-point item scale and a logarithmic transformation, we achieved a degree of normalization for the items. Two surveys elsewhere, based on Oldfield's 20-item list but with changes in the questionnaire format and the instructions, yielded markedly different item distributions, with peaks at each extreme and sometimes in the middle as well.

  12. Adaptive screening for depression--recalibration of an item bank for the assessment of depression in persons with mental and somatic diseases and evaluation in a simulated computer-adaptive test environment.

    Science.gov (United States)

    Forkmann, Thomas; Kroehne, Ulf; Wirtz, Markus; Norra, Christine; Baumeister, Harald; Gauggel, Siegfried; Elhan, Atilla Halil; Tennant, Alan; Boecker, Maren

    2013-11-01

    This study simulated computer-adaptive testing (CAT) based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to the CAT simulation, the ADIB was newly calibrated. Recalibration was performed in a sample of 161 patients treated for a depressive syndrome, 103 patients from cardiology, and 103 patients from otorhinolaryngology (mean age 44.1, SD=14.0; 44.7% female) and was cross-validated in a sample of 117 patients undergoing rehabilitation for cardiac diseases (mean age 58.4, SD=10.5; 24.8% women). Unidimensionality of the item bank was checked and a Rasch analysis was performed that evaluated local dependency (LD), differential item functioning (DIF), item fit and reliability. CAT simulation was conducted with the total sample and additional simulated data. Recalibration resulted in a strictly unidimensional item bank with 36 items, showing good Rasch model fit (item fit residuals ... LD). CAT simulation revealed that, on average, 13 items were necessary to estimate depression in the range of -2 to +2 logits when terminating at SE≤0.32, and 4 items when using SE≤0.50. Receiver Operating Characteristic analysis showed that θ estimates based on the CAT algorithm have good criterion validity with regard to depression diagnoses (area under the curve ≥ 0.78 for all cut-off criteria). The recalibration of the ADIB succeeded, and the simulation studies suggest that it has good screening performance in the samples investigated and that it may reasonably improve depression assessment. © 2013.
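
    The stopping rules mentioned above (terminate once SE≤0.32 or SE≤0.50) can be sketched as follows. This is only an illustration under a dichotomous Rasch model, which is a simplifying assumption; the abstract does not state the item format of the ADIB, and the function names and item difficulties below are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def standard_error(theta, difficulties):
    """Standard error of the trait estimate: 1 / sqrt(test information).

    Under the dichotomous Rasch model each administered item contributes
    information p * (1 - p) at the current trait estimate theta.
    """
    information = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        information += p * (1.0 - p)
    return 1.0 / math.sqrt(information)


def should_stop(theta, difficulties, se_threshold=0.32):
    """Return True once the estimate is precise enough to end the test."""
    return standard_error(theta, difficulties) <= se_threshold


# Hypothetical example: 16 administered items, all targeted exactly at theta.
administered = [0.0] * 16
se_now = standard_error(0.0, administered)
```

    With 16 perfectly targeted dichotomous items, the sketch gives a standard error of 0.5, so a test using the looser threshold would stop while one using 0.32 would continue. If the trait is scaled to unit variance, a standard error of 0.32 corresponds to a reliability of roughly 0.90 and 0.50 to roughly 0.75.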

  13. Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

    Science.gov (United States)

    Sueiro, Manuel J.; Abad, Francisco J.

    2011-01-01

    The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

  14. Item Modeling Concept Based on Multimedia Authoring

    Directory of Open Access Journals (Sweden)

    Janez Stergar

    2008-09-01

    This paper introduces a modern item design framework for computer-based assessment built on the Flash authoring environment. It discusses question design and the multimedia authoring environment used for item modeling. Item type templates are a structured means of collecting and storing item information that can improve the efficiency and security of the innovative item design process. Templates can modernize item design and enhance and speed up the development process. Along with content creation, multimedia has vast potential for use in innovative testing. The item design template introduced here is based on a taxonomy of innovative items, which have great potential for expanding the content areas and construct coverage of an assessment. The item design approach uses two GUIs: one for question design based on the implemented item design templates, and one for tracking and retrieving user interaction. The paper discusses the concept of user interfaces based on Flash technology and the implementation of the item design forms with multimedia authoring. It also introduces a method for storing and retrieving user interaction that uses PHP to extend Flash capabilities in the proposed framework.

  15. Disposal of radioactive waste in evaporite formations - a review of published radiological assessments and their relevance to the UK

    International Nuclear Information System (INIS)

    Lawson, G.

    1983-11-01

    Radiological assessments of the disposal of radioactive waste in evaporite formations, principally halite, have been reviewed. These assessments were carried out in the USA, the Netherlands, Denmark and West Germany. The general nature of evaporite formations in the UK is discussed and comments are given on the broad relevance of the assessments to the potential disposal of radioactive waste in UK evaporite formations. (author)

  16. Formative Assessment in Confucian Heritage Culture Classrooms: Activity Theory Analysis of Tensions, Contradictions and Hybrid Practices

    Science.gov (United States)

    Thanh Pham, Thi Hong; Renshaw, Peter

    2015-01-01

    Formative assessment has recently become a preferred assessment strategy in educational institutions worldwide. However, it is not easy to implement in Asian classrooms, because local cultures and institutional constraints potentially hinder the practice. This one-semester study aimed to use the "third space", as the core of the third…

  17. A Needs Assessment, Development, and Formative Evaluation of a Health Promotion Smartphone Application for College Students

    Science.gov (United States)

    Miller, Tiffany; Chandler, Laura; Mouttapa, Michele

    2015-01-01

    Background: Approximately half of college students who completed the National College Health Assessment 2013 indicated a greater need for health-related information. University-based smartphone applications may help students better access this information. Purpose: This study describes the needs assessment, development, and formative evaluation of…

  18. Teachers' Use of Learning Progression-Based Formative Assessment in Water Instruction

    Science.gov (United States)

    Covitt, Beth A.; Gunckel, Kristin L.; Caplan, Bess; Syswerda, Sara

    2018-01-01

    While learning progressions (LPs) hold promise as instructional tools, researchers are still in the early stages of understanding how teachers use LPs in formative assessment practices. We report on a study that assessed teachers' proficiency in using a LP for student ideas about hydrologic systems. Research questions were: (a) what were teachers'…

  19. Implementing Summative Assessment with a Formative Flavour: A Case Study in a Large Class

    Science.gov (United States)

    Broadbent, Jaclyn; Panadero, Ernesto; Boud, David

    2018-01-01

    Teaching a large class can present real challenges in design, management and standardisation of assessment practices. One of the main dilemmas for university teachers is how to implement effective formative assessment practices with accompanying high-quality feedback consistently over time with large classroom groups. This article reports on how…

  20. Teacher Agency within the Context of Formative Teacher Assessment: An In-Depth Analysis

    Science.gov (United States)

    Verberg, Christel P. M.; Tigelaar, Dineke E. H.; van Veen, Klaas; Verloop, Nico

    2016-01-01

    Teachers' agency has an effect on their own learning process at the workplace. In this study we explored the extent to which teachers participating in a formative teacher assessment procedure developed a sense of agency. We investigated not only whether teachers participating in such an assessment procedure experienced agency and thus felt in…

  1. The effect of regulation feedback in a computer-based formative assessment on information problem solving

    NARCIS (Netherlands)

    Timmers, Caroline; Walraven, Amber; Veldkamp, Bernard P.

    2015-01-01

    This study examines the effect of regulation feedback in a computer-based formative assessment in the context of searching for information online. Fifty 13-year-old students completed two randomly selected assessment tasks, receiving automated regulation feedback between them. Student performance

  2. Quality assessment of observational studies in a drug-safety systematic review, comparison of two tools: the Newcastle–Ottawa Scale and the RTI item bank

    Directory of Open Access Journals (Sweden)

    Margulis AV

    2014-10-01

    Andrea V Margulis¹, Manel Pladevall¹, Nuria Riera-Guardia¹, Cristina Varas-Lorenzo¹, Lorna Hazell²,³, Nancy D Berkman⁴, Meera Viswanathan⁴, Susana Perez-Gutthann¹. ¹RTI Health Solutions, Barcelona, Spain; ²Drug Safety Research Unit, Southampton, UK; ³Associate Department of the School of Pharmacy and Biomedical Sciences, University of Portsmouth, Portsmouth, UK; ⁴RTI International, Research Triangle Park, NC, USA. Background: The study objective was to compare the Newcastle–Ottawa Scale (NOS) and the RTI item bank (RTI-IB) and estimate interrater agreement using the RTI-IB within a systematic review on the cardiovascular safety of glucose-lowering drugs. Methods: We tailored both tools and added four questions to the RTI-IB. Two reviewers assessed the quality of the 44 included studies with both tools (independently for the RTI-IB) and agreed on which responses conveyed low, unclear, or high risk of bias. For each question in the RTI-IB (n=31), the observed interrater agreement was calculated as the percentage of studies given the same bias assessment by both reviewers; chance-adjusted interrater agreement was estimated with the first-order agreement coefficient (AC1) statistic. Results: The NOS required less tailoring and was easier to use than the RTI-IB, but the RTI-IB produced a more thorough assessment. The RTI-IB includes most of the domains measured in the NOS. Median observed interrater agreement for the RTI-IB was 75% (25th percentile [p25] = 61%; p75 = 89%); median AC1 statistic was 0.64 (p25 = 0.51; p75 = 0.86). Conclusion: The RTI-IB facilitates a more complete quality assessment than the NOS but is more burdensome. The observed agreement and AC1 statistic in this study were higher than those reported by the RTI-IB's developers. Keywords: systematic review, meta-analysis, quality assessment, AC1
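
    The two agreement measures described in the Methods can be sketched for a single question rated by two reviewers. This is a minimal illustration, not the authors' code: the function names and the toy ratings are hypothetical, and the chance-agreement term follows the usual two-rater form of Gwet's AC1, which is assumed here to be the statistic the abstract refers to.

```python
from collections import Counter


def observed_agreement(rater1, rater2):
    """Share of studies given the same rating by both reviewers."""
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return matches / len(rater1)


def gwet_ac1(rater1, rater2, categories):
    """Chance-adjusted agreement (AC1) for two raters.

    Chance agreement is sum(pi_k * (1 - pi_k)) / (q - 1), where pi_k is the
    average proportion of ratings falling in category k across both raters
    and q is the number of categories.
    """
    n = len(rater1)
    q = len(categories)
    pa = observed_agreement(rater1, rater2)
    counts = Counter(rater1) + Counter(rater2)
    pe = sum(
        (counts[c] / (2 * n)) * (1 - counts[c] / (2 * n)) for c in categories
    ) / (q - 1)
    return (pa - pe) / (1 - pe)


# Hypothetical ratings of four studies on one risk-of-bias question.
CATEGORIES = ["low", "unclear", "high"]
reviewer_a = ["low", "low", "high", "unclear"]
reviewer_b = ["low", "high", "high", "unclear"]

pa = observed_agreement(reviewer_a, reviewer_b)
ac1 = gwet_ac1(reviewer_a, reviewer_b, CATEGORIES)
```

    In a review like the one described, these two numbers would be computed once per question and then summarized across questions, for example by the median and the 25th and 75th percentiles.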

  3. Performance assessment of an alpha waste deposit in a clay formation

    International Nuclear Information System (INIS)

    Quercia, F.; D'Alessandro, M.; Saltelli, A.

    1987-01-01

    The probabilistic code LISA (Long term Isolation Safety Assessment) has been used to assess the risk related to the disposal of alpha waste in a geological formation. The code has been modified to take into account waste form properties and leaching processes pertinent to alpha waste produced at fuel reprocessing plants. The exercise refers to a repository in a deep clay formation located at Harwell (U.K.), where some hydrogeological data were available. Radionuclide migration through repository and geological barriers has been simulated together with biosphere contamination. Results of the assessment are presented as dose rate (or risk) distributions; a sensitivity analysis on input parameters has been performed.

  4. Formative assessment as a vehicle for changing classroom practice in a specific cultural context

    Science.gov (United States)

    Chen, Jingping

    2015-09-01

    In this commentary, I interpret Xinying Yin and Gayle Ann Buck's collaborative action research from a social-cultural perspective. Classroom implementation of formative assessment is viewed as interaction between this assessment method and the local learning culture. I first identify Yin and Buck's definition of the formative assessment, and then analyze the role of formative assessment in the change of local learning culture. Based on the practice of Yin and Buck I emphasize the significance of their "bottom up" strategy to the teachers' epistemological change. I believe that this strategy may provide practicable solutions to current Chinese educational problems as well as a means for science educators to shift toward systematic professional development.

  5. Student-generated reading questions: diagnosing student thinking with diverse formative assessments.

    Science.gov (United States)

    Offerdahl, Erika G; Montplaisir, Lisa

    2014-01-01

    Formative assessment has long been identified as a critical element to teaching for conceptual development in science. It is therefore important for university instructors to have an arsenal of formative assessment tools at their disposal which enable them to effectively uncover and diagnose all students' thinking, not just the most vocal or assertive. We illustrate the utility of one type of formative assessment prompt (reading question assignment) in producing high-quality evidence of student thinking (student-generated reading questions). Specifically, we characterized student assessment data using three distinct analytic frames to exemplify their effectiveness in diagnosing student learning in relationship to three sample learning outcomes. Our data will be useful for university faculty, particularly those engaged in teaching upper-level biochemistry courses and their prerequisites, as they provide an alternative mechanism for uncovering and diagnosing student understanding. © 2013 by The International Union of Biochemistry and Molecular Biology.

  6. On the Effect of Online Formative Assessment on Iranian Lower Intermediate EFL Learners Reading Comprehension

    Directory of Open Access Journals (Sweden)

    Farideh Peyghambarian

    2015-03-01

    Online Formative Assessment (OFA) improves EFL students' reading comprehension, enabling them to perform better in reading comprehension tests. To support this claim, a quasi-experimental study was conducted in Mashhad, Iran. Forty-eight female lower intermediate EFL students took part and were assigned to control and treatment groups. Participants in both groups received a formative assessment program lasting 10 sessions. Formative assessment in the treatment group was conducted by the site itself, whereas participants in the control group were assessed by the teacher. Participants in the treatment group significantly outperformed those in the control group. This finding indicates that OFA is an effective learning tool in EFL reading comprehension classrooms.

  7. Análise de Teoria de Resposta ao Item de um instrumento breve de avaliação de comportamentos antissociais = Item Response Theory Analysis of a brief instrument for assessing antisocial behaviors

    Directory of Open Access Journals (Sweden)

    Hauck Filho, Nelson

    2014-01-01

    Antisocial behaviors are common to several psychopathological conditions, including personality disorders (e.g., antisocial and narcissistic) and mood disorders (e.g., bipolar disorder). Until now, however, there had been an important gap in the Brazilian context regarding the brief assessment of antisocial behaviors in adults from non-prison settings. The present study therefore aimed to construct a brief instrument for use in research and screening in the general adult population and to analyze it using Item Response Theory. Analyses of the responses of 204 university students (mean age = 23.56 years; SD = 7.70; 60.6% women) to a set of items allowed 13 items with excellent psychometric properties to be retained. These items proved to assess a general factor of antisociality, interpretable as a propensity toward antagonism, non-cooperation, and aggression across a variety of social contexts. Limitations of the study are discussed at the end.

  8. Teaching English as a Language not Subject by Employing Formative Assessment

    Directory of Open Access Journals (Sweden)

    Muhammad Tufail Chandio

    2015-12-01

    English is a second language (L2) in Sindh, Pakistan. Most public sector schools in Sindh teach English as a subject rather than as a language. Moreover, they do not distinguish between generic pedagogy and the distinctive approaches used for teaching English as a first language (L1) and as a second language (L2). In addition, traditional assessment erroneously focuses only on writing and reading skills, so the listening and speaking skills of L2 remain excluded. There is great emphasis on summative assessments, which contribute to a qualification, whereas formative assessments, which provide timely and continuous appraisal and feedback, remain ignored. Summative assessment employs only paper-and-pencil tests, while current alternative assessments such as self-assessment, peer assessment, and portfolio assessment have not yet been incorporated or explored. This study addresses these issues: teaching English as a subject rather than as a language, employing summative rather than formative assessment, depending on paper-and-pencil tests, and not using alternative modes of assessment. It suggests that current approaches to teaching English are misplaced because they take a subject-teaching approach rather than a language-teaching approach. It also argues for a paradigm shift from a product to a process approach to assessment by administering modern alternative assessments.

  9. The Development and Evaluation of an Online Formative Assessment upon Single-Player Game in E-Learning Environment

    Science.gov (United States)

    Tsai, Fu-Hsing

    2013-01-01

    This study developed a game-based formative assessment, called tic-tac-toe quiz for single-player version (TRIS-Q-SP), in an energy education e-learning system. This assessment game combined tic-tac-toe with online assessment, and revised the rule of tic-tac-toe for stimulating students to use online formative assessment actively. Additionally, to…

  10. Screening for depression and assessing change in severity of depression. Is the Geriatric Depression Scale (30-, 15- and 8-item versions) useful for both purposes in nursing home patients?

    NARCIS (Netherlands)

    Smalbrugge, M.; Jongenelis, L.; Pot, A.M.; Eefsting, J.A.; Beekman, A.T.F.

    2008-01-01

    The objectives of this study were to determine the ability of the 30-, 15- and 8-item versions of the GDS for screening and assessing change in severity of depression in nursing home patients. The GDS and the MADRS were administered to 350 elderly NH-patients by trained interviewers. The presence of

  11. The contribution of formative assessment and self-efficacy to inquiry learning

    DEFF Research Database (Denmark)

    Dolin, Jens; Evans, Robert Harry

    2013-01-01

    This chapter suggests the use of formative assessment in inquiry lessons as a helpful source of positive personal capacity beliefs for both teachers and students. The challenge most commonly experienced when first using inquiry learning methods is that pupils and even teachers become uncertain of their abilities to use inquiry and ‘give up’ on it. With the use of formative assessment combined with conscious efforts to increase self-efficacy among students, teachers can help provide students with the confidence and motivation to engage in inquiry methods. Such student engagement can in turn affirm teachers’ inquiry teaching efforts and raise the likelihood that they will continue to improve them. We see inquiry methods as the motor for changing teacher practice and formative assessment methods combined with capacity beliefs as the fuel that keeps the motor running. The central position of the chapter is how...

  12. Effects of formative assessments to develop self-regulation among sixth grade students: Results from a randomized controlled intervention

    NARCIS (Netherlands)

    Meusen-Beekman, Kelly; Joosten-ten Brinke, Desirée; Boshuizen, Els

    2018-01-01

    This article presents the results of a formative assessment intervention in writing assignments in sixth grade. We examined whether formative assessments would improve self-regulation, motivation and self-efficacy among sixth graders, and whether differential effects exist between formative

  13. Team Objective Structured Bedside Assessment (TOSBA) as formative assessment in undergraduate Obstetrics and Gynaecology: a cohort study.

    LENUS (Irish Health Repository)

    Deane, Richard P

    2015-10-09

    Team Objective Structured Bedside Assessment (TOSBA) is a learning approach in which a team of medical students undertake a set of structured clinical tasks with real patients in order to reach a diagnosis and formulate a management plan, and receive immediate feedback on their performance from a facilitator. TOSBA was introduced as formative assessment to an 8-week undergraduate teaching programme in Obstetrics and Gynaecology (O&G) in 2013/14. Each student completed 5 TOSBA sessions during the rotation. The aim of the study was to evaluate TOSBA as a teaching method to provide formative assessment for medical students during their clinical rotation. The research questions were: Does TOSBA improve clinical, communication and/or reasoning skills? Does TOSBA provide quality feedback?

  14. Assessing normative cut points through differential item functioning analysis: An example from the adaptation of the Middlesex Elderly Assessment of Mental State (MEAMS) for use as a cognitive screening test in Turkey

    Directory of Open Access Journals (Sweden)

    Kutlay Sehim

    2006-03-01

    Full Text Available Abstract. Background: The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for the Turkish adult population. Methods: After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Results: Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. Conclusion: The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.

  15. Assessing normative cut points through differential item functioning analysis: an example from the adaptation of the Middlesex Elderly Assessment of Mental State (MEAMS) for use as a cognitive screening test in Turkey.

    Science.gov (United States)

    Tennant, Alan; Küçükdeveci, Ayse A; Kutlay, Sehim; Elhan, Atilla H

    2006-03-23

    The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for Turkish adult population. After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.
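    The Rasch and DIF terminology used in this record can be illustrated with a minimal sketch. This is not the authors' analysis: the function names and every number below are hypothetical, and the code only shows the dichotomous Rasch response function and the idea that DIF appears as a difference in an item's difficulty between groups of equal ability.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of passing an item under the dichotomous Rasch model.

    Both arguments are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def dif_contrast(difficulty_group_a, difficulty_group_b):
    """Difference in an item's estimated difficulty between two groups.

    A contrast near zero suggests no DIF for that item; how large a
    contrast counts as meaningful is a separate analytic decision.
    """
    return difficulty_group_a - difficulty_group_b


# Hypothetical values, chosen only for illustration.
ability = 0.5
difficulty_low_education = 0.8   # item looks harder in this group
difficulty_high_education = 0.2  # same item looks easier in this group

p_low = rasch_probability(ability, difficulty_low_education)
p_high = rasch_probability(ability, difficulty_high_education)
contrast = dif_contrast(difficulty_low_education, difficulty_high_education)
print(f"P(pass), low education:  {p_low:.3f}")
print(f"P(pass), high education: {p_high:.3f}")
print(f"DIF contrast (logits):   {contrast:.2f}")
```

    In this toy case two people with the same ability have different pass probabilities on the same item, which is the pattern a DIF analysis is designed to flag.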

  16. EFFECT OF FEEDBACK IN FORMATIVE ASSESSMENT IN THE STUDENT LEARNING ACTIVITIES ON CHEMICAL COURSE TO THE FORMATION OF HABITS OF MIND

    Directory of Open Access Journals (Sweden)

    Nahadi -

    2015-04-01

    Full Text Available The aim of this study was to determine the impact of feedback in formative assessment on students' learning activity and learning outcomes in chemistry. The study used a quasi-experimental method with a non-equivalent control group design. The results showed that applying feedback in formative assessment has a positive impact on students' learning activity: students became more enthusiastic, motivated, and active in the learning process. The study therefore concludes that feedback in formative assessment has a positive impact on learning activity and helps form habits of mind.

  17. Funcionamento diferencial de itens para avaliar a agressividade de universitários / Differential items functioning to assess aggressiveness in college students

    Directory of Open Access Journals (Sweden)

    Fermino Fernandes Sisto

    2008-01-01

    Full Text Available In this research, evidence of construct validity was sought by analyzing whether the items of an aggressiveness instrument function differentially by sex. The participants were 445 college students of both genders, attending courses in Engineering, Computing and Psychology. The 81-item aggressiveness scale was administered collectively, in the classroom, to the students who consented to participate in the study. The items of the instrument were analyzed by means of the Rasch model. Twenty-eight items showed differential functioning: 15 behaviours were more characteristic of females and 13 of males. The reliability coefficients were 0.99 for the items and 0.86 for the persons. It was concluded that aggressiveness can be measured separately on the basis of gender.

  18. Linking a Learning Progression for Natural Selection to Teachers' Enactment of Formative Assessment

    Science.gov (United States)

    Furtak, Erin Marie

    2012-01-01

    Learning progressions, or representations of how student ideas develop in a domain, hold promise as tools to support teachers' formative assessment practices. The ideas represented in a learning progression might help teachers to identify and make inferences about evidence collected of student thinking, necessary precursors to modifying…

  19. Information Literacy Follow-Through: Enhancing Preservice Teachers' Information Evaluation Skills through Formative Assessment

    Science.gov (United States)

    Seely, Sara Robertson; Fry, Sara Winstead; Ruppel, Margie

    2011-01-01

    An investigation into preservice teachers' information evaluation skills at a large university suggests that formative assessment can improve student achievement. Preservice teachers were asked to apply information evaluation skills in the areas of currency, relevancy, authority, accuracy, and purpose. The study used quantitative methods to assess…

  20. Formative Assessment and the Intuitive Incorporation of Research-Based Instruction Techniques

    Science.gov (United States)

    Kuiper, Paula; VanOeffelen, Rachel; Veldkamp, Simon; Bokma, Isaac; Breems, Luke; Fynewever, Herb

    2015-01-01

    Using Max Weber's theory of ideal types, the authors classify the formative assessment techniques used by 12 college instructors. Their data reveal two pairs of opposing preferences: (1) highly preplanned vs. highly emergent and (2) focused on individual students vs. focused on the class as a whole. Using interview data, they illustrate how each…

  1. Class Size Reduction or Rapid Formative Assessment?: A Comparison of Cost-Effectiveness

    Science.gov (United States)

    Yeh, Stuart S.

    2009-01-01

    The cost-effectiveness of class size reduction (CSR) was compared with the cost-effectiveness of rapid formative assessment, a promising alternative for raising student achievement. Drawing upon existing meta-analyses of the effects of student-teacher ratio, evaluations of CSR in Tennessee, California, and Wisconsin, and RAND cost estimates, CSR…

  2. Promoting Prospective Elementary Teachers' Learning to Use Formative Assessment for Life Science Instruction

    Science.gov (United States)

    Sabel, Jaime L.; Forbes, Cory T.; Zangori, Laura

    2015-06-01

    To support elementary students' learning of core, standards-based life science concepts highlighted in the Next Generation Science Standards, prospective elementary teachers should develop an understanding of life science concepts and learn to apply their content knowledge in instructional practice to craft elementary science learning environments grounded in students' thinking. To do so, teachers must learn to use high-leverage instructional practices, such as formative assessment, to engage students in scientific practices and connect instruction to students' ideas. However, teachers may not understand formative assessment or possess sufficient science content knowledge to effectively engage in related instructional practices. To address these needs, we developed and conducted research within an innovative course for preservice elementary teachers built upon two pillars—life science concepts and formative assessment. An embedded mixed methods study was used to evaluate the effect of the intervention on preservice teachers' (n = 49) content knowledge and ability to engage in formative assessment practices for science. Findings showed that increased life content knowledge over the semester helped preservice teachers engage more productively in anticipating and evaluating students' ideas, but not in identifying effective instructional strategies to respond to those ideas.

  3. Why Video Games Can Be a Good Fit for Formative Assessment

    Science.gov (United States)

    Bauer, Malcolm; Wylie, Caroline; Jackson, Tanner; Mislevy, Bob; Hoffman-John, Erin; John, Michael; Corrigan, Seth

    2017-01-01

    This paper explores the relation between formative assessment principles and their analogues in video games that game designers have been developing over the past 35 years. We identify important parallels between the two that should enable effective and efficient use of well-designed video games in the classroom as part of an overall learning…

  4. What's Stalling Learning? Using a Formative Assessment Tool to Address Critical Incidents in Class

    Science.gov (United States)

    Hessler, H. Brooke; Taggart, Amy Rupiper

    2011-01-01

    We report on the use of Brookfield's (1995) formative assessment tool, the "Critical Incident Questionnaire" (CIQ) to help students and teachers identify and discuss key factors affecting learning. We offer insight into two major areas: (1) testing and adapting the existing tool to improve teaching and learning; and (2) identifying…

  5. A First Step Towards Synthesising Rubrics and Video for the Formative Assessment of Complex Skills

    NARCIS (Netherlands)

    Ackermans, Kevin; Rusman, Ellen; Brand-Gruwel, Saskia; Specht, Marcus

    2016-01-01

    Abstract. The performance objectives used for the formative assessment of complex skills are generally set through text-based analytic rubrics[1]. Moreover, video modeling examples are a widely applied method of observational learning, providing students with context-rich modeling examples of

  6. Enhancing Formative Assessment Practice and Encouraging Middle School Mathematics Engagement and Persistence

    Science.gov (United States)

    Beesley, Andrea D.; Clark, Tedra F.; Dempsey, Kathleen; Tweed, Anne

    2018-01-01

    In the transition to middle school, and during the middle school years, students' motivation for mathematics tends to decline from what it was during elementary school. Formative assessment strategies in mathematics can help support motivation by building confidence for challenging tasks. In this study, the authors developed and piloted a…

  7. Think Pair Share with Formative Assessment for Junior High School Student

    Science.gov (United States)

    Pradana, O. R. Y.; Sujadi, I.; Pramudya, I.

    2017-09-01

    Geometry is a subject that relies on abstract thinking, so many students struggle to understand it well. The learning model therefore plays a crucial role in improving student achievement, and a poorly chosen learning model will cause difficulties for students. This study provides a quantitative account of the Think Pair Share learning model combined with formative assessment. It aims to test Think Pair Share with formative assessment on junior high school students, using a quantitative pretest-posttest design with a control group and an experimental group. An ANOVA test and a Scheffé test were used to analyse the effectiveness of this approach. The findings show that student achievement in geometry increased significantly with Think Pair Share using formative assessment, probably because this approach makes students more active during learning. The authors hope that Think Pair Share with formative assessment will prove useful to teachers and be applied by teachers around the world, especially for geometry.

  8. Motivational beliefs, student effort, and feedback behaviour in computer-based formative assessment

    NARCIS (Netherlands)

    Timmers, C.F.; Braber-van den Broek, J.; van den Berg, Stéphanie Martine

    2013-01-01

    Feedback can only be effective when students seek feedback and process it. This study examines the relations between students' motivational beliefs, effort invested in a computer-based formative assessment, and feedback behaviour. Feedback behaviour is represented by whether a student seeks feedback

  9. The effects of a digital formative assessment tool on spelling achievement : Results of a randomized experiment

    NARCIS (Netherlands)

    Faber, Janke M.; Visscher, Adrie J.

    2018-01-01

    In this study, a randomized experimental design was used to examine the effects of a digital formative assessment tool on spelling achievement of third grade students (eight- to nine-year-olds). The sample consisted of 30 experimental schools (n = 619) and 39 control schools (n = 986). Experimental

  10. Student Perceptions of Online Homework Use for Formative Assessment of Learning in Organic Chemistry

    Science.gov (United States)

    Richards-Babb, Michelle; Curtis, Reagan; Georgieva, Zornitsa; Penn, John H.

    2015-01-01

    Use of online homework as a formative assessment tool for organic chemistry coursework was examined. Student perceptions of online homework in terms of (i) its ranking relative to other course aspects, (ii) their learning of organic chemistry, and (iii) whether it improved their study habits and how students used it as a learning tool were…

  11. A Pedagogical Alliance for Trust, Wellbeing and the Identification of Errors for Learning and Formative Assessment

    Science.gov (United States)

    Leighton, Jacqueline P.; Bustos Gómez, María Clara

    2018-01-01

    Formative assessments and feedback are vital to enhancing learning outcomes but require that learners feel at ease identifying their errors, and receiving feedback from a trusted source--teachers. An experimental test of a new theoretical framework was conducted to cultivate a pedagogical alliance to enhance students' (a) trust in the teacher, (b)…

  12. SUPPORTING TEACHERS IN IMPLEMENTING FORMATIVE ASSESSMENT PRACTICES IN EARTH SYSTEMS SCIENCE

    Science.gov (United States)

    Harris, C. J.; Penuel, W. R.; Haydel Debarger, A.; Blank, J. G.

    2009-12-01

    An important purpose of formative assessment is to elicit student thinking to use in instruction to help all students learn and inform next steps in teaching. However, formative assessment practices are difficult to implement and thus present a formidable challenge for many science teachers. A critical need in geoscience education is a framework for providing teachers with real-time assessment tools as well as professional development to learn how to use formative assessment to improve instruction. Here, we describe a comprehensive support system, developed for our NSF-funded Contingent Pedagogies project, for addressing the challenge of helping teachers to use formative assessment to enhance student learning in middle school Earth Systems science. Our support system is designed to improve student understanding about the geosphere by integrating classroom network technology, interactive formative assessments, and contingent curricular activities to guide teachers from formative assessment to instructional decision-making and improved student learning. To accomplish this, we are using a new classroom network technology, Group Scribbles, in the context of an innovative middle-grades Earth Science curriculum called Investigating Earth Systems (IES). Group Scribbles, developed at SRI International, is a collaborative software tool that allows individual students to compose “scribbles” (i.e., drawings and notes), on “post-it” notes in a private workspace (a notebook computer) in response to a public task. They can post these notes anonymously to a shared, public workspace (a teacher-controlled large screen monitor) that becomes the centerpiece of group and class discussion. To help teachers implement formative assessment practices, we have introduced a key resource, called a teaching routine, to help teachers take advantage of Group Scribbles for more interactive assessments. Routine refers to a sequence of repeatable interactions that, over time, become

  13. How to start with technology-enhanced formative assessment of 21st century skills in your classroom(s)?

    NARCIS (Netherlands)

    Rusman, Ellen; Martínez-Monés, Alejandra; Tasouris, Christodoulos; Economou, Anastasia

    2014-01-01

    Workshop participants will learn to: Understand the reasons behind the shift from assessment of learning to assessment for learning; Differentiate between the objectives of formative and summative assessment; Distinguish between different formative eAssessment methods; Understand the benefits

  14. The Influence of an Internet-Based Formative Assessment Tool on Primary Grades Students' Number Sense Achievement

    Science.gov (United States)

    Polly, Drew; Wang, Chuang; Martin, Christie; Lambert, Richard G.; Pugalee, David K.; Middleton, Catharina Win

    2017-01-01

    This study examined primary grades students' achievement on number sense tasks administered through an Internet-based formative assessment tool, Assessing Math Concepts Anywhere. Data were analyzed from 2,357 students in teachers' classrooms who had participated in a year-long professional development program on mathematics formative assessment,…

  15. Negative affect impairs associative memory but not item memory.

    OpenAIRE

    Bisby, J. A.; Burgess, N.

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 ...

  16. Risk assessment of PCDD/Fs levels in human tissues related to major food items based on chemical analyses and micro-EROD assay.

    Science.gov (United States)

    Tsang, H L; Wu, S C; Wong, C K C; Leung, C K M; Tao, S; Wong, M H

    2009-10-01

    Nine groups of food items (freshwater fish, marine fish, pork, chicken, chicken eggs, leafy and non-leafy vegetables, rice and flour) and three types of human samples (human milk, maternal serum and cord serum) were collected for the analysis of PCDD/Fs. Results of chemical analysis revealed PCDD/Fs concentrations (pg g⁻¹ fat) in the following ascending order: pork (0.289), grass carp (Ctenopharyngodon idellus) (freshwater fish) (0.407), golden thread (Nemipterus virgatus) (marine fish) (0.511), chicken (0.529), mandarin fish (Siniperca kneri) (marine fish) (0.535), chicken egg (0.552), and snubnose pompano (Trachinotus blochii) (marine fish) (1.219). The results of micro-EROD assay showed relatively higher PCDD/Fs levels in fish (2.65 pg g⁻¹ fat) when compared with pork (0.47), eggs (0.33), chicken (0.13), flour (0.07), vegetables (0.05 pg g⁻¹ wet wt) and rice (0.05). The estimated average daily intake of PCDD/Fs of 3.51 pg EROD-TEQ/kg bw/day was within the range of WHO Tolerable Daily Intake (1-4 pg WHO-TEQ/kg bw/day) and was higher than the Provisional Tolerable Daily Intake (PMTL) (70 pg for dioxins and dioxin-like PCBs) recommended by the Joint FAO/WHO Expert Committee on Food Additives (JECFA) [Joint FAO/WHO Expert Committee on Food Additives (JECFA), Summary and conclusions of the fifty-seventh meeting, JECFA, 2001.]. Nevertheless, the current findings were significantly lower than the TDI (14 pg WHO-TEQ/kg bw/day) recommended by the Scientific Committee on Food of the European Commission [European Scientific Committee on Food (EU SCF), Opinions on the SCF on the risk assessment of dioxins and dioxin-like PCBs in food, 2000.]. However, it should be noted that the micro-EROD assay overestimates PCDD/Fs levels 2- to 7-fold, which may amplify the estimated levels accordingly. Although the levels of PCDD/Fs obtained from the micro-EROD assay were 2- to 7-fold higher than those obtained by chemical analysis, it provides a cost-effective and

  17. Analysis of Nonequivalent Assessments across Different Linguistic Groups Using a Mixed Methods Approach: Understanding the Causes of Differential Item Functioning by Cognitive Interviewing

    Science.gov (United States)

    Benítez, Isabel; Padilla, José-Luis

    2014-01-01

    Differential item functioning (DIF) can undermine the validity of cross-lingual comparisons. While a lot of efficient statistics for detecting DIF are available, few general findings have been found to explain DIF results. The objective of the article was to study DIF sources by using a mixed method design. The design involves a quantitative phase…

  18. Item response modeling: A psychometric assessment of the children's fruit, vegetable, water, and physical activity self-efficacy scales among Chinese children

    Science.gov (United States)

    This study aimed to evaluate the psychometric properties of four self-efficacy scales (i.e., self-efficacy for fruit (FSE), vegetable (VSE), and water (WSE) intakes, and physical activity (PASE)) and to investigate their differences in item functioning across sex, age, and body weight status groups ...

  19. Quality of life assessed with the medical outcomes study short form 36-item health survey of patients on renal replacement therapy: A systematic review and meta-analysis

    NARCIS (Netherlands)

    Y.S. Liem (Ylian Serina); J.L. Bosch (Johanna); L.R. Arends (Lidia); M.H. Heijenbrok-Kal (Majanka); M.G.M. Hunink (Myriam)

    2007-01-01

    Objectives: The Medical Outcomes Study Short Form 36-Item Health Survey (SF-36) is the most widely used generic instrument to estimate quality of life of patients on renal replacement therapy. The purpose of this study was to summarize and compare the published literature on quality of

  20. Understanding The Impact of Formative Assessment Strategies on First Year University Students’ Conceptual Understanding of Chemical Concepts

    OpenAIRE

    Mehmet Aydeniz; Aybuke Pabuccu

    2011-01-01

    This study investigated the effects of formative assessment strategies on students’ conceptual understanding in a freshman college chemistry course in Turkey. Our sample consists of 96 students: 27 males and 69 females. Formative assessment strategies such as reflection on exams and collective problem-solving sessions were used throughout the course. Data were collected through a pre- and post-test methodology. The findings reveal that the formative assessment strategies used in this study led...

  1. An adaptive community-based participatory approach to formative assessment with high schools for obesity intervention.

    Science.gov (United States)

    Kong, Alberta S; Farnsworth, Seth; Canaca, Jose A; Harris, Amanda; Palley, Gabriel; Sussman, Andrew L

    2012-03-01

    In the emerging debate around obesity intervention in schools, recent calls have been made for researchers to include local community opinions in the design of interventions. Community-based participatory research (CBPR) is an effective approach for forming community partnerships and integrating local opinions. We used CBPR principles to conduct formative research in identifying acceptable and potentially sustainable obesity intervention strategies in 8 New Mexico school communities. We collected formative data from 8 high schools on areas of community interest for school health improvement through collaboration with local School Health Advisory Councils (SHACs) and interviews with students and parents. A survey based on formative results was created to assess acceptability of specific intervention strategies and was provided to SHACs. Quantitative data were analyzed using descriptive statistics while qualitative data were evaluated using an iterative analytic process for thematic identification. Key themes identified through the formative process included lack of healthy food options, infrequent curricular/extracurricular physical activity opportunities, and inadequate exposure to health/nutritional information. Key strategies identified as most acceptable by SHAC members included healthier food options and preparation, a healthy foods marketing campaign, yearly taste tests, an after-school noncompetitive physical activity program, and community linkages to physical activity opportunities. An adaptive CBPR approach for formative assessment can be used to identify obesity intervention strategies that address community school health concerns. Eight high school SHACs identified 6 school-based strategies to address parental and student concerns related to obesity. © 2012, American School Health Association.

  2. A report on the piloting of a novel computer-based medical case simulation for teaching and formative assessment of diagnostic laboratory testing

    Directory of Open Access Journals (Sweden)

    Clarence D. Kreiter

    2011-01-01

    Full Text Available Objectives: Insufficient attention has been given to how information from computer-based clinical case simulations is presented, collected, and scored. Research is needed on how best to design such simulations to acquire valid performance assessment data that can act as useful feedback for educational applications. This report describes a study of a new simulation format with design features aimed at improving both its formative assessment feedback and educational function. Methods: Case simulation software (LabCAPS) was developed to target a highly focused and well-defined measurement goal with a response format that allowed objective scoring. Data from an eight-case computer-based performance assessment administered in a pilot study to 13 second-year medical students were analyzed using classical test theory and generalizability analysis. In addition, a similar analysis was conducted on an administration in a less controlled setting, but to a much larger sample (n=143), within a clinical course that utilized two random case subsets from a library of 18 cases. Results: Classical test theory case-level item analysis of the pilot assessment yielded an average case discrimination of 0.37, and all eight cases were positively discriminating (range=0.11–0.56). Classical test theory coefficient alpha and the decision study showed the eight-case performance assessment to have an observed reliability of α=G=0.70. The decision study further demonstrated that a G=0.80 could be attained with approximately 3 h and 15 min of testing. The less-controlled educational application within a large medical class produced a somewhat lower reliability for eight cases (G=0.53). Students gave high ratings to the logic of the simulation interface, its educational value, and to the fidelity of the tasks. Conclusions: LabCAPS software shows the potential to provide formative assessment of medical students’ skill at diagnostic test ordering and to provide valid feedback to
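    The step from G=0.70 with eight cases to a target of G=0.80 can be approximated with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' decision study: it assumes a single-facet persons-by-cases design, in which the projection reduces to the Spearman-Brown formula, and the function names are hypothetical.

```python
def projected_reliability(g_observed, n_observed, n_projected):
    """Spearman-Brown projection of reliability to a different test length.

    Assumes one random facet (cases) and interchangeable cases.
    """
    k = n_projected / n_observed
    return k * g_observed / (1.0 + (k - 1.0) * g_observed)


def cases_needed(g_observed, n_observed, g_target):
    """Number of cases needed to reach g_target under the same assumption."""
    k = g_target * (1.0 - g_observed) / (g_observed * (1.0 - g_target))
    return k * n_observed


# Values taken from the abstract: G = 0.70 with eight cases, target G = 0.80.
n_required = cases_needed(0.70, 8, 0.80)
print(f"Cases needed for G = 0.80: {n_required:.1f}")
print(f"Check, projected G: {projected_reliability(0.70, 8, n_required):.2f}")
```

    Under these assumptions the projection gives about 13.7 cases, so roughly 14 cases would be needed. How that maps onto the testing time quoted in the abstract depends on the time per case, which the abstract does not state.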

  3. FORMATIVE ASSESSMENT IN A POSTGRADUATE TRAINING PROGRAM– DOES THE MODEL WORK?

    Directory of Open Access Journals (Sweden)

    Sonali Sarkar

    2013-09-01

    Full Text Available Background: Lack of assessment and feedback based on observation is one of the most serious deficiencies in current medical education practice. Formative assessment strategies in postgraduate education can be effective when they are integral to the learning process. Seminars and journal club presentations are integral to postgraduate education in all medical institutions. Methods: This study was done to assess a structured tool for evaluation of seminars and journal clubs by postgraduates in Community Medicine (as part of formative assessment) based on rater reliability and efficacy of feedback. Results: The scale, which has five domains (justification for the topic or the journal article, presentation skills, slide preparation, slide content, and discussion), had high inter-rater reliability, with an intraclass correlation coefficient of 0.861 (95% CI 0.632 to 0.958, p < 0.001). There was a significant improvement of the students over three journal club presentations in four out of five domains. Conclusions: This study has shown that use of rating scales during seminar and journal club presentations, when combined with feedback, can be an effective tool in formative assessment, thereby supporting and enhancing the learning process.

  4. Differential effects of two types of formative assessment in predicting performance of first-year medical students.

    Science.gov (United States)

    Krasne, Sally; Wimmers, Paul F; Relan, Anju; Drake, Thomas A

    2006-05-01

    Formative assessments are systematically designed instructional interventions to assess and provide feedback on students' strengths and weaknesses in the course of teaching and learning. Despite their known benefits to student attitudes and learning, medical school curricula have been slow to integrate such assessments into the curriculum. This study investigates how performance on two different modes of formative assessment relate to each other and to performance on summative assessments in an integrated, medical-school environment. Two types of formative assessment were administered to 146 first-year medical students each week over 8 weeks: a timed, closed-book component to assess factual recall and image recognition, and an un-timed, open-book component to assess higher order reasoning including the ability to identify and access appropriate resources and to integrate and apply knowledge. Analogous summative assessments were administered in the ninth week. Models relating formative and summative assessment performance were tested using Structural Equation Modeling. Two latent variables underlying achievement on formative and summative assessments could be identified; a "formative-assessment factor" and a "summative-assessment factor," with the former predicting the latter. A latent variable underlying achievement on open-book formative assessments was highly predictive of achievement on both open- and closed-book summative assessments, whereas a latent variable underlying closed-book assessments only predicted performance on the closed-book summative assessment. Formative assessments can be used as effective predictive tools of summative performance in medical school. Open-book, un-timed assessments of higher order processes appeared to be better predictors of overall summative performance than closed-book, timed assessments of factual recall and image recognition.

  5. An Investigation of the Programme for International Student Assessment 2012 in Terms of Formative Assessment Use: Turkey Example

    OpenAIRE

    Tavşancıl, Ezel; Altıntaş, Özge; Ayan, Cansu

    2017-01-01

    The purpose of this research is to determine whether student oriented teaching, experience oriented teaching, teacher support and the class size predict the usage of formative assessment in mathematics. This study is designed as a predictive research that falls in the correlational survey model, one of the general survey models. The sample of the study consists of PISA 2012 Turkey data (4848 students). The data were obtained from the students and school questionnaires used within the scope of...

  6. Development of Web-Based Formative Assessment Model to Enhance Physics Concepts of Students

    Directory of Open Access Journals (Sweden)

    Ediyanto Ediyanto

    2015-03-01

    Full Text Available Pengembangan Model Penilaian Formatif Berbasis Web untuk Meningkatkan Pemahaman Konsep Fisika Siswa. Abstract: There are two approaches to learning assessment: formative and summative. Formative assessment is appropriate because it involves students directly in the learning process and may improve their conceptual understanding. Limited class time makes this process difficult, so online and offline formative assessment models that provide responsive feedback for teachers and students are needed. The goal of this research is to produce a model of web-based formative assessment for physics. The study used a research and development design for the formative assessment model. Questionnaires were used for product validation, covering the textbook, the pre- and post-learning quiz instruments and the web product. The quantitative analysis shows that the developed product is valid without any revision. Based on qualitative data, the product was revised following comments and suggestions from expert validators, teachers and students. Product testing shows that the formative assessment model may improve students’ conceptual comprehension. Key Words: formative assessment model, students’ conceptual comprehension of physics, web-based

  7. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    Science.gov (United States)

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  8. Does the Order of Item Difficulty of the Addenbrooke's Cognitive Examination Add Anything to Subdomain Scores in the Clinical Assessment of Dementia?

    Science.gov (United States)

    McGrory, Sarah; Starr, John M; Shenkin, Susan D; Austin, Elizabeth J; Hodges, John R

    2015-01-01

    The Addenbrooke's Cognitive Examination (ACE) is used to measure cognition across a range of domains in dementia. Identifying the order in which cognitive decline occurs across items, and whether this varies between dementia aetiologies could add more information to subdomain scores. ACE-Revised data from 350 patients were split into three groups: Alzheimer's type (n = 131), predominantly frontal (n = 119) and other frontotemporal lobe degenerative disorders (n = 100). Results of factor analysis and Mokken scaling analysis were compared. Principal component analysis revealed one factor for each group. Confirmatory factor analysis found that the one-factor model fit two samples poorly. Mokken analyses revealed different item ordering in terms of difficulty for each group. The different patterns for each diagnostic group could aid in the separation of these different types of dementia.

  9. Does the Order of Item Difficulty of the Addenbrooke's Cognitive Examination Add Anything to Subdomain Scores in the Clinical Assessment of Dementia

    Directory of Open Access Journals (Sweden)

    Sarah McGrory

    2015-04-01

    Full Text Available Background: The Addenbrooke's Cognitive Examination (ACE) is used to measure cognition across a range of domains in dementia. Identifying the order in which cognitive decline occurs across items, and whether this varies between dementia aetiologies could add more information to subdomain scores. Method: ACE-Revised data from 350 patients were split into three groups: Alzheimer's type (n = 131), predominantly frontal (n = 119) and other frontotemporal lobe degenerative disorders (n = 100). Results of factor analysis and Mokken scaling analysis were compared. Results: Principal component analysis revealed one factor for each group. Confirmatory factor analysis found that the one-factor model fit two samples poorly. Mokken analyses revealed different item ordering in terms of difficulty for each group. Conclusion: The different patterns for each diagnostic group could aid in the separation of these different types of dementia.

  10. Diagnostic radiography students' perceptions of formative peer assessment within a radiographic technique module

    International Nuclear Information System (INIS)

    Elshami, W.; Abdalla, M.E.

    2017-01-01

    Introduction: Assessment is a central part of student learning. Student involvement in peer assessment leads to significant improvement in students' performance, supports students' learning, promotes the development of evaluation skills and encourages reflection. Aim: The aim of this study is to assess perceptions of the Formative Peer Assessment (FPA) initiative within a higher education setting for undergraduate radiography students. Methods: Qualitative action research was conducted. Students were allowed to anonymously assess each other's assignments using a standardized evaluation sheet that they had been trained to use. Participants' perceptions were assessed through focus group discussion. Results: The findings showed that students' experiences with peer assessment were positive. Students acknowledged that they received valuable feedback and learned from assessing their peers. Students recommended the need for training and suggested using more than one evaluator. Conclusion: The FPA initiative in the study institution is believed to have succeeded, as the students had a positive experience with the FPA. Students learnt from PA and from self-assessment. Implementation of PA will promote reflection, critical thinking and problem solving skills, which are important traits in the radiography graduate profile, as in radiography clinical practice professionals are required to modify imaging techniques and critique images to ensure the quality of care. - Highlights: • Participants had a positive experience with the Formative Peer Assessment (FPA). • Students believed that the FPA had a positive impact on their learning. • FPA was time-consuming but benefits outweigh the extra time commitment. • Comprehensive training and detailed grading rubric are recommended to improve FPA.

  11. Validating the 11-Item Revised University of California Los Angeles Scale to Assess Loneliness Among Older Adults: An Evaluation of Factor Structure and Other Measurement Properties.

    Science.gov (United States)

    Lee, Joonyup; Cagle, John G

    2017-11-01

    To examine the measurement properties and factor structure of the short version of the Revised University of California Los Angeles (R-UCLA) loneliness scale from the Health and Retirement Study (HRS). Based on data from 3,706 HRS participants aged 65 + who completed the 2012 wave of the HRS and its Psychosocial Supplement, the measurement properties and factorability of the R-UCLA were examined by conducting an exploratory factor analysis (EFA) and the confirmatory factor analysis (CFA) on randomly split halves. The average score for the 11-item loneliness scale was 16.4 (standard deviation: 4.5). An evaluation of the internal consistency produced a Cronbach's α of 0.87. Results from the EFA showed that two- and three-factor models were appropriate. However, based on the results of the CFA, only a two-factor model was determined to be suitable because there was a very high correlation between two factors identified in the three-factor model, available social connections and sense of belonging. This study provides important data on the properties of the 11-item R-UCLA scale by identifying a two-factor model of loneliness: feeling isolated and available social connections. Our findings suggest the 11-item R-UCLA has good factorability and internal reliability. Copyright © 2017 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
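
    The internal-consistency statistic reported in this record (Cronbach's α) can be illustrated with a minimal sketch. It uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy scores are assumptions made for illustration; they are not the HRS data or the authors' code.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total (sum) score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents answering 3 items.
toy_scores = [[1, 2, 2], [2, 2, 3], [3, 3, 3], [1, 1, 2]]
alpha = cronbach_alpha(toy_scores)
```

    Values closer to 1 indicate that the items vary together; perfectly parallel items give α = 1.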

  12. Formative assessment to develop oral communication competency using YouTube: self- and peer assessment in engineering

    Science.gov (United States)

    Nikolic, Sasha; Stirling, David; Ros, Montserrat

    2018-07-01

    Obtaining oral communication competency is an important skill for engineering students to prepare them for interacting and working in any professional setting. For engineers, it is also important to be able to present technical information to non-technical audiences. To ensure oral competency, a non-graded formative assessment approach using video with self- and peer assessment was introduced into a final-year engineering thesis course. A low workload approach was used due to growing student numbers and higher pressures on academic staff. A quasi-experimental design was used to investigate the differences between traditional delivery, self-assessment and combined self-assessment with peer feedback. The study found that the formative models were seen by students to help develop their presentation skills. However, the results showed no significant improvement compared to the traditional method. This could be due to previous presentation practice within the degree or, more probably, the lack of incentive for weaker students to engage and improve due to the ungraded nature of the activity.

  13. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    Science.gov (United States)

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.

  14. The formation and development of corporate culture of learning organization: efficiency assessment

    Directory of Open Access Journals (Sweden)

    T. O. Tolstykh

    2017-01-01

    Full Text Available In the modern conditions of digitalization of the economy and its integration with policy and society, questions of the formation and development of the corporate culture of a learning organization are of particular relevance. The digital transformation of business dictates the need for the emergence and development of learning organizations that create and preserve knowledge. In this situation, the open question of how to assess the efficiency of these formation and development processes defines the importance of the proposed research. Most scholars regard corporate culture as the most important internal resource of an organization, able to provide it with stability in a crisis and to give impetus to its development and transition to qualitatively different levels of the life cycle. This position assumes that a strong corporate culture should be aimed at building a learning organization that can quickly adapt to changes in the external and internal environment. This article examines the assessment of the efficiency of corporate culture; it shows that, in addition to empirical and sociological methods and a qualitative approach to evaluation, an investment approach is acceptable. This option arises when target-oriented and project management methods are used together, which allows the formation and development of corporate culture to be carried out systematically. The subject of assessment should be the program of activities for the formation and (or) development of the corporate culture of a learning organization. To support these conclusions, the economic efficiency of a program for forming the corporate culture of a learning organization is calculated using the example of agricultural companies. Net discounted income, the net present value of the project, the profitability index, project profitability and the payback period are calculated. This confirms the social and economic effects of the proposed program on the formation of corporate culture of independent

  15. Assessment of biofilm formation in device-associated clinical bacterial isolates in a tertiary level hospital

    Directory of Open Access Journals (Sweden)

    Summaiya A Mulla

    2011-01-01

    Full Text Available Background: Biofilm formation is a developmental process with intercellular signals that regulate growth. Biofilms contaminate catheters, ventilators, and medical implants; they act as a source of disease for humans, animals, and plants. Aim: In this study we have done quantitative assessment of biofilm formation in device-associated clinical bacterial isolates in response to various concentrations of glucose in tryptic soya broth and with different incubation time. Materials and Methods: The study was carried out on 100 positive bacteriological cultures of medical devices, which were inserted in hospitalized patients. The bacterial isolates were processed as per microtitre plate method with tryptic soya broth alone and with varying concentrations of glucose and were observed in response to time. Results: Majority of catheter cultures were positive. Out of the total 100 bacterial isolates tested, 88 of them were biofilm formers. Incubation period of 16-20 h was found to be optimum for biofilm development. Conclusions: Availability of nutrition in the form of glucose enhances the biofilm formation by bacteria. Biofilm formation depends on adherence of bacteria to various surfaces. Time and availability of glucose are important factors for assessment of biofilm progress.

  16. Analysis and assessment of the influence coaches’ formative profile has on young footballers

    Directory of Open Access Journals (Sweden)

    Susana Irazusta Adarraga,

    2012-07-01

    Full Text Available The aim of this research is to analyze and assess the formative profile of football coaches, based on Nicholls’ Goal Theory (1984), Bandura’s Self-efficacy Theory (1986) and Deci and Ryan’s Self Determination Theory (1985). We selected three coaches from the lower categories of Real Sociedad S.A.D. and 4 players (aged 15 to 19) from each of their teams. We selected the players according to the time they played in competition, to represent the footballers who play almost every minute, those who play around 75% of the minutes and those who play the fewest minutes (more or less 50%). At the end of the season, these players filled in the Perceived Formative Climate questionnaire, which involves four different variables (Motivational Climate, Trust in the players, Communication and Decisional style). The results show significant differences (p ≤ .05) in the players’ perception of the formative climate of their coaches. Specifically, we found these differences in four of the seven dimensions composing the formative climate (Individual Mastery Climate, Emotional Communication, Decisional Style and Reactive Communication). These results emphasize how important the coach’s criteria and the way he/she communicates with players are for formative quality and sports experience. Moreover, these conclusions also suggest that it is necessary to tackle it from a multidimensional perspective to be able to analyze it in depth and within the context

  17. ORIGINAL The Implementation of Continuous Assessment in Writing ...

    African Journals Online (AJOL)

    Key words: Continuous assessment, Formative assessment, Summative ... According to Teacher Education System .... research design involving both qualitative ..... Table 3: Students' Response to Items about Learning and Teaching-Load.

  18. Introducing regular formative assessment to enhance learning among dental students at Islamic International Dental College.

    Science.gov (United States)

    Riaz, Fatima; Yasmin, Shahina; Yasmin, Raheela

    2015-12-01

    To evaluate the effectiveness of Formative Assessment in enhancing learning among dental students, and to interpret the assessment from students' perspective in this regard. The experimental non-randomised controlled study was conducted from January to June 2013 at Islamic International Dental College, Islamabad, and comprised first year Bachelor of Dental Surgery students attending regular physiology lectures and tutorials. Summative assessments conducted at the end of each unit were included as pre-intervention tests. After one month's planning, the central nervous system unit was delivered in a month's trial with four formative assessment and feedback sessions (one per week). A Likert scale-based student feedback questionnaire was administered. Post-intervention summative assessment was done by Multiple Choice and Short Essay Questions. Data was analysed using SPSS 17. Out of 68 students, 64 (94.1%) agreed that a conducive environment was maintained and 62 (90%) agreed that such sessions should be continued throughout the year; 59 (87%) reflected that the feedback provided by the teacher was timely and positive and ensured equitable participation; 56 (82%) agreed that it enhanced their interest in the subject; 56 (68%) agreed that they were now more focussed; and 43 (63%) were of the opinion that they have progressed in the subject through these sessions. There was a highly significant improvement in the monthly post-intervention test scores compared to the pre-intervention test (p=0.000). Formative assessment sessions enhanced motivation and learning in first year dental students. Organised regular sessions with students' feedback may contribute to the development of pedagogic practice.

  19. Assessment of scale formation and corrosion of drinking water supplies in Ilam city (Iran)

    OpenAIRE

    Zabihollah Yousefi; Farzad Kazemi; Reza Ali Mohammadpour

    2016-01-01

    Background: Scaling and corrosion are the two most important indexes in water quality evaluation. Pollutants are released in water due to corrosion of pipelines. The aim of this study is to assess the scale formation and corrosion of drinking water supplies in Ilam city (Iran). Methods: This research is a descriptive and cross-sectional study which is based on the 20 drinking water sources in Ilam city. Experiments were carried out in accordance with the Water and Wastewater Co. ...

  20. Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

    Science.gov (United States)

    Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

    2015-08-19

    Four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implied a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. Graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30% of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistic [...] items suggesting little contribution of information to the scale and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p [...] Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms

  1. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    Science.gov (United States)

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  2. Developing young adolescents’ self-regulation by means of formative assessment: A theoretical perspective

    Directory of Open Access Journals (Sweden)

    Kelly D. Meusen-Beekman

    2015-12-01

    Full Text Available Fostering self-regulated learning (SRL) has become increasingly important at various educational levels. Most studies on SRL have been conducted in higher education. The present literature study aims toward understanding self-regulation processes of students in primary and secondary education. We explored the development of young students’ self-regulation from a theoretical perspective. In addition, effective characteristics for an intervention to develop young students’ self-regulation were examined, as well as the possibilities of implementing formative assessments in primary education to develop self-regulation. The results show that SRL can be supported in both primary and secondary education. However, at both school levels, differences were found, regarding the theoretical background of the training and the type of instructed strategy. Studies so far suggest avenues toward formative assessment, which seems to be a unifying theory of instruction that improves the learning process by developing self-regulation among students. But gaps in knowledge about the impact of formative assessments on the development of SRL strategies among primary school students require further exploration.

  3. A Model Formative Assessment Strategy to Promote Student-Centered Self-Regulated Learning in Higher Education

    Science.gov (United States)

    Bose, Jayakumar; Rengel, Zed

    2009-01-01

    Adult learners are already involved in the process of self-regulation; hence, higher education institutions should focus on strengthening students' self-regulatory skills. Self-regulation can be facilitated through formative assessment. This paper proposes a model formative assessment strategy that would complement existing university teaching,…

  4. "Complex Teaching Realities" and "Deep Rooted Cultural Traditions": Barriers to the Implementation and Internalisation of Formative Assessment in China

    Science.gov (United States)

    Poole, Adam; Adamson, Bob

    2016-01-01

    This article forms the first part of an Action Research project designed to incorporate formative assessment into the culture of learning of a bilingual school in Shanghai, China. It synthesises the empirical literature on formative assessment in China to establish some of the difficulties that teachers have faced in trying to incorporate this…

  5. Formative Self-Assessment College Classes Improves Self-Regulation and Retention in First/Second Year Community College Students

    Science.gov (United States)

    Mahlberg, Jamie

    2015-01-01

    This research examined the influence formative self-assessment had on first/second year community college student self-regulatory practices. Previous research has shown that the ability to regulate one's learning activities can improve performance in college classes, and it has long been known that the use of formative assessment improves…

  6. The effects of a digital formative assessment tool on mathematics achievement and student motivation : Results of a randomized experiment

    NARCIS (Netherlands)

    Faber, Janke; Luyten, Johannes W.; Visscher, Arend J.

    2017-01-01

    In this study a randomized experimental design was used to examine the effects of a digital formative assessment tool on mathematics achievement and motivation in grade three primary education (n schools = 79, n students = 1808). Experimental schools used a digital formative assessment tool whereas

  7. Saudi Internal Medicine Residents' Perceptions of the Objective Structured Clinical Examination as a Formative Assessment Tool

    Directory of Open Access Journals (Sweden)

    Salwa Alaidarous

    2016-12-01

    Full Text Available The Saudi Commission for Health Specialties first implemented the Objective Structured Clinical Examinations (OSCE) as part of the final year Internal Medicine clerkship exam during the 2007–2008 academic year. This study evaluated Internal Medicine residents' overall perceptions of the OSCE as a formative assessment tool. It focused on residents' perceptions of the OSCE stations' attributes, determined the acceptability of the process, and provided feedback to enhance further development of the assessment tool. The main objective was to assess Internal Medicine resident test-takers' perceptions and acceptance of the OSCE, and to identify its strengths and weaknesses through their feedback. Sixty-six residents were involved in the study, which was administered on November 8th 2012 at King Abdulaziz University Hospital in Jeddah, Kingdom of Saudi Arabia. Overall, residents' evaluation of the OSCE was favorable and encouraging. To this end, we recommend that formative assessment opportunities using the OSCE for providing feedback to students should be included in the curriculum, and continuing refinement and localized adaptation of OSCEs in use should be pursued by course directors and assessment personnel.

  8. Guide to good practices for the development of test items

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-01-01

    While the methodology used in developing test items can vary significantly, to ensure quality examinations, test items should be developed systematically. Test design and development is discussed in the DOE Guide to Good Practices for Design, Development, and Implementation of Examinations. This guide is intended to be a supplement by providing more detailed guidance on the development of specific test items. This guide addresses the development of written examination test items primarily. However, many of the concepts also apply to oral examinations, both in the classroom and on the job. This guide is intended to be used as guidance for the classroom and laboratory instructor or curriculum developer responsible for the construction of individual test items. This document focuses on written test items, but includes information relative to open-reference (open book) examination test items, as well. These test items have been categorized as short-answer, multiple-choice, or essay. Each test item format is described, examples are provided, and a procedure for development is included. The appendices provide examples for writing test items, a test item development form, and examples of various test item formats.

  9. The Possibilities and Limitations of Assessment for Learning: Exploring the Theory of Formative Assessment and the Notion of "Closing the Learning Gap"

    Science.gov (United States)

    Ninomiya, Shuichi

    2016-01-01

    Black and Wiliam (1998a, 1998b) demonstrate that formative assessment is one of the most effective strategies for promoting student learning. Since the publication of their reviews, formative assessment has gained increasing international prominence in both policy and practice. However, despite this early innovation, the theory and practice of…

  10. Using Formative Assessment to Facilitate Learner Self-Regulation: A Case Study of Assessment Practices and Student Perceptions in Hong Kong

    Science.gov (United States)

    Jing Jing, Ma

    2017-01-01

    One of the key aims of formative assessment in higher education is to enable students to become self-regulated learners (Nicol & Macfarlane-Dick, 2006). Based on Nicol and Macfarlane-Dick's (2006) framework, this exploratory study investigates which formative assessment practices proposed by them were used by one college EFL writing teacher to…

  11. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  12. Spare Items validation

    International Nuclear Information System (INIS)

    Fernandez Carratala, L.

    1998-01-01

    It is increasingly difficult to purchase safety-related spare items with manufacturer certifications that maintain the original qualification of the destination equipment. The main reasons are, on top of the logical evolution of technology applied to newly manufactured components, the closure of nuclear-specific production lines and the shift of manufacturers' quality systems, originally based on nuclear codes and standards, to conventional industry standards. To face this problem, different dedication processes have been implemented for many years to verify whether a commercial-grade element is acceptable for use in safety-related applications. In the same way, because of our particular position regarding spare part supplies, which come mainly from markets other than the American one, C.N. Trillo has developed a methodology called Spare Items Validation. This methodology, which is originally based on dedication processes, is not a single process but a group of coordinated processes involving engineering, quality and management activities. These are performed on the spare item itself, its design control, its fabrication and its supply, to allow its use in destinations with specific requirements. The scope of application is not limited to safety-related items; it also covers components that are complex in design, high in cost or related to plant reliability. The implementation at C.N. Trillo has been carried out mainly by merging, modifying and making the most of processes and activities that were already being performed in the company. (Author)

  13. Selecting Lower Priced Items.

    Science.gov (United States)

    Kleinert, Harold L.; And Others

    1988-01-01

    A program used to teach moderately to severely mentally handicapped students to select the lower priced items in actual shopping activities is described. Through a five-phase process, students are taught to compare prices themselves as well as take into consideration variations in the sizes of containers and varying product weights. (VW)

  14. The Role of Item Models in Automatic Item Generation

    Science.gov (United States)

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…
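    The idea of an item model as something "comparable to a template" can be pictured with a minimal Python sketch. Everything in it is a hypothetical illustration rather than Gierl and Lai's method: the stem text, the variable names, and the `generate_items` helper are assumptions, and the answer key is hard-coded for this one stem.

```python
from itertools import product

# Hypothetical item model: a stem with placeholders plus the values each
# placeholder may take. Not taken from the cited article.
ITEM_MODEL = {
    "stem": "A train travels {distance} km in {hours} hours. "
            "What is its average speed in km/h?",
    "variables": {"distance": [120, 180, 240], "hours": [2, 3]},
}

def generate_items(model):
    """Expand an item model into concrete items (stem text plus answer key)."""
    names = sorted(model["variables"])
    items = []
    for values in product(*(model["variables"][n] for n in names)):
        binding = dict(zip(names, values))
        items.append({
            "stem": model["stem"].format(**binding),
            # Key rule specific to this illustrative stem: speed = distance / time.
            "key": binding["distance"] / binding["hours"],
        })
    return items

items = generate_items(ITEM_MODEL)
```

    One template with three distances and two durations yields six concrete items, which is the sense in which a single item model can stand in for many hand-written items.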

  15. Item information and discrimination functions for trinary PCM items

    NARCIS (Netherlands)

    Akkermans, Wies; Muraki, Eiji

    1997-01-01

    For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 – δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are
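    The unimodal/bimodal threshold quoted above can be checked numerically. The sketch below assumes the standard partial credit model with three score categories and unit discrimination, for which item information equals the variance of the item score; the function names, grid limits, and grid resolution are illustrative choices, not part of the cited work.

```python
import math

def pcm_probs(theta, d1, d2):
    """Category probabilities (scores 0, 1, 2) of a trinary partial credit item."""
    numerators = [1.0, math.exp(theta - d1), math.exp(2 * theta - d1 - d2)]
    total = sum(numerators)
    return [n / total for n in numerators]

def item_information(theta, d1, d2):
    """For the PCM, item information equals the variance of the item score."""
    p = pcm_probs(theta, d1, d2)
    mean = p[1] + 2 * p[2]
    return p[1] + 4 * p[2] - mean ** 2

def count_modes(d1, d2, lo=-10.0, hi=10.0, steps=2000):
    """Count local maxima of the information function on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    info = [item_information(t, d1, d2) for t in grid]
    return sum(
        1
        for i in range(1, steps)
        if info[i] > info[i - 1] and info[i] >= info[i + 1]
    )

THRESHOLD = 4 * math.log(2)            # about 2.77
modes_below = count_modes(-1.0, 1.0)   # delta2 - delta1 = 2, below the threshold
modes_above = count_modes(-2.0, 2.0)   # delta2 - delta1 = 4, above the threshold
```

    With the step parameters 2 apart the grid search finds a single peak, and with them 4 apart it finds two, in line with the stated condition δ2 – δ1 < 4 ln 2.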

  16. Validation of the 36-item version of the WHO Disability Assessment Schedule 2.0 (WHODAS 2.0) for assessing women's disability and functioning associated with maternal morbidity.

    Science.gov (United States)

    Silveira, Carla; Parpinelli, Mary Angela; Pacagnella, Rodolfo Carvalho; Andreucci, Carla Betina; Angelini, Carina Robles; Ferreira, Elton Carlos; Cecatti, José Guilherme

    2017-02-01

    Objective: To validate the translation and adaptation to Brazilian Portuguese of 36 items from the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0), regarding their content and structure (construct), in a female population after pregnancy. Methods: This is a validation of an instrument for the evaluation of disability and functioning and an assessment of its psychometric properties, performed in a tertiary maternity and a referral center specialized in high-risk pregnancies in Brazil. A sample of 638 women in different postpartum periods who had either a normal or a complicated pregnancy was included. The structure was evaluated by exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), while the content and relationships among the domains were assessed through Pearson's correlation coefficient. The sociodemographic characteristics were identified, and the mean scores with their standard deviations for the 36 questions of the WHODAS 2.0 were calculated. The internal consistency was evaluated by Cronbach's α. Results: Cronbach's α was higher than 0.79 for both sets of questions of the questionnaire. The EFA and CFA for the main 32 questions exhibited a total variance of 54.7% (Kaiser-Meyer-Olkin [KMO] measure of sampling adequacy = 0.934; p < 0.001) and 53.47% (KMO = 0.934; p < 0.001), respectively. There was a significant correlation among the 6 domains (r = 0.571-0.876), and a moderate correlation among all domains (r = 0.476-0.694). Conclusion: The version of the WHODAS 2.0 instrument adapted to Brazilian Portuguese showed good psychometric properties in this sample, and therefore could be applied to populations of women regarding their reproductive history. Thieme-Revinter Publicações Ltda Rio de Janeiro, Brazil.
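    For readers unfamiliar with the internal-consistency statistic named above, a minimal Python sketch of the usual Cronbach's α formula follows: α = k/(k−1) · (1 − Σ item variances / variance of total scores). The scores are made up for illustration and are not WHODAS 2.0 data; the `cronbach_alpha` helper is an assumed name.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up scores for four respondents on three items (illustration only).
scores = [[1, 2, 2], [2, 3, 3], [3, 3, 4], [4, 5, 5]]
alpha = cronbach_alpha(scores)
```

    Values near 1 indicate that the items vary together across respondents; when every respondent gives the same score to all items, the formula returns exactly 1.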

  17. Use of Multi-Response Format Test in the Assessment of Medical Students’ Critical Thinking Ability

    Science.gov (United States)

    Mafinejad, Mahboobeh Khabaz; Monajemi, Alireza; Jalili, Mohammad; Soltani, Akbar; Rasouli, Javad

    2017-01-01

    Introduction: To evaluate students’ critical thinking skills effectively, a change in assessment practices is a must. The assessment of a student’s ability to think critically is a constant challenge, and there is considerable debate on the best assessment method. There is evidence that open and closed-ended response questions, by their intrinsic nature, measure separate cognitive abilities. Aim: To assess the critical thinking ability of medical students by using a multi-response format of assessment. Materials and Methods: A cross-sectional study was conducted on a group of 159 undergraduate third-year medical students. All the participants completed the California Critical Thinking Skills Test (CCTST), consisting of 34 multiple-choice questions to measure general critical thinking skills, and a researcher-developed test that combines open and closed-ended questions. The researcher-developed 48-question exam, consisting of 8 short-answer and 5 essay questions, 19 Multiple-Choice Questions (MCQ), and 16 True-False (TF) questions, was used to measure critical thinking skills. Correlation analyses were performed using Pearson’s coefficient to explore the association between the total scores of tests and subtests. Results: One hundred and fifty-nine students participated in this study. The sample comprised 81 females (51%) and 78 males (49%) with an age range of 20±2.8 years (mean 21.2 years). The response rate was 64.1%. A significant positive correlation was found between types of questions and critical thinking scores, of which the correlations of MCQ (r=0.82) and essay questions (r=0.77) were strongest. Significant positive correlations between the multi-response format test and the CCTST’s subscales were seen in analysis, evaluation, inference and inductive reasoning. Unlike its correlations with the CCTST subscales, the multi-response format test had a weak correlation with the CCTST total score (r=0.45, p=0.06). Conclusion: This study highlights the importance of considering multi-response format test in

  18. Negotiating the use of formative assessment for learning in an era of accountability testing

    Science.gov (United States)

    Yin, Xinying

    The purpose of this collaborative action research was to understand, from a critical perspective, how science educators can negotiate the tension between integrating formative assessment (FA) for students' learning and meeting the need for standardized summative assessment (testing). Using formative assessment in the era of accountability testing was a process in which the science educators identified the ways that the standardized testing system constrained the teacher's use of FA to improve students' learning, sought solutions to overcome the obstacles, and came to understand how FA can be utilized to neutralize the power relationship between the institutional requirement and classroom teaching and learning. The challenge of doing FA under the pressure of standardized testing lay mainly in two dimensions: one was the tension between the demand to teach all the desired standards-based content to all students in a limited amount of time and the sufficient time and flexibility that FA requires to improve students' understanding; the other was the difference in the levels of knowledge and forms of knowledge representation between FA and tests. The negotiation of doing FA for teaching standards and preparing students for tests entailed six aspects for the collaborative team: clarifying teaching objectives, reconstructing instructional activities, negotiating with time constraints, designing effective FA activities, attending to students' needs in doing FA, and modifying end-of-unit tests to better assess the learning goals. As the teacher's instructional goals evolved to focus more on conceptual understanding of standards and on more thorough understanding through fewer activities, she perceived doing FA for learning and preparing students for standardized tests as more congruent. By integrating both divergent and convergent FA into instruction, as well as modifying tests to be more aligned with standards, students' learning was enhanced and they were also being prepared for tests. This

  19. Use of Multi-Response Format Test in the Assessment of Medical Students' Critical Thinking Ability.

    Science.gov (United States)

    Mafinejad, Mahboobeh Khabaz; Arabshahi, Seyyed Kamran Soltani; Monajemi, Alireza; Jalili, Mohammad; Soltani, Akbar; Rasouli, Javad

    2017-09-01

    To evaluate students' critical thinking skills effectively, a change in assessment practices is a must. The assessment of a student's ability to think critically is a constant challenge, and there is considerable debate on the best assessment method. There is evidence that open and closed-ended response questions, by their intrinsic nature, measure separate cognitive abilities. The aim was to assess the critical thinking ability of medical students by using a multi-response format of assessment. A cross-sectional study was conducted on a group of 159 undergraduate third-year medical students. All the participants completed the California Critical Thinking Skills Test (CCTST), consisting of 34 multiple-choice questions to measure general critical thinking skills, and a researcher-developed test that combines open and closed-ended questions. The researcher-developed 48-question exam, consisting of 8 short-answer and 5 essay questions, 19 Multiple-Choice Questions (MCQ), and 16 True-False (TF) questions, was used to measure critical thinking skills. Correlation analyses were performed using Pearson's coefficient to explore the association between the total scores of tests and subtests. One hundred and fifty-nine students participated in this study. The sample comprised 81 females (51%) and 78 males (49%) with an age range of 20±2.8 years (mean 21.2 years). The response rate was 64.1%. A significant positive correlation was found between types of questions and critical thinking scores, of which the correlations of MCQ (r=0.82) and essay questions (r=0.77) were strongest. Significant positive correlations between the multi-response format test and the CCTST's subscales were seen in analysis, evaluation, inference and inductive reasoning. Unlike its correlations with the CCTST subscales, the multi-response format test had a weak correlation with the CCTST total score (r=0.45, p=0.06). This study highlights the importance of considering multi-response format test in the assessment of critical thinking abilities of medical

  20. Role of Systematic Formative Assessment on Students’ Views of Their Learning

    Directory of Open Access Journals (Sweden)

    Areiza Restrepo Hugo Nelson

    2013-10-01

    This article presents a partial report of a small qualitative research study that explored the students’ views of their learning during and after the implementation of formative procedures such as self-assessment, feedback, and conferences. The article also includes their perceptions about this implementation. The research was carried out with a group of students of English enrolled in an extension program of a Colombian public university. The results showed that formative assessment helped these learners to be aware of their communicative competence and to perceive the situations in which they developed this awareness; it also enabled them to experience success in their learning. Also, learners identified the purposes of this kind of assessment and perceived formative assessment as a transparent procedure.

  1. Item Banking with Embedded Standards

    Science.gov (United States)

    MacCann, Robert G.; Stanley, Gordon

    2009-01-01

    An item banking method that does not use Item Response Theory (IRT) is described. This method provides a comparable grading system across schools that would be suitable for low-stakes testing. It uses the Angoff standard-setting method to obtain item ratings that are stored with each item. An example of such a grading system is given, showing how…
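    How Angoff ratings stored with each item could support comparable grading across schools can be sketched as follows. The abstract is truncated, so this is an assumption about the general idea and not MacCann and Stanley's procedure: the bank contents, the three judges' ratings, and the `cut_score` and `grade` helpers are all hypothetical. In the usual Angoff approach, each judge estimates the probability that a borderline student answers an item correctly, and the cut score of a test is the sum of the mean ratings of its items.

```python
from statistics import mean

# Hypothetical bank: each item stores the judges' Angoff ratings, i.e. the
# estimated probability that a borderline student answers it correctly.
ITEM_BANK = {
    "Q1": {"ratings": [0.8, 0.7, 0.9]},
    "Q2": {"ratings": [0.5, 0.6, 0.4]},
    "Q3": {"ratings": [0.3, 0.2, 0.4]},
}

def cut_score(item_ids, bank=ITEM_BANK):
    """Expected raw score of a borderline student on the chosen items."""
    return sum(mean(bank[i]["ratings"]) for i in item_ids)

def grade(raw_score, item_ids, bank=ITEM_BANK):
    """Label a raw score relative to the cut score of this particular test."""
    if raw_score >= cut_score(item_ids, bank):
        return "at or above standard"
    return "below standard"

# Two schools may assemble different tests from the same bank.
school_a_test = ["Q1", "Q2"]
school_b_test = ["Q2", "Q3"]
```

    Because the ratings travel with the items, the same raw score can fall below the standard on an easier test and meet it on a harder one, which is what makes grades comparable without an IRT calibration.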

  2. Assessing efficiency of formation of the bank’s system of financial controlling

    Directory of Open Access Journals (Sweden)

    Chmutova Irina N.

    2014-01-01

    The article offers a scientific and methodical approach to assessing the efficiency of formation of a bank’s system of financial controlling. The approach takes into account two components – assessment of efficiency of team work on introduction of financial controlling as an investment project. This would allow identification of the expediency of investing in the introduction project, taking into account not only the professional level of the team but also the psychological characteristics of each of its members. To determine how the assessment components correlate, the article builds a matrix that would serve as a basis for developing the necessary set of actions to increase the efficiency of the bank’s financial controlling.

  3. Risk Assessment of Geologic Formation Sequestration in The Rocky Mountain Region, USA

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Si-Yong; McPherson, Brian

    2013-08-01

    The purpose of this report is to describe the outcome of a targeted risk assessment of a candidate geologic sequestration site in the Rocky Mountain region of the USA. Specifically, a major goal of the probabilistic risk assessment was to quantify the possible spatiotemporal responses for Area of Review (AoR) and injection-induced pressure buildup associated with carbon dioxide (CO₂) injection into the subsurface. Because of the computational expense of a conventional Monte Carlo approach, especially given the likely uncertainties in model parameters, we applied a response surface method for probabilistic risk assessment of geologic CO₂ storage in the Permo-Penn Weber formation at a potential CCS site in Craig, Colorado. A site-specific aquifer model was built for the numerical simulation based on a regional geologic model.

  4. Development and Implementation of an Electronic Clinical Formative Assessment: Dental Faculty and Student Perspectives.

    Science.gov (United States)

    Kirkup, Michele L; Adams, Brooke N; Meadows, Melinda L; Jackson, Richard

    2016-06-01

    A traditional summative grading structure, used at Indiana University School of Dentistry (IUSD) for more than 30 years, was identified by faculty as outdated for assessing students' clinical performance. In an effort to change the status quo, a feedback-driven assessment was implemented in 2012 to provide a constructive assessment tool acceptable to both faculty and students. Building on the successful non-graded clinical evaluation employed at Baylor College of Dentistry, IUSD implemented a streamlined electronic formative feedback model (FFM) to assess students' daily clinical performance. An important addition to this evaluation tool was the inclusion of routine student self-assessment opportunities. The aim of this study was to determine faculty and student response to the new assessment instrument. Following training sessions, anonymous satisfaction surveys were examined for the three user groups: clinical faculty (60% response rate), third-year (D3) students (72% response rate), and fourth-year (D4) students (57% response rate). In the results, 70% of the responding faculty members preferred the FFM over the summative model; however, 61.8% of the D4 respondents preferred the summative model, reporting insufficient assessment time and low faculty participation. The two groups of students had different responses to the self-assessment component: 70.2% of the D4 respondents appreciated clinical self-assessment compared to 46% of the D3 respondents. Overall, while some components of the FFM assessment were well received, a phased approach to implementation may have facilitated a transition more acceptable to both faculty and students. Improvements are being made in an attempt to increase overall satisfaction.

  5. Vegetable parenting practices scale: Item response modeling analyses

    Science.gov (United States)

    Our objective was to evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We al...

  6. Item response theory - A first approach

    Science.gov (United States)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also prompted numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
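    The logistic models with one, two and three parameters mentioned above share one functional form, which a short Python sketch can make concrete. The parameter names follow common usage (a = discrimination, b = difficulty, c = pseudo-guessing) and are assumptions here, as are the function names; some texts also include a scaling constant D ≈ 1.7, which this sketch omits.

```python
import math

def irt_3pl(theta, a=1.0, b=0.0, c=0.0):
    """P(correct | theta) under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def irt_2pl(theta, a, b):
    """Two-parameter model: the 3PL with no guessing (c = 0)."""
    return irt_3pl(theta, a=a, b=b, c=0.0)

def irt_1pl(theta, b):
    """One-parameter model: the 2PL with a common discrimination (a = 1)."""
    return irt_3pl(theta, a=1.0, b=b, c=0.0)

# A respondent whose ability equals the item difficulty:
p_1pl = irt_1pl(0.0, b=0.0)
p_3pl = irt_3pl(0.0, a=1.5, b=0.0, c=0.2)
```

    When ability equals difficulty, the one- and two-parameter models give a success probability of one half, while a non-zero guessing parameter lifts it to (1 + c)/2.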

  7. SHIPPING OF RADIOACTIVE ITEMS

    CERN Multimedia

    TIS/RP Group

    2001-01-01

    The TIS-RP group informs users that shipping of small radioactive items is normally guaranteed within 24 hours from the time the material is handed in at the TIS-RP service. This time is imposed by the necessary procedures (identification of the radionuclides, determination of dose rate, preparation of the package and related paperwork). Large and massive objects require a longer procedure and will therefore take longer.

  8. ARCADO - Adding random case analysis to direct observation in workplace-based formative assessment of general practice registrars.

    Science.gov (United States)

    Ingham, Gerard; Fry, Jennifer; Morgan, Simon; Ward, Bernadette

    2015-12-10

    Workplace-based formative assessments using consultation observation are currently conducted during the Australian general practice training program. Assessment reliability is improved by using multiple assessment methods. The aim of this study was to explore experiences of general practice medical educator assessors and registrars (trainees) when adding random case analysis to direct observation (ARCADO) during formative workplace-based assessments. A sample of general practice medical educators and matched registrars were recruited. Following the ARCADO workplace assessment, semi-structured qualitative interviews were conducted. The data was analysed thematically. Ten registrars and eight medical educators participated. Four major themes emerged - formative versus summative assessment; strengths (acceptability, flexibility, time efficiency, complementarity and authenticity); weaknesses (reduced observation and integrity risks); and contextual factors (variation in assessment content, assessment timing, registrar-medical educator relationship, medical educator's approach and registrar ability). ARCADO is a well-accepted workplace-based formative assessment perceived by registrars and assessors to be valid and flexible. The use of ARCADO enabled complementary insights that would not have been achieved with direct observation alone. Whilst there are some contextual factors to be considered in its implementation, ARCADO appears to have utility as formative assessment and, subject to further evaluation, high-stakes assessment.

  9. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  10. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  11. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  12. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  13. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    Science.gov (United States)

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  14. Medical student web-based formative assessment tool for renal pathology

    Directory of Open Access Journals (Sweden)

    Vanesa Bijol

    2015-03-01

    Background: Web-based formative assessment tools have become widely recognized in medical education as valuable resources for self-directed learning. Objectives: To explore the educational value of formative assessment using online quizzes for kidney pathology learning in our renal pathophysiology course. Methods: Students were given unrestricted and optional access to quizzes. Performance on quizzed and non-quizzed materials of those who used (‘quizzers’) and did not use the tool (‘non-quizzers’) was compared. Frequency of tool usage was analyzed and satisfaction surveys were utilized at the end of the course. Results: In total, 82.6% of the students used quizzes. The greatest usage was observed on the day before the final exam. Students repeated interactive and more challenging quizzes more often. Average means between final exam scores for quizzed and unrelated materials were almost equal for ‘quizzers’ and ‘non-quizzers’, but ‘quizzers’ performed statistically better than ‘non-quizzers’ on both quizzed (p=0.001) and non-quizzed (p=0.024) topics. In total, 89% of surveyed students thought quizzes improved their learning experience in this course. Conclusions: Our new computer-assisted learning tool is popular, and although its use can predict the final exam outcome, it does not provide strong evidence for direct improvement in academic performance. Students who chose to use quizzes did well on all aspects of the final exam and most commonly used quizzes to practice for the final exam. Our efforts to revitalize the course material and promote learning by adding interactive online formative assessments improved students’ learning experience overall.

  15. Development of a lack of appetite item bank for computer-adaptive testing (CAT)

    DEFF Research Database (Denmark)

    Thamsborg, Lise Laurberg Holst; Petersen, Morten Aa; Aaronson, Neil K

    2015-01-01

    to 12 lack of appetite items. CONCLUSIONS: Phases 1-3 resulted in 12 lack of appetite candidate items. Based on a field testing (phase 4), the psychometric characteristics of the items will be assessed and the final item bank will be generated. This CAT item bank is expected to provide precise...

  16. Automated Item Generation with Recurrent Neural Networks.

    Science.gov (United States)

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language-free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.

  17. Direct rate assessment of laccase catalysed radical formation in lignin by electron paramagnetic resonance spectroscopy

    DEFF Research Database (Denmark)

    Munk, Line; Andersen, Mogens Larsen; Meyer, Anne S.

    2017-01-01

    Laccases (EC 1.10.3.2) catalyse removal of an electron and a proton from phenolic hydroxyl groups, including phenolic hydroxyls in lignins, to form phenoxy radicals during reduction of O2. We employed electron paramagnetic resonance spectroscopy (EPR) for real time measurement of such catalytic...... to suspensions of the individual lignin samples produced immediate time and enzyme dose dependent increases in intensity in the EPR signal with g-values in the range 2.0047–2.0050 allowing a direct quantitative monitoring of the radical formation and thus allowed laccase enzyme kinetics assessment on lignin...... for the radical formation rate in organosolv lignin was determined by response surface methodology to pH 4.8, 33 °C and pH 5.8, 33 °C for the Tv laccase and the Mt laccase, respectively. The results verify direct radical formation action of fungal laccases on lignin without addition of mediators and the EPR...

  18. Effects of a Formative Assessment Script on How Vocational Students Generate Formative Feedback to a Peer's or Their Own Performance

    Science.gov (United States)

    Peters, Olaf; Körndle, Hermann; Narciss, Susanne

    2018-01-01

    The purposes of this study are threefold: It investigates effects of a formative assessment script (FAS) that was designed to support vocational students in generating feedback to (1) a peer's and (2) their own performance. Effects of the FAS are investigated with respect to quantitative and qualitative characteristics of the peer and internal…

  19. The Enactment of Formative Assessment in English Language Classrooms in Two Chinese Universities: Teacher and Student Responses

    Science.gov (United States)

    Chen, Qiuxian; May, Lyn; Klenowski, Val; Kettle, Margaret

    2014-01-01

    The "College English Curriculum Requirements," announced by the Chinese Ministry of Education in 2007, recommended the inclusion of formative assessment into the existing summative assessment framework of College English. This policy had the potential to fundamentally change the nature of assessment and its role in the teaching and…

  20. Formative assessment in an online learning environment to support flexible on-the-job learning in complex professional domains

    NARCIS (Netherlands)

    Tamara van Gog; Desirée Joosten-ten Brinke; F. J. Prins; Dominique Sluijsmans

    2010-01-01

    This article describes a blueprint for an online learning environment that is based on prominent instructional design and assessment theories for supporting learning in complex domains. The core of this environment consists of formative assessment tasks (i.e., assessment for learning) that center on

  1. The importance of formative assessment in science and engineering ethics education: some evidence and practical advice.

    Science.gov (United States)

    Keefer, Matthew W; Wilson, Sara E; Dankowicz, Harry; Loui, Michael C

    2014-03-01

    Recent research in ethics education shows a potentially problematic variation in content, curricular materials, and instruction. While ethics instruction is now widespread, studies have identified significant variation in both the goals and methods of ethics education, leaving researchers to conclude that many approaches may be inappropriately paired with goals that are unachievable. This paper speaks to these concerns by demonstrating the importance of aligning classroom-based assessments to clear ethical learning objectives in order to help students and instructors track their progress toward meeting those objectives. Two studies at two different universities demonstrate the usefulness of classroom-based, formative assessments for improving the quality of students' case responses in computational modeling and research ethics.

  2. Providing Formative Assessment to Students Solving Multipath Engineering Problems with Complex Arrangements of Interacting Parts: An Intelligent Tutor Approach

    Science.gov (United States)

    Steif, Paul S.; Fu, Luoting; Kara, Levent Burak

    2016-01-01

    Problems faced by engineering students involve multiple pathways to solution. Students rarely receive effective formative feedback on handwritten homework. This paper examines the potential for computer-based formative assessment of student solutions to multipath engineering problems. In particular, an intelligent tutor approach is adopted and…

  3. Long-term risk assessment of radioactive waste disposal in geological formations

    International Nuclear Information System (INIS)

    Girardi, F.; Bertozzi, G.; D'Alessandro, M.

    1978-01-01

    Methods for long-term safety analysis of waste from nuclear power production in the European Community are under study at the Joint Research Centre (JRC) at Ispra, Italy. The aim of the work is to develop a suitable methodology for long-term risk assessment. The methodology under study is based on the assessment of the quantitative value of a system of barriers which may be interposed between waste and man. The barriers considered are: a) quality of the segregation afforded by the geological formation, b) chemical and physical stability of conditioned waste, c) interaction with geological environments (subsoil retention), d) distribution in the biosphere. The methodology is presently being applied to idealized test cases based on the following assumptions: waste is generated during 30 years of operations in a nuclear park (reprocessing + refabrication plant) capable of treating 1000 ton/yr of LWR fuel. High activity waste is conditioned as borosilicate glass (HAW) while low- and medium-level wastes are bituminized (BIP). All waste is disposed of in a salt formation. Transport to the biosphere, following the containment failure, occurs by groundwater, with no delay due to retention on adsorbing media. Distribution into the biosphere occurs according to the terrestrial model indicated. Under these assumptions, information was drawn concerning environmental contamination, its levels, contributing elements and pathways to man

  4. Demonstration of a performance assessment methodology for high-level radioactive waste disposal in basalt formations

    International Nuclear Information System (INIS)

    Bonano, E.J.; Davis, P.A.; Shipers, L.R.; Brinster, K.F.; Beyler, W.E.; Updegraff, C.D.; Shepherd, E.R.; Tilton, L.M.; Wahi, K.K.

    1989-06-01

    This document describes a performance assessment methodology developed for a high-level radioactive waste repository mined in deep basalt formations. This methodology is an extension of an earlier one applicable to bedded salt. The differences between the two methodologies arise primarily in the modeling of ground-water flow and radionuclide transport. Bedded salt was assumed to be a porous medium, whereas basalt formations contain fractured zones. Therefore, mathematical models and associated computer codes were developed to simulate the aforementioned phenomena in fractured media. The use of the methodology is demonstrated at a hypothetical basalt site by analyzing seven scenarios: (1) thermohydrological effects caused by heat released from the repository, (2) mechanohydrological effects caused by an advancing and receding glacier, (3) normal ground-water flow, (4) pumping of ground water from a confined aquifer, (5) rerouting of a river near the repository, (6) drilling of a borehole through the repository, and (7) formation of a new fault intersecting the repository. The normal ground-water flow was considered the base-case scenario. This scenario was used to perform uncertainty and sensitivity analyses and to demonstrate the existing capabilities for assessing compliance with the ground-water travel time criterion and the containment requirements. Most of the other scenarios were considered perturbations of the base case, and a few were studied in terms of changes with respect to initial conditions. The potential impact of these scenarios on the long-term performance of the disposal system was ascertained through comparison with the base-case scenario or the undisturbed initial conditions. 66 refs., 106 figs., 27 tabs

  5. Diet Quality of Items Advertised in Supermarket Sales Circulars Compared to Diets of the US Population, as Assessed by the Healthy Eating Index-2010.

    Science.gov (United States)

    Jahns, Lisa; Scheett, Angela J; Johnson, LuAnn K; Krebs-Smith, Susan M; Payne, Collin R; Whigham, Leah D; Hoverson, Bonita S; Kranz, Sibylle

    2016-01-01

    Supermarkets use sales circulars to highlight specific foods, usually at reduced prices. Resulting purchases help form the set of available foods within households from which individuals and families make choices about what to eat. The purposes of this study were to determine how closely foods featured in weekly supermarket sales circulars conform to dietary guidance and how diet quality compares with that of the US population's intakes. Food and beverage items (n=9,149) in 52 weekly sales circulars from a small Midwestern grocery chain in 2009 were coded to obtain food group and nutrient and energy content. Healthy Eating Index-2010 (HEI-2010) total and component scores were calculated using algorithms developed by the National Cancer Institute. HEI-2010 scores for the US population aged 2+ years were estimated using data from the 2009-2010 National Health and Nutrition Examination Survey. HEI-2010 scores of circulars and population intakes were compared using Student's t tests. Mean total (42.8 of 100) HEI-2010 scores of circulars were lower than that of the US population (55.4; P<...... diet quality. Supermarkets could support improvements in consumer diets by weekly featuring foods that are more in concordance with food and nutrient recommendations. Copyright © 2016 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.

  6. Negative Affect Impairs Associative Memory but Not Item Memory

    Science.gov (United States)

    Bisby, James A.; Burgess, Neil

    2014-01-01

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine…

  7. How employees perceive organizational learning: construct validation of the 25-item short form of the strategic learning assessment map (SF-SLAM)

    NARCIS (Netherlands)

    Mainert, Jakob; Niepel, Christoph; Lans, T.; Greiff, Samuel

    2018-01-01

    Purpose: The Strategic Learning Assessment Map (SLAM) originally assessed organizational learning (OL) at the level of the firm by addressing managers, who rated OL in the SLAM on five dimensions of individual learning, group learning, organizational learning, feed-forward learning, and feedback

  8. Uncertainty studies and risk assessment for CO2 storage in geological formations

    Energy Technology Data Exchange (ETDEWEB)

    Walter, Lena Sophie

    2013-07-01

    Carbon capture and storage (CCS) in deep geological formations is one possible option to mitigate the greenhouse gas effect by reducing CO2 emissions into the atmosphere. The assessment of the risks related to CO2 storage is an important task. Events such as CO2 leakage and brine displacement could result in hazards for human health and the environment. In this thesis, a systematic and comprehensive risk assessment concept is presented to investigate various levels of uncertainties and to assess risks using numerical simulations. Depending on the risk and the processes to be assessed, very complex models, large model domains, large time scales, and many simulation runs for estimating probabilities are required. To reduce the resulting high computational costs, a model reduction technique (the arbitrary polynomial chaos expansion) and a method for model coupling in space are applied. The different levels of uncertainties are: statistical uncertainty in parameter distributions, scenario uncertainty, e.g. different geological features, and recognized ignorance due to assumptions in the conceptual model set-up. Recognized ignorance and scenario uncertainty are investigated by simulating well-defined model set-ups and scenarios. According to damage values, which are defined as a model output, the set-ups and scenarios can be compared and ranked. For statistical uncertainty, probabilities can be determined by running Monte Carlo simulations with the reduced model. The results are presented in various ways: e.g., mean damage, probability density function, cumulative distribution function, or an overall risk value by multiplying the damage with the probability. If the model output (damage) cannot be compared to provided criteria (e.g. water quality criteria), analytical approximations are presented to translate the damage into comparable values. The overall concept is applied for the risks related to brine displacement and infiltration into

  9. Uncertainty studies and risk assessment for CO2 storage in geological formations

    International Nuclear Information System (INIS)

    Walter, Lena Sophie

    2013-01-01

    Carbon capture and storage (CCS) in deep geological formations is one possible option to mitigate the greenhouse gas effect by reducing CO2 emissions into the atmosphere. The assessment of the risks related to CO2 storage is an important task. Events such as CO2 leakage and brine displacement could result in hazards for human health and the environment. In this thesis, a systematic and comprehensive risk assessment concept is presented to investigate various levels of uncertainties and to assess risks using numerical simulations. Depending on the risk and the processes to be assessed, very complex models, large model domains, large time scales, and many simulation runs for estimating probabilities are required. To reduce the resulting high computational costs, a model reduction technique (the arbitrary polynomial chaos expansion) and a method for model coupling in space are applied. The different levels of uncertainties are: statistical uncertainty in parameter distributions, scenario uncertainty, e.g. different geological features, and recognized ignorance due to assumptions in the conceptual model set-up. Recognized ignorance and scenario uncertainty are investigated by simulating well-defined model set-ups and scenarios. According to damage values, which are defined as a model output, the set-ups and scenarios can be compared and ranked. For statistical uncertainty, probabilities can be determined by running Monte Carlo simulations with the reduced model. The results are presented in various ways: e.g., mean damage, probability density function, cumulative distribution function, or an overall risk value by multiplying the damage with the probability. If the model output (damage) cannot be compared to provided criteria (e.g. water quality criteria), analytical approximations are presented to translate the damage into comparable values. The overall concept is applied for the risks related to brine displacement and infiltration into drinking water
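
    The Monte Carlo step described in this abstract (sampling uncertain parameters, evaluating a reduced model, and summarizing damage as a mean, an exceedance probability, and a damage-times-probability risk value) might be sketched as follows. This is only an illustration under stated assumptions: the polynomial `surrogate_damage`, the standard-normal input, and the threshold are hypothetical stand-ins and are not taken from the thesis.

```python
import random


def surrogate_damage(x):
    """Hypothetical reduced model: a low-order polynomial response surface
    mapping one standardized uncertain parameter to a non-negative damage."""
    return max(0.0, 0.5 + 2.0 * x + 0.3 * x ** 2)


def monte_carlo_risk(n_samples, threshold, seed=0):
    """Sample the uncertain input, evaluate the cheap surrogate, and summarize.

    Returns (mean damage, probability of exceeding the threshold,
    risk = exceedance probability * mean damage of the exceeding runs).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    damages = [surrogate_damage(rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
    mean_damage = sum(damages) / n_samples
    exceeding = [d for d in damages if d > threshold]
    p_exceed = len(exceeding) / n_samples
    mean_exceeding = sum(exceeding) / len(exceeding) if exceeding else 0.0
    risk = p_exceed * mean_exceeding
    return mean_damage, p_exceed, risk


mean_damage, p_exceed, risk = monte_carlo_risk(n_samples=10_000, threshold=2.0)
print(mean_damage, p_exceed, risk)
```

    The surrogate is what makes many thousands of runs affordable; in practice it would be fitted to a small number of runs of the full simulation.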

  10. Item-focussed Trees for the Identification of Items in Differential Item Functioning.

    Science.gov (United States)

    Tutz, Gerhard; Berger, Moritz

    2016-09-01

    A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.

  11. Examination of the PROMIS upper extremity item bank.

    Science.gov (United States)

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  12. The development and discussion of computerized visual perception assessment tool for Chinese characters structures - Concurrent estimation of the overall ability and the domain ability in item response theory approach.

    Science.gov (United States)

    Wu, Huey-Min; Lin, Chin-Kai; Yang, Yu-Mao; Kuo, Bor-Chen

    2014-11-12

    Visual perception is the fundamental skill required for a child to recognize words, and to read and write. There was no visual perception assessment tool developed for preschool children based on Chinese characters in Taiwan. The purposes were to develop the computerized visual perception assessment tool for Chinese Characters Structures and to explore the psychometric characteristics of the assessment tool. This study adopted purposive sampling. The study evaluated 551 kindergarten-age children (293 boys, 258 girls) ranging from 46 to 81 months of age. The test instrument used in this study consisted of three subtests and 58 items, including tests of basic strokes, single-component characters, and compound characters. Based on the results of model fit analysis, the higher-order item response theory was used to estimate the performance in visual perception, basic strokes, single-component characters, and compound characters simultaneously. Analyses of variance were used to detect significant difference in age groups and gender groups. The difficulty of identifying items in a visual perception test ranged from -2 to 1. The visual perception ability of 4- to 6-year-old children ranged from -1.66 to 2.19. Gender did not have significant effects on performance. However, there were significant differences among the different age groups. The performance of 6-year-olds was better than that of 5-year-olds, which was better than that of 4-year-olds. This study obtained detailed diagnostic scores by using a higher-order item response theory model to understand the visual perception of basic strokes, single-component characters, and compound characters. Further statistical analysis showed that, for basic strokes and compound characters, girls performed better than did boys; there also were differences within each age group. For single-component characters, there was no difference in performance between boys and girls. However, again the performance of 6-year-olds was better than

  13. Designing a Culturally Appropriate Format of Formative Peer Assessment for Asian Students: The Case of Vietnamese Students

    Science.gov (United States)

    Thanh, Pham Thi Hong; Gillies, Robyn

    2010-01-01

    Peer assessment has recently been widely recommended in Vietnamese classrooms. However, there are argumentative opinions about this assessment because it has many conflicts with the learning culture of Vietnamese students. To date, there has not been any study addressing this issue. The present study investigated how Vietnamese students…

  14. Environmental Assessment: Geothermal Energy Geopressure Subprogram. Gulf Coast Well Testing Activity, Frio Formation, Texas and Louisiana

    Energy Technology Data Exchange (ETDEWEB)

    None

    1978-02-01

    This Environmental Assessment (EA) has been prepared to provide the environmental input into the Division of Geothermal Energy's decisions to expand the geothermal well testing activities to include sites in the Frio Formation of Texas and Louisiana. It is proposed that drilling rigs be leased before they are removed from sites in the formation where drilling for gas or oil exploration has been unsuccessful and that the rigs be used to complete the drilling into the geopressured zone for resource exploration. This EA addresses, on a regional basis, the expected activities, affected environment, and the possible impacts in a broad sense as they apply to the Gulf Coast well testing activity of the Geothermal Energy Geopressure Subprogram of the Department of Energy. Along the Texas and Louisiana Gulf Coast (Plate 1 and Overlay, Atlas) water at high temperatures and high pressures is trapped within Gulf basin sediments. The water is confined within or below essentially impermeable shale sequences and carries most or all of the overburden pressure. Such zones are referred to as geopressured strata. These fluids and sediments are heated to abnormally high temperatures (up to 260 °C) and may provide potential reservoirs for economical production of geothermal energy. The obvious need in resource development is to assess the resource. Ongoing studies to define large-sand-volume reservoirs will ultimately define optimum sites for drilling special large diameter wells to perform large volume flow production tests. In the interim, existing well tests need to be made to help define and assess the resource.

  15. Nursing students' evaluation of a new feedback and reflection tool for use in high-fidelity simulation - Formative assessment of clinical skills. A descriptive quantitative research design.

    Science.gov (United States)

    Solheim, Elisabeth; Plathe, Hilde Syvertsen; Eide, Hilde

    2017-11-01

    Clinical skills training is an important part of nurses' education programmes. Clinical skills are complex. A common understanding of what characterizes clinical skills and learning outcomes needs to be established. The aim of the study was to develop and evaluate a new reflection and feedback tool for formative assessment. The study has a descriptive quantitative design. 129 students participated who were at the end of the first year of a Bachelor degree in nursing. After high-fidelity simulation, data were collected using a questionnaire with 19 closed-ended and 2 open-ended questions. The tool stimulated peer assessment, and enabled students to be more thorough in what to assess as an observer in clinical skills. The tool provided a structure for self-assessment and made visible items that are important to be aware of in clinical skills. This article adds to simulation literature and provides a tool that is useful in enhancing peer learning, which is essential for nurses in practice. The tool has potential for enabling students to learn about reflection and developing skills for guiding others in practice after they have graduated. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Third-year medical students’ and clinical teachers’ perceptions of formative assessment feedback in the simulated clinical setting

    Directory of Open Access Journals (Sweden)

    Reina Abraham

    2016-05-01

    Full Text Available Background. Clinical skills training in the clinical skills laboratory (CSL) environment forms an important part of the undergraduate medical curriculum. These skills are better demonstrated than described. A lack of direct observation and feedback given to medical students performing these skills has been reported. Without feedback, errors are uncorrected, good performance is not reinforced and clinical competence is minimally achieved. Objectives. To explore the perceptions of 3rd-year medical students and their clinical teachers about formative clinical assessment feedback in the CSL setting. Methods. Questionnaires with open- and closed-ended questions were administered to 3rd-year medical students and their clinical skills teachers. Quantitative data were statistically analysed while qualitative data were thematically analysed. Results. Five clinical teachers and 183 medical students participated. Average scores for the items varied between 1.87 and 5.00 (1: negative to 5: positive). The majority of students reported that feedback informed them of their competence level and learning needs, and motivated them to improve their skills and participation in patient-centred learning activities. Teachers believed that they provided sufficient and balanced feedback. Some students were concerned about the lack of standardised and structured assessment criteria and variation in teacher feedback. No statistical difference (p<0.05) was found between the mean item ratings based on demographic and academic background. Conclusion. Most teachers and students were satisfied with the feedback given and received, respectively. Structured and balanced criterion-referenced feedback processes, together with feedback training workshops for staff and students, are recommended to enhance feedback practice quality in the CSL. Limited clinical staff in the CSL was noted as a concern.

  17. Item validity vs. item discrimination index: a redundancy?

    Science.gov (United States)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), each with a different formula. Meanwhile, other sources state that the item discrimination index can be obtained by calculating the correlation between the examinee’s score on a particular item and the examinee’s score on the overall test, which is actually the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and item discrimination index in the instrument analysis. It seems that these concepts might overlap, for both reflect the quality of the test in measuring the examinees’ ability. In this paper, examples of data-processing results for item validity and item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses where test analyses are included.
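
    The two quantities contrasted in this abstract can be illustrated with a minimal sketch. It assumes dichotomous (0/1) items, the common upper/lower 27% grouping for D, and an uncorrected Pearson item-total correlation for item validity; the response matrix is invented for illustration, and the paper's own formulas may differ.

```python
import statistics


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index D for a 0/1 item:
    proportion correct in the top group minus that in the bottom group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])  # stable sort
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between an item's scores and the total scores."""
    mean_i = statistics.fmean(item_scores)
    mean_t = statistics.fmean(total_scores)
    cov = sum((x - mean_i) * (y - mean_t)
              for x, y in zip(item_scores, total_scores))
    var_i = sum((x - mean_i) ** 2 for x in item_scores)
    var_t = sum((y - mean_t) ** 2 for y in total_scores)
    return cov / (var_i * var_t) ** 0.5


# Hypothetical responses: 10 examinees (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 0, 0], [1, 0, 1, 0],
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0],
]
totals = [sum(row) for row in responses]
item0 = [row[0] for row in responses]
print(discrimination_index(item0, totals), item_total_correlation(item0, totals))
```

    Both statistics grow when high scorers answer the item correctly more often than low scorers, which is the overlap the abstract questions; D uses only the extreme groups, whereas the correlation uses every examinee.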

  18. Interns reflect: the effect of formative assessment with feedback during pre-internship

    Directory of Open Access Journals (Sweden)

    McKenzie S

    2017-01-01

    Full Text Available Susan McKenzie,1 Annette Burgess,2 Craig Mellis1 1Central Clinical School, 2Education Office, Sydney Medical School, The University of Sydney, Sydney, NSW, Australia Background: It is widely known that the opportunity for medical students to be observed and to receive feedback on their procedural skills performance is variable in the senior years. To address this problem, we provided our Pre-Intern (PrInt) students with “one-to-one” formative feedback on their ability to perform urethral catheterization (U/C) and hypothesized that their future practice of U/C as interns would benefit. This study sought to evaluate the performance and practice of interns in U/C 4–5 months after having received feedback on their performance of U/C as PrInt students. Methods: Between 2013 and 2014, two cohorts of interns (total n=66) who had received recent formative feedback on their U/C performance as PrInt students at Central Clinical School were invited to complete an anonymous survey. The survey contained nine closed unvalidated questions and one open-ended question, designed to allow interns to report on their current practice of U/C. Results: Forty-one out of 66 interns (62%) completed the survey. Thirty-five out of 41 respondents (85%) reported that the assessment with feedback during their PrInt term was beneficial to their practice. Thirty of 41 (73%) reported being confident to perform U/C independently. Eleven out of 41 respondents (27%) reported that they had received additional training at intern orientation. Nine of the 11 interns (82%) reported that they had a small, but a significant, increase in confidence to perform U/C when compared with the 30 of the 41 respondents (73%) who had not (p=0.03). Conclusion: Our results substantiate our hypothesis that further education by assessment with feedback in U/C during PrInt was of benefit to interns’ performance. Additional educational reinforcement in U/C during intern orientation further improved intern

  19. Language-related differential item functioning between English and German PROMIS Depression items is negligible.

    Science.gov (United States)

    Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

    2017-12-01

    To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (...... German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.

  20. Item selection via Bayesian IRT models.

    Science.gov (United States)

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.

  1. Criteria for eliminating items of a Test of Figural Analogies

    Directory of Open Access Journals (Sweden)

    Diego Blum

    2013-12-01

    Full Text Available This paper describes the steps taken to eliminate two of the items in a Test of Figural Analogies (TFA). The main guidelines of psychometric analysis concerning Classical Test Theory (CTT) and Item Response Theory (IRT) are explained. The item elimination process was based on both the study of the CTT difficulty and discrimination index, and the unidimensionality analysis. The a, b, and c parameters of the Three Parameter Logistic Model of IRT were also considered for this purpose, as well as the assessment of each item fitting this model. The unfavourable characteristics of a group of TFA items are detailed, and decisions leading to their possible elimination are discussed.
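
    For readers unfamiliar with the a, b, and c parameters mentioned in this abstract, the Three Parameter Logistic Model can be sketched as below. This uses the standard textbook form without the optional 1.7 scaling constant, which is an assumption about the paper's parameterization; the parameter values are hypothetical.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: discrimination (slope), b: difficulty (location),
    c: pseudo-guessing (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderate discrimination, average difficulty,
# and a guessing floor of 0.2.
for theta in (-3.0, 0.0, 3.0):
    print(theta, p_correct_3pl(theta, a=1.2, b=0.0, c=0.2))
```

    At theta equal to b the probability is halfway between c and 1, and for very low ability it approaches c. An item with a small a or a large c separates examinees poorly, which is the kind of characteristic that can motivate removing it.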

  2. A directory of computer programs for assessment of radioactive waste disposal in geological formations. Volume 2

    International Nuclear Information System (INIS)

    Ashton, J.; Broyd, T.W.; Jones, M.A.; Knowles, N.C.; Liew, S.K.; Mawbey, C.S.; Read, D.; Smith, S.L.

    1993-01-01

    This directory describes computer programs suitable for the assessment of radioactive waste disposal facilities in geological formations. The programs, which are mainly applicable to the post-closure analysis of the repository, address combinations of the following topics: nuclide inventory, corrosion, leaching, geochemistry, geomechanics, heat transfer, groundwater flow, radionuclide migration, biosphere modelling, safety assessment and site evolution. A total of 320 programs are identified, of which 84 are reviewed in detail, 192 in summary and 44 in tabular fashion. Originally published in 1983, the directory was updated in 1985 with the addition of new programs and the revision of some of the existing program reviews. This directory was completely rewritten in 1991 with the addition of more new programs and a full revision of all the existing program reviews, some of which have been deleted as they are no longer in general use. Although the directory is specific to the post-closure assessment of a repository site, some of the programs described can also be used in other areas of repository analysis (e.g. repository design). This directory is composed of two volumes; the present volume is the second

  3. A directory of computer programs for assessment of radioactive waste disposal in geological formations. Volume 1

    International Nuclear Information System (INIS)

    Ashton, J.; Broyd, T.W.; Jones, M.A.; Knowles, N.C.; Liew, S.K.; Mawbey, C.S.; Read, D.; Smith, S.L.

    1993-01-01

    This directory describes computer programs suitable for the assessment of radioactive waste disposal facilities in geological formations. The programs, which are mainly applicable to the post-closure analysis of the repository, address combinations of the following topics: nuclide inventory, corrosion, leaching, geochemistry, geomechanics, heat transfer, groundwater flow, radionuclide migration, biosphere modelling, safety assessment and site evolution. A total of 320 programs are identified, of which 84 are reviewed in detail, 192 in summary and 44 in tabular fashion. Originally published in 1983, the directory was updated in 1985 with the addition of new programs and the revision of some of the existing program reviews. This directory was completely rewritten in 1991 with the addition of more new programs and a full revision of all the existing program reviews, some of which have been deleted as they are no longer in general use. Although the directory is specific to the post-closure assessment of a repository site, some of the programs described can also be used in other areas of repository analysis (e.g. repository design). This directory is composed of two volumes; the present volume is the first

  4. Performance assessment of geological isolation systems for medium and alpha waste disposal in granitic formations

    International Nuclear Information System (INIS)

    Lewi, J.; Brun-Yaba, C.; Cernes, A.

    1990-01-01

    PACOMA (Performance Assessment of Confinement for Medium and Alpha Waste) is a coordinated project of the Commission of the European Communities with the participation of the Member States. This project is intended to evaluate the suitability of clay, granite and salt formations for the disposal of conditioned alpha and medium-level radioactive waste. In this report, CEA-IPSN presents the database and the results of evaluating the radiological consequences associated with the disposal of alpha-bearing waste in a deep granite formation. Two repository concepts and three sites have been examined (Auriat, a hypothetical site in the UK and Barfleur), which are identical to those considered in the PAGIS project. The methodology adopted for the PAGIS project has been used for carrying out the deterministic calculations of radiological consequences in the case of normal evolution scenarios and in altered evolutions, as well as for sensitivity analysis of results to the calculation parameters and for uncertainty studies. The calculated individual doses in the case of normal evolution show a first peak due to I-129, Se-79 and Tc-99 after some hundreds of thousands of years, followed by a maximum that is reached only after several million years. In all cases, these maxima are far lower (by a factor of at least 1000) than the limit recommended by the ICRP

  5. Formation of stress/strain cycles for analytical assessment of fatigue crack initiation and growth

    International Nuclear Information System (INIS)

    Tashkinov, A.V.

    2005-01-01

    This paper discusses standard techniques for setting up cycles of stresses, strains and stress intensity factors (SIF) for use in analysing the fatigue characteristics of crack-free components or the fatigue crack growth if crack-like flaws are present. A number of improved techniques are proposed. An enhanced procedure for analytical description of true metal stress-strain curves, covering plastic effects, is presented. This procedure involves standard physical and mechanical properties of the metal in question, such as ultimate stress, yield stress and elasticity modulus. It is emphasized that the currently practiced rain-flow method of design cycle formation, which is effective for an actual (truly known) cyclic loading history, is not suitable for a projected (anticipated) history, as it leaves out of account possible variations in the sequence of operating conditions. Improved techniques for establishing design stress/strain and SIF cycles are described, which make allowance for the most unfavourable sequence of events in the projected loading history. The paper points to a basic difference in the methods of design cycle formation, employed in assessment of the current condition of a component (with the actual history accounted for) and in estimation of the residual lifetime or life extension (for a projected history). (authors)

  6. Assessment of the potential for karst in the Rustler Formation at the WIPP site

    International Nuclear Information System (INIS)

    Lorenz, John Clay

    2006-01-01

    This report is an independent assessment of the potential for karst dissolution in evaporitic strata of the Rustler Formation at the Waste Isolation Pilot Plant (WIPP) site. Review of the available data suggests that the Rustler strata thicken and thin across the area in depositional patterns related to lateral variations in sedimentary accommodation space and normal facies changes. Most of the evidence that has been offered for the presence of karst in the subsurface has been used out of context, and the different pieces are not mutually supporting. Outside of Nash Draw, definitive evidence for the development of karst in the Rustler Formation near the WIPP site is limited to the horizon of the Magenta Member in drillhole WIPP-33. Most of the other evidence cited by the proponents of karst is more easily interpreted as primary sedimentary structures and the localized dissolution of evaporitic strata adjacent to the Magenta and Culebra water-bearing units. Some of the cited evidence is invalid, an inherited baggage from studies made prior to the widespread knowledge of modern evaporite depositional environments and prior to the existence of definitive exposures of the Rustler Formation in the WIPP shafts. Some of the evidence is spurious, has been taken out of context, or is misquoted. Lateral lithologic variations from halite to mudstone within the Rustler Formation under the WIPP site have been taken as evidence for the dissolution of halite such as that seen in Nash Draw, but are more rationally explained as sedimentary facies changes. Extrapolation of the known karst features in Nash Draw eastward to the WIPP site, where conditions are and have been significantly different for half a million years, is unwarranted. 
    The volumes of insoluble material that would remain after dissolution of halite would be significantly less than the observed bed thicknesses, thus dissolution is an unlikely explanation for the lateral variations from halite to mudstone and siltstone

  7. Assessment of the potential for karst in the Rustler Formation at the WIPP site.

    Energy Technology Data Exchange (ETDEWEB)

    Lorenz, John Clay

    2006-01-01

    This report is an independent assessment of the potential for karst dissolution in evaporitic strata of the Rustler Formation at the Waste Isolation Pilot Plant (WIPP) site. Review of the available data suggests that the Rustler strata thicken and thin across the area in depositional patterns related to lateral variations in sedimentary accommodation space and normal facies changes. Most of the evidence that has been offered for the presence of karst in the subsurface has been used out of context, and the different pieces are not mutually supporting. Outside of Nash Draw, definitive evidence for the development of karst in the Rustler Formation near the WIPP site is limited to the horizon of the Magenta Member in drillhole WIPP-33. Most of the other evidence cited by the proponents of karst is more easily interpreted as primary sedimentary structures and the localized dissolution of evaporitic strata adjacent to the Magenta and Culebra water-bearing units. Some of the cited evidence is invalid, an inherited baggage from studies made prior to the widespread knowledge of modern evaporite depositional environments and prior to the existence of definitive exposures of the Rustler Formation in the WIPP shafts. Some of the evidence is spurious, has been taken out of context, or is misquoted. Lateral lithologic variations from halite to mudstone within the Rustler Formation under the WIPP site have been taken as evidence for the dissolution of halite such as that seen in Nash Draw, but are more rationally explained as sedimentary facies changes. Extrapolation of the known karst features in Nash Draw eastward to the WIPP site, where conditions are and have been significantly different for half a million years, is unwarranted. 
    The volumes of insoluble material that would remain after dissolution of halite would be significantly less than the observed bed thicknesses, thus dissolution is an unlikely explanation for the lateral variations from halite to mudstone and siltstone

  8. Evaluation of radiological safety assessment of a repository in a clay rock formation

    International Nuclear Information System (INIS)

    1999-01-01

    This report presents a comprehensive description of the post-closure radiological safety assessment of a repository, excavated in a clay host rock formation, for the spent fuel arisings resulting from the Spanish nuclear program. In this report three scenarios have been analysed in detail. The first scenario represents the normal evolution of the repository (Reference Scenario) and includes a set of variants to investigate the relative importance of the various repository components and examine the sensitivity of the performance to parameter variations. Two altered scenarios have also been considered: deep well construction and poor sealing of the repository. This document contains a detailed description of the repository system, the methodology adopted for scenario generation, the process modelling approach and the results of the consequence analysis. (Author)

  9. Model of practical skill performance as an instrument for supervision and formative assessment

    DEFF Research Database (Denmark)

    Nielsen, Carsten; Sommer, Irene; Larsen, Karin

    2012-01-01

    There are still weaknesses in the practical skills of newly graduated nurses. There is also an escalating pressure on existing clinical placements due to increasing student numbers and structural changes in health services. Innovative educational practices and the use of tools that might support learning are sparsely researched in the field of clinical education for nursing students. This paper reports on an action research study that promoted and investigated use of The Model of Practical Skill Performance as a learning tool during nursing students’ clinical placement. Clinical supervisors... as during practice, performance and formative assessment of practical skills learning. It provided a common language about practical skills and enhanced the participants’ understanding of professionalism in practical nursing skill. In conclusion, the model helped to highlight the complexity in mastering...

  10. Left ventricular thrombus formation after acute myocardial infarction as assessed by cardiovascular magnetic resonance imaging

    International Nuclear Information System (INIS)

    Delewi, Ronak; Nijveldt, Robin; Hirsch, Alexander; Marcu, Constantin B.; Robbers, Lourens; Hassell, Marriela E.C.J.; Bruin, Rianne H.A. de; Vleugels, Jim; Laan, Anja M. van der; Bouma, Berto J.; Tio, René A.; Tijssen, Jan G.P.; Rossum, Albert C. van; Zijlstra, Felix; Piek, Jan J.

    2012-01-01

    Introduction: Left ventricular (LV) thrombus formation is a feared complication of myocardial infarction (MI). We assessed the prevalence of LV thrombus in ST-segment elevated MI patients treated with percutaneous coronary intervention (PCI) and compared the diagnostic accuracy of transthoracic echocardiography (TTE) to cardiovascular magnetic resonance imaging (CMR). Also, we evaluated the course of LV thrombi in the modern era of primary PCI. Methods: 200 patients with primary PCI underwent TTE and CMR, at baseline and at 4 months follow-up. Studies were analyzed by two blinded examiners. Patients were seen at 1, 4, 12, and 24 months for assessment of clinical status and adverse events. Results: On CMR at baseline, a thrombus was found in 17 of 194 (8.8%) patients. LV thrombus resolution occurred in 15 patients. Two patients had persistence of LV thrombus on follow-up CMR. On CMR at four months, a thrombus was found in an additional 12 patients. In multivariate analysis, thrombus formation on baseline CMR was independently associated with baseline infarct size (g) (B = 0.02, SE = 0.02, p < 0.001). Routine TTE had a sensitivity of 21–24% and a specificity of 95–98% compared to CMR for the detection of LV thrombi. Intra- and interobserver variation for detection of LV thrombus were lower for CMR (κ = 0.91 and κ = 0.96) compared to TTE (κ = 0.74 and κ = 0.53). Conclusion: LV thrombus still occurs in a substantial number of patients after PCI-treated MI, especially with larger infarct sizes. Routine TTE had a low sensitivity for the detection of LV thrombi and the interobserver variation of TTE was large.
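    The sensitivity and specificity figures quoted in this abstract are ratios taken from a 2x2 table of test results against a reference standard (here, TTE against CMR). The sketch below shows the arithmetic only. The counts are hypothetical round numbers chosen for illustration and are not the study's data; the function names are likewise assumptions.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of reference-positive cases that the test also flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of reference-negative cases that the test also clears."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study:
    # 20 reference-positive cases, of which the test detects 5;
    # 80 reference-negative cases, of which the test clears 76.
    tp, fn = 5, 15
    tn, fp = 76, 4
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"specificity = {specificity(tn, fp):.2f}")
```

    A test can combine low sensitivity with high specificity, as in the abstract: it rarely raises a false alarm, but it misses most true cases.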

  11. Left ventricular thrombus formation after acute myocardial infarction as assessed by cardiovascular magnetic resonance imaging

    Energy Technology Data Exchange (ETDEWEB)

    Delewi, Ronak [Department of Cardiology, Academic Medical Center, University of Amsterdam, Amsterdam (Netherlands); Interuniversity Cardiology Institute of the Netherlands (Netherlands); Nijveldt, Robin [Department of Cardiology, VU University Medical Center, Amsterdam (Netherlands); Hirsch, Alexander [Department of Cardiology, Academic Medical Center, University of Amsterdam, Amsterdam (Netherlands); Marcu, Constantin B.; Robbers, Lourens [Department of Cardiology, VU University Medical Center, Amsterdam (Netherlands); Hassell, Marriela E.C.J.; Bruin, Rianne H.A. de; Vleugels, Jim; Laan, Anja M. van der; Bouma, Berto J. [Department of Cardiology, Academic Medical Center, University of Amsterdam, Amsterdam (Netherlands); Tio, René A. [Thorax Center, University Medical Center Groningen, Groningen (Netherlands); Tijssen, Jan G.P. [Department of Cardiology, Academic Medical Center, University of Amsterdam, Amsterdam (Netherlands); Rossum, Albert C. van [Department of Cardiology, VU University Medical Center, Amsterdam (Netherlands); Zijlstra, Felix [Thorax Center, Department of Cardiology, Erasmus University Medical Center, Rotterdam (Netherlands); Piek, Jan J., E-mail: j.j.piek@amc.uva.nl [Department of Cardiology, Academic Medical Center, University of Amsterdam, Amsterdam (Netherlands)

    2012-12-15

    Introduction: Left ventricular (LV) thrombus formation is a feared complication of myocardial infarction (MI). We assessed the prevalence of LV thrombus in ST-segment elevated MI patients treated with percutaneous coronary intervention (PCI) and compared the diagnostic accuracy of transthoracic echocardiography (TTE) to cardiovascular magnetic resonance imaging (CMR). Also, we evaluated the course of LV thrombi in the modern era of primary PCI. Methods: 200 patients with primary PCI underwent TTE and CMR, at baseline and at 4 months follow-up. Studies were analyzed by two blinded examiners. Patients were seen at 1, 4, 12, and 24 months for assessment of clinical status and adverse events. Results: On CMR at baseline, a thrombus was found in 17 of 194 (8.8%) patients. LV thrombus resolution occurred in 15 patients. Two patients had persistence of LV thrombus on follow-up CMR. On CMR at four months, a thrombus was found in an additional 12 patients. In multivariate analysis, thrombus formation on baseline CMR was independently associated with baseline infarct size (g) (B = 0.02, SE = 0.02, p < 0.001). Routine TTE had a sensitivity of 21–24% and a specificity of 95–98% compared to CMR for the detection of LV thrombi. Intra- and interobserver variation for detection of LV thrombus were lower for CMR (κ = 0.91 and κ = 0.96) compared to TTE (κ = 0.74 and κ = 0.53). Conclusion: LV thrombus still occurs in a substantial number of patients after PCI-treated MI, especially with larger infarct sizes. Routine TTE had a low sensitivity for the detection of LV thrombi and the interobserver variation of TTE was large.

  12. Exploring pre-service science teachers' pedagogical capacity for formative assessment through analyses of student answers

    Science.gov (United States)

    Aydeniz, Mehmet; Dogan, Alev

    2016-05-01

    Background: There has been an increasing emphasis in recent years on empowering pre-service and in-service science teachers to attend to student reasoning and use formative assessments to guide student learning. Purpose: The purpose of this study was to explore pre-service science teachers' pedagogical capacity for formative assessment. Sample: This study took place in Turkey. The participants include 53 pre-service science teachers in their final year of schooling. All but two of the participants are female. Design and methods: We used a mixed-methods methodology in pursuing this inquiry. Participants analyzed 28 responses to seven two-tiered questions given by four students of different ability levels. We explored their ability to identify the strengths and weaknesses in students' answers. We paid particular attention to the things that the pre-service science teachers noticed in students' explanations, the types of inferences they made about students' conceptual understanding, and the affordances of pedagogical decisions they made. Results: The results show that the majority of participants made an evaluative judgment (i.e. the answer is correct or incorrect) in their analyses of students' answers. Similarly, the majority of the participants recognized the type of mistake that the students made. However, they failed to successfully elaborate on fallacies, limitations, or strengths in student reasoning. We also asked the participants to make pedagogical decisions related to what needs to be done next in order to help the students to achieve academic objectives. Results show that 8% of the recommended instructional strategies were of no affordance, 64% of low affordance, and 28% of high affordance in terms of helping students achieve the academic objectives. Conclusion: If our goal is to improve pre-service science teachers' noticing skills, and the affordance of feedback that they provide, engaging them in activities that ask them to attend to students' ideas

  13. What We Don't Test: What an Analysis of Unreleased ACS Exam Items Reveals about Content Coverage in General Chemistry Assessments

    Science.gov (United States)

    Reed, Jessica J.; Villafañe, Sachel M.; Raker, Jeffrey R.; Holme, Thomas A.; Murphy, Kristen L.

    2017-01-01

    General chemistry courses are often the foundation for the study of other science disciplines and upper-level chemistry concepts. Students who take introductory chemistry courses are more often from health and science-related fields than chemistry. As such, the content taught and assessed in general chemistry courses is envisioned as building…

  14. Formative Assessment: Using Concept Cartoon, Pupils' Drawings, and Group Discussions to Tackle Children's Ideas about Biological Inheritance

    Science.gov (United States)

    Chin, Christine; Teou, Lay-Yen

    2010-01-01

    This study was carried out in the context of formative assessment where assessment and learning were integrated to enhance both teaching and learning. The purpose of the study was to: (a) identify pupils' ideas about biological inheritance through the use of a concept cartoon, pupils' drawings and talk, and (b) devise scaffolding structures that…

  15. Elevating Learner Achievement Using Formative Electronic Lab Assessments in the Engineering Laboratory: A Viable Alternative to Weekly Lab Reports

    Science.gov (United States)

    Chen, Baiyun; DeMara, Ronald F.; Salehi, Soheil; Hartshorne, Richard

    2018-01-01

    A laboratory pedagogy interweaving weekly student portfolios with onsite formative electronic laboratory assessments (ELAs) is developed and assessed within the laboratory component of a required core course of the electrical and computer engineering (ECE) undergraduate curriculum. The approach acts to promote student outcomes, and neutralize…

  16. Overview of the OGAP Formative Assessment Project and CPRE's Large-Scale Experimental Study of Implementation and Impacts

    Science.gov (United States)

    Supovitz, Jonathan

    2016-01-01

    In the presentation discussed in this brief abstracted report, the author describes an ongoing partnership with the Philadelphia School District (PSD) to implement and research the Ongoing Assessment Project (OGAP). OGAP is a systematic, intentional and iterative formative assessment system grounded in the research on how students learn…

  17. Item response theory at subject- and group-level

    NARCIS (Netherlands)

    Tobi, Hilde

    1990-01-01

    This paper reviews the literature about item response models for the subject level and aggregated level (group level). Group-level item response models (IRMs) are used in the United States in large-scale assessment programs such as the National Assessment of Educational Progress and the California

  18. Random Item Generation Is Affected by Age

    Science.gov (United States)

    Multani, Namita; Rudzicz, Frank; Wong, Wing Yiu Stephanie; Namasivayam, Aravind Kumar; van Lieshout, Pascal

    2016-01-01

    Purpose: Random item generation (RIG) involves central executive functioning. Measuring aspects of random sequences can therefore provide a simple method to complement other tools for cognitive assessment. We examine the extent to which RIG relates to specific measures of cognitive function, and whether those measures can be estimated using RIG…

  19. Negative affect impairs associative memory but not item memory.

    Science.gov (United States)

    Bisby, James A; Burgess, Neil

    2013-12-17

    The formation of associations between items and their context has been proposed to rely on mechanisms distinct from those supporting memory for a single item. Although emotional experiences can profoundly affect memory, our understanding of how it interacts with different aspects of memory remains unclear. We performed three experiments to examine the effects of emotion on memory for items and their associations. By presenting neutral and negative items with background contexts, Experiment 1 demonstrated that item memory was facilitated by emotional affect, whereas memory for an associated context was reduced. In Experiment 2, arousal was manipulated independently of the memoranda, by a threat of shock, whereby encoding trials occurred under conditions of threat or safety. Memory for context was equally impaired by the presence of negative affect, whether induced by threat of shock or a negative item, relative to retrieval of the context of a neutral item in safety. In Experiment 3, participants were presented with neutral and negative items as paired associates, including all combinations of neutral and negative items. The results showed both above effects: compared to a neutral item, memory for the associate of a negative item (a second item here, context in Experiments 1 and 2) is impaired, whereas retrieval of the item itself is enhanced. Our findings suggest that negative affect impairs associative memory while recognition of a negative item is enhanced. They support dual-processing models in which negative affect or stress impairs hippocampal-dependent associative memory while the storage of negative sensory/perceptual representations is spared or even strengthened.

  20. A directory of computer programs for assessment of radioactive waste disposal in geological formations

    International Nuclear Information System (INIS)

    Broyd, T.W.; Dean, R.B.; Hobbs, G.D.; Knowles, N.C.; Putney, J.M.; Wrigley, J.

    1984-01-01

    This Directory describes computer programs suitable for the assessment of radioactive waste disposal facilities in geological formations. The programs, which are mainly applicable to the post-closure analysis of the repository, address combinations of the following topics: nuclide inventory, corrosion, leaching, geochemistry, stress analysis, heat transfer, groundwater flow and radionuclide transport. Biosphere modelling, surface water flow and risk analysis are not covered. A total of 248 programs are identified, of which 50 are reviewed in detail, 134 in summary and 64 in tabular fashion. The directory has been compiled using a combination of literature searches, telephone and postal correspondence and meetings with recognised experts in the respective areas of work covered. It differs from previous reviews of computer programs for similar topic areas in two main respects. Firstly, the method of obtaining information has resulted in program descriptions of considerable breadth and detail. Secondly, the Directory has concentrated wherever possible on European codes, whereas most previous work of this nature has looked solely at programs developed in North America. The reviews are presented in good faith, but it has not been possible to run any of the programs on a computer, and so truly objective comparisons may not be made. Finally, although the Directory is specific to the post-closure assessment of a repository site, some of the programs described could also be used in other areas of repository analysis (e.g. repository design)

  1. Editorial Changes and Item Performance: Implications for Calibration and Pretesting

    Directory of Open Access Journals (Sweden)

    Heather Stoffel

    2014-11-01

    Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that any change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items that make up a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, with each item being administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately with nearly identical results. These findings have implications for the conventional practice of repretesting (or recalibrating) items that have been subjected to minor editorial changes.

  2. Wind effects on long-span bridges: Probabilistic wind data format for buffeting and VIV load assessments

    Science.gov (United States)

    Hoffmann, K.; Srouji, R. G.; Hansen, S. O.

    2017-12-01

    The technology development within the structural design of long-span bridges in Norwegian fjords has created a need for reformulating the calculation format and the physical quantities used to describe the properties of wind and the associated wind-induced effects on bridge decks. Parts of a new probabilistic format describing the incoming, undisturbed wind are presented. It is expected that a fixed probabilistic format will facilitate a more physically consistent and precise description of the wind conditions, which in turn increases the accuracy and considerably reduces uncertainties in wind load assessments. Because the format is probabilistic, a quantification of the level of safety and uncertainty in predicted wind loads is readily accessible. A simple buffeting response calculation demonstrates the use of probabilistic wind data in the assessment of wind loads and responses. Furthermore, vortex-induced fatigue damage is discussed in relation to probabilistic wind turbulence data and response measurements from wind tunnel tests.

  3. Problems with the factor analysis of items: Solutions based on item response theory and item parcelling

    Directory of Open Access Journals (Sweden)

    Gideon P. De Bruin

    2004-10-01

    The factor analysis of items often produces spurious results in the sense that unidimensional scales appear multidimensional. This may be ascribed to failure in meeting the assumptions of linearity and normality on which factor analysis is based. Item response theory is explicitly designed for the modelling of the non-linear relations between ordinal variables and provides a strong alternative to the factor analysis of items. Items may also be combined in parcels that are more likely to satisfy the assumptions of factor analysis than do the items. The use of the Rasch rating scale model and the factor analysis of parcels is illustrated with data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through the factor analysis of items. It is shown that the Rasch rating scale model and the factoring of parcels produce superior results to the factor analysis of items. Recommendations for the analysis of scales are made. Summary: The factor analysis of items often yields misleading results, particularly in that unidimensional scales appear multidimensional. These results can often be ascribed to a failure to meet the assumptions of linearity and normality on which factor analysis rests. Item response theory, which is explicitly designed for modelling the non-linear relations between ordinal items, offers an attractive alternative to the factor analysis of items. Items can also be grouped into parcels that are more likely to satisfy the assumptions of factor analysis than individual items. The use of the Rasch rating scale model and the factor analysis of parcels is demonstrated using data obtained with the Locus of Control Inventory. The results of these analyses are compared with the results obtained through a factor analysis of the individual items. The results indicate that the Rasch

  4. Assessment of the Potential Role of Streptomyces in Cave Moonmilk Formation

    Directory of Open Access Journals (Sweden)

    Marta Maciejewska

    2017-06-01

    Full Text Available Moonmilk is a karstic speleothem mainly composed of fine calcium carbonate crystals (CaCO3) with different textures ranging from pasty to hard, in which the contribution of biotic rock-building processes is presumed to involve indigenous microorganisms. The real microbial input in the genesis of moonmilk is difficult to assess, leading to controversial hypotheses explaining the origins and the mechanisms (biotic vs. abiotic) involved. In this work, we undertook a comprehensive approach in order to assess the potential role of filamentous bacteria, particularly a collection of moonmilk-originating Streptomyces, in the genesis of this speleothem. Scanning electron microscopy (SEM) confirmed that indigenous filamentous bacteria could indeed participate in moonmilk development by serving as nucleation sites for CaCO3 deposition. The metabolic activities involved in CaCO3 transformation were furthermore assessed in vitro among the collection of moonmilk Streptomyces, which revealed that peptide/amino acid ammonification and, to a lesser extent, ureolysis could be privileged metabolic pathways participating in carbonate precipitation by increasing the pH of the bacterial environment. Additionally, an in silico search for the genes involved in biomineralization processes, including ureolysis, dissimilatory nitrate reduction to ammonia, active calcium ion transport, and reversible hydration of CO2, allowed us to identify genetic predispositions for carbonate precipitation in Streptomyces. Finally, their biomineralization abilities were confirmed by environmental SEM, which allowed us to visualize the formation of abundant mineral deposits under laboratory conditions. Overall, our study provides novel evidence that filamentous Actinobacteria could be key protagonists in the genesis of moonmilk through a wide spectrum of biomineralization processes.

  5. Assessing the Formation of Experience-Based Gender Expectations in an Implicit Learning Scenario

    Directory of Open Access Journals (Sweden)

    Anton Öttl

    2017-09-01

    Full Text Available The present study investigates the formation of new word-referent associations in an implicit learning scenario, using a gender-coded artificial language with spoken words and visual referents. Previous research has shown that when participants are explicitly instructed about the gender-coding system underlying an artificial lexicon, they monitor the frequency of exposure to male vs. female referents within this lexicon, and subsequently use this probabilistic information to predict the gender of an upcoming referent. In an explicit learning scenario, the auditory and visual gender cues are necessarily highlighted prior to acquisition, and the effects previously observed may therefore depend on participants' overt awareness of these cues. To assess whether the formation of experience-based expectations is dependent on explicit awareness of the underlying coding system, we present data from an experiment in which gender-coding was acquired implicitly, thereby reducing the likelihood that visual and auditory gender cues are used strategically during acquisition. Results show that even if the gender coding system was not perfectly mastered (as reflected in the number of gender coding errors), participants develop frequency-based expectations comparable to those previously observed in an explicit learning scenario. In line with previous findings, participants are quicker at recognizing a referent whose gender is consistent with an induced expectation than one whose gender is inconsistent with an induced expectation. At the same time, however, eyetracking data suggest that these expectations may surface earlier in an implicit learning scenario. These findings suggest that experience-based expectations are robust against manner of acquisition, and contribute to understanding why similar expectations observed in the activation of stereotypes during the processing of natural language stimuli are difficult or impossible to suppress.

  6. Validation of the 24-item recovery assessment scale-revised (RAS-R) in the Norwegian language and context: a multi-centre study.

    Science.gov (United States)

    Biringer, Eva; Tjoflåt, Marit

    2018-01-25

    The Recovery Assessment Scale-revised (RAS-R) is a self-report instrument measuring mental health recovery. The purpose of the present study was to translate and adapt the RAS-R into the Norwegian language and to investigate its psychometric properties in terms of factor structure, convergent and discriminant validity and reliability in the Norwegian context. The present study is a cross-sectional multi-centre study. After a pilot test, the Norwegian version of the RAS-R was distributed to 231 service users in mental health specialist and community services. The factor structure of the instrument was investigated by a confirmatory factor analysis (CFA), and internal consistency was assessed by Cronbach's alpha. The RAS-R was found to be acceptable and feasible for service users. The original five-factor structure was confirmed. All model fit indices, including the standardised root mean square residual (SRMR), which is independent of the χ²-test, met the criteria for an acceptable model fit. Internal consistencies within sub-scales as measured by Cronbach's alpha ranged from 0.65 to 0.85. Cronbach's alpha for the total scale was 0.90. As expected, some redundancy between factors existed (in particular among the factors Personal confidence and hope, Goal and success orientation and Not dominated by symptoms). The Norwegian RAS-R showed acceptable psychometric properties in terms of convergent validity and reliability, and fit indices from the CFA confirmed the original factor structure. We recommend the Norwegian RAS-R as a tool in service users' and health professionals' collaborative work towards the service users' recovery goals and as an outcome measure in larger evaluations.
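
    Illustrative note: the abstract reports internal consistency as Cronbach's alpha. The following is a minimal sketch of the standard alpha formula, k/(k-1) * (1 - sum of item variances / variance of total score). The function name and the response matrix are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale (not study data)
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

    Alpha approaches 1 when items rise and fall together across respondents; sub-scale values such as those reported above would be obtained by passing only the columns belonging to one sub-scale.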

  7. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    Science.gov (United States)

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  8. Interns reflect: the effect of formative assessment with feedback during pre-internship.

    Science.gov (United States)

    McKenzie, Susan; Burgess, Annette; Mellis, Craig

    2017-01-01

    It is widely known that the opportunity for medical students to be observed and to receive feedback on their procedural skills performance is variable in the senior years. To address this problem, we provided our Pre-Intern (PrInt) students with "one-to-one" formative feedback on their ability to perform urethral catheterization (U/C) and hypothesized that their future practice of U/C as interns would benefit. This study sought to evaluate the performance and practice of interns in U/C 4-5 months after having received feedback on their performance of U/C as PrInt students. Between 2013 and 2014, two cohorts of interns (total n=66) who had received recent formative feedback on their U/C performance as PrInt students at Central Clinical School were invited to complete an anonymous survey. The survey contained nine closed unvalidated questions and one open-ended question, designed to allow interns to report on their current practice of U/C. Forty-one out of 66 interns (62%) completed the survey. Thirty-five out of 41 respondents (85%) reported that the assessment with feedback during their PrInt term was beneficial to their practice. Thirty of 41 (73%) reported being confident to perform U/C independently. Eleven out of 41 respondents (27%) reported that they had received additional training at intern orientation. Nine of the 11 interns (82%) reported that they had a small but significant increase in confidence to perform U/C when compared with the 30 of the 41 respondents (73%) who had not (p = 0.03). Our results substantiate our hypothesis that further education by assessment with feedback in U/C during PrInt was of benefit to interns' performance. Additional educational reinforcement in U/C during intern orientation further improved intern confidence. Our results indicate that extra pre- and post-graduation procedural skills training, with feedback, should be universal.

  9. Assessment, measurement and correlation of (vapour + liquid) equilibrium of (carbon dioxide + butyl, isobutyl, and amyl formate) systems

    International Nuclear Information System (INIS)

    Shen, Yanshu; Zheng, Danxing; Li, Xinru; Li, Yun

    2013-01-01

    Highlights: • Selected three formates that have relatively good absorption performance for CO2. • Measured the VLE data of CO2 + butyl, isobutyl, and amyl formate systems. • Correlated the VLE data by using PR EOS with two mixing rules and SRK EOS with one mixing rule. • Concluded amyl formate has potential research value as a CO2 physical absorbent. -- Abstract: In this work, three formates (butyl, isobutyl, and amyl formate) were identified as having relatively good CO2 absorption performance, using the excess Gibbs function as the thermodynamic criterion. An online static-analytical method was used to measure the (vapour + liquid) equilibrium (VLE) data for CO2 + butyl, isobutyl, and amyl formates at pressures of (0.2 to 6) MPa and temperatures of (283.15 to 343.15) K. The VLE data were then correlated by the Peng–Robinson (PR) equation of state (EOS) with the classic mixing rule, the PR EOS with the Wong–Sandler (WS) mixing rule, and the Soave–Redlich–Kwong (SRK) EOS with the classic mixing rule. It is shown that the SRK EOS is comparatively appropriate for the CO2 + butyl formate binary system. Both the PR EOS with the classic mixing rule and the SRK EOS can be used to correlate the binary systems of CO2 + isobutyl formate and CO2 + amyl formate. The solubility of CO2 in the three formates, from high to low, follows the order CO2 + amyl formate > CO2 + butyl formate > CO2 + isobutyl formate, showing that the CO2 + amyl formate system has the best absorption performance. A comparison indicates that formates have a greater solubility for CO2 than acetates at the same temperature and pressure. In addition, the thermophysical properties, mole absorption and mass absorption amounts of several industrial absorbents were assessed, and the absorption performance of amyl formate for CO2 is better than that of other physical absorbents. Thus, the study concluded that amyl formate has potential research value as a physical absorbent for CO2 capture.
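
    Illustrative note: the abstract names the Peng–Robinson EOS with the classic (van der Waals one-fluid) mixing rule. The sketch below shows the textbook form of those equations only; it is not the authors' correlation. The CO2 constants are approximate literature values, and the "solvent" constants and the binary interaction parameter k_ij are hypothetical placeholders, not fitted values from the paper.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def pr_pure(T, Tc, Pc, omega):
    """Peng-Robinson a(T) [Pa m^6/mol^2] and b [m^3/mol] for a pure component."""
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(T / Tc))) ** 2
    a = 0.45724 * R**2 * Tc**2 / Pc * alpha
    b = 0.07780 * R * Tc / Pc
    return a, b


def classic_mixing(x, a, b, kij):
    """Classic mixing rule: quadratic in a (with k_ij), linear in b."""
    n = len(x)
    a_mix = sum(
        x[i] * x[j] * math.sqrt(a[i] * a[j]) * (1.0 - kij[i][j])
        for i in range(n)
        for j in range(n)
    )
    b_mix = sum(x[i] * b[i] for i in range(n))
    return a_mix, b_mix


def pr_pressure(T, v, a, b):
    """Pressure [Pa] from the Peng-Robinson EOS at molar volume v [m^3/mol]."""
    return R * T / (v - b) - a / (v * (v + b) + b * (v - b))


T = 313.15  # K, inside the temperature range quoted in the abstract
# CO2: approximate literature critical constants
a_co2, b_co2 = pr_pure(T, Tc=304.13, Pc=7.3773e6, omega=0.224)
# Hypothetical solvent constants (placeholders, NOT measured formate data)
a_sol, b_sol = pr_pure(T, Tc=560.0, Pc=3.5e6, omega=0.40)
kij = [[0.0, 0.05], [0.05, 0.0]]  # hypothetical interaction parameter
a_mix, b_mix = classic_mixing([0.3, 0.7], [a_co2, a_sol], [b_co2, b_sol], kij)
```

    In a correlation study of this kind, k_ij would typically be adjusted until calculated bubble-point pressures match the measured VLE data; that fitting loop is omitted here.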

  10. Determination of the caffeine contents of various food items within the Austrian market and validation of a caffeine assessment tool (CAT).

    Science.gov (United States)

    Rudolph, E; Färbinger, A; König, J

    2012-01-01

    The caffeine contents of 124 products, including coffee, coffee-based beverages, energy drinks, tea, colas, yoghurt and chocolate, were determined using RP-HPLC with UV detection after solid-phase extraction. The highest concentrations of caffeine were found for coffee prepared from pads (755 mg l⁻¹) and regular filtered coffee (659 mg l⁻¹). The total caffeine content of coffee and chocolate-based beverages was between 15 mg l⁻¹ in chocolate milk and 448 mg l⁻¹ in canned ice coffee. For energy drinks the caffeine content ranged from 266 to 340 mg l⁻¹. Caffeine concentrations in tea and ice teas were between 13 and 183 mg l⁻¹. Coffee-flavoured yoghurts ranged from 33 to 48 mg kg⁻¹. The caffeine concentration in chocolate and chocolate bars was between 17 mg kg⁻¹ in whole milk chocolate and 551 mg kg⁻¹ in a chocolate with coffee filling. A caffeine assessment tool was developed and validated by a 3-day dietary record (r² = 0.817, p < 0.01) using these analytical data and caffeine saliva concentrations (r² = 0.427, p < 0.01).

  11. Harmonisation of food consumption data format for dietary exposure assessments of chemicals analysed in raw agricultural commodities

    DEFF Research Database (Denmark)

    Boon, Polly E.; Ruprich, Jiri; Petersen, Annette

    2009-01-01

    In this paper, we present an approach to format national food consumption data at raw agricultural commodity (RAC) level. In this way, the data is both formatted in a harmonised way given the comparability of RACs between countries, and suitable to assess the dietary exposure to chemicals analysed......, and the use of the FAO/WHO Codex Classification system of Foods and Animal Feeds to harmonise the classification. We demonstrate that this approach works well for pesticides and glycoalkaloids, and is an essential step forward in the harmonisation of risk assessment procedures within Europe when addressing...... chemicals analysed in RACs by all national food control systems....

  12. Assessment of potential radionuclide transport in site-specific geologic formations

    International Nuclear Information System (INIS)

    Dosch, R.G.

    1980-08-01

    Associated with the development of deep, geologic repositories for nuclear waste isolation is a need for safety assessments of the potential for nuclide migration. Frequently used in estimating migration rates is a parameter generally known as a distribution coefficient, K_d, which describes the distribution of a radionuclide between a solid (rock) and a liquid (groundwater) phase. This report is intended to emphasize that the use of K_d must be coupled with a knowledge of the geology and release scenarios applicable to a repository. Selected K_d values involving rock samples from groundwater/brine simulants typical of two potential repository sites, WIPP and NTS, are used to illustrate this concern. Experimental parameters used in K_d measurements, including nuclide concentration, site sampling/rock composition, and liquid-to-solid ratios, are discussed. The solubility of U(VI) in WIPP brine/groundwater was addressed in order to assess the potential contribution of this phenomenon to K_d values. Understanding mechanisms of sorption of radionuclides on rocks would lead to a better predictive capability. Sorption is attributed to the presence of trace constituents (often unidentified) in rocks. An attempt was made to determine if this applied to WIPP dolomite rocks by comparing sorption behavior of the natural material with that of a synthetic dolomite prepared in the laboratory with reagent grade chemicals. The results were inconclusive. The results of a study of Tc sorption by an argillite sample from the Calico Hills formation at NTS under ambient laboratory conditions were more conclusive. The Tc sorption was found to be associated with elemental carbon. Available evidence points to a reduction mechanism leading to the apparent sorption of Tc on the solid phase.
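
    Illustrative note: the abstract discusses the distribution coefficient K_d and the liquid-to-solid ratio used when measuring it. The sketch below uses the common batch-experiment definition (amount sorbed per gram of solid divided by the concentration left in solution) and the usual porous-medium retardation factor R = 1 + (bulk density / porosity) * K_d. Both are standard textbook forms assumed here, not necessarily the report's own procedure, and all numbers are hypothetical.

```python
def batch_kd(c_initial, c_final, volume_ml, mass_g):
    """K_d in mL/g from a batch sorption test.

    c_initial, c_final: nuclide concentration in the liquid before and after
    contact with the solid (any consistent unit per mL).
    """
    if c_final <= 0 or mass_g <= 0:
        raise ValueError("c_final and mass_g must be positive")
    sorbed_per_gram = (c_initial - c_final) * volume_ml / mass_g
    return sorbed_per_gram / c_final


def retardation_factor(kd_ml_per_g, bulk_density_g_per_cm3, porosity):
    """Ratio of groundwater velocity to nuclide velocity (linear sorption)."""
    return 1.0 + bulk_density_g_per_cm3 * kd_ml_per_g / porosity


# Hypothetical experiment: 80 % of the nuclide is sorbed by 1 g of rock in 20 mL
kd = batch_kd(c_initial=100.0, c_final=20.0, volume_ml=20.0, mass_g=1.0)
# Hypothetical rock properties
rf = retardation_factor(kd, bulk_density_g_per_cm3=2.0, porosity=0.1)
print(kd, rf)
```

    Because the liquid-to-solid ratio (volume_ml / mass_g) enters the formula directly, the same fraction sorbed gives a different K_d at a different ratio, which is one reason such experimental parameters need to be reported alongside the value.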

  13. Melodie: a code for risk assessment of waste repositories in deep geological formations

    International Nuclear Information System (INIS)

    Lewi, J.; Mejon-Goula, M.J.; Cernes, A.

    1988-10-01

    In order to perform the safety evaluation of nuclear waste repositories, a global model called MELODIE is currently being developed at the CEA/IPSN, in collaboration with other CEA teams and non-CEA teams such as ENSMP (Ecole Nationale Superieure des Mines de Paris). The version now in operation makes it possible to assess the radiological consequences of a repository located in a granitic formation over a period of several hundred thousand years. The calculations are based on models which represent the physical and chemical phenomena in connection with: the release of the radionuclides from the waste matrixes and through the engineered barriers; their transfer through the geosphere; their behaviour in the biosphere. Three separate models have been developed for each of these subjects; they are integrated in the code through a modular, flexible data-processing structure which calls these computational modules with their optimal time step and extracts the data from the data files where they are stored. In addition, a sensitivity and uncertainty analysis algorithm has been implemented in the code. It makes it possible to evaluate the influence of the parameter values on the result and to assess the global uncertainty on it. After a quite general description of MELODIE, the calculations performed with it in the PAGIS (CCE) exercise are presented: global dose calculations and ranking of the most important parameters through the sensitivity analysis. The studies performed only with the geosphere module of MELODIE (METIS), especially the participation in the HYDROCOIN (OECD/NEA) exercise, are also noted. In addition, the main future development axes of MELODIE are outlined.

  14. Student Perceptions of Online Homework Use for Formative Assessment of Learning in Organic Chemistry.

    Science.gov (United States)

    Richards-Babb, Michelle; Curtis, Reagan; Georgieva, Zomitsa; Penn, John H

    2015-11-10

    Use of online homework as a formative assessment tool for organic chemistry coursework was examined. Student perceptions of online homework in terms of (i) its ranking relative to other course aspects, (ii) their learning of organic chemistry, and (iii) whether it improved their study habits and how students used it as a learning tool were investigated. Our students perceived the online homework as one of the more useful course aspects for learning organic chemistry content. We found a moderate and statistically significant correlation between online homework performance and final grade. Gender as a variable was ruled out since significant gender differences in overall attitude toward online homework use and course success rates were not found. Our students expressed relatively positive attitudes toward use of online homework with a majority indicating improved study habits (e.g., study in a more consistent manner). Our students used a variety of resources to remediate incorrect responses (e.g., class materials, general online materials, and help from others). However, 39% of our students admitted to guessing at times, instead of working to remediate incorrect responses. In large enrollment organic chemistry courses, online homework may act to bridge the student-instructor gap by providing students with a supportive mechanism for regulated learning of content.

  15. A Monte Carlo risk assessment model for acrylamide formation in French fries.

    Science.gov (United States)

    Cummins, Enda; Butler, Francis; Gormley, Ronan; Brunton, Nigel

    2009-10-01

    The objective of this study is to estimate the likely human exposure to the Group 2A carcinogen, acrylamide, from French fries by Irish consumers by developing a quantitative risk assessment model using Monte Carlo simulation techniques. Various stages in the French-fry-making process were modeled from initial potato harvest, storage, and processing procedures. The model was developed in Microsoft Excel with the @Risk add-on package. The model was run for 10,000 iterations using Latin hypercube sampling. The simulated mean acrylamide level in French fries was calculated to be 317 µg/kg. It was found that females are exposed to smaller levels of acrylamide than males (mean exposure of 0.20 µg/kg bw/day and 0.27 µg/kg bw/day, respectively). Although the carcinogenic potency of acrylamide is not well known, the simulated probability of exceeding the average chronic human dietary intake of 1 µg/kg bw/day (as suggested by WHO) was 0.054 and 0.029 for males and females, respectively. A sensitivity analysis highlighted the importance of the selection of appropriate cultivars with known low reducing sugar levels for French fry production. Strict control of cooking conditions (correlation coefficient of 0.42 and 0.35 for frying time and temperature, respectively) and blanching procedures (correlation coefficient -0.25) were also found to be important in ensuring minimal acrylamide formation.
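
    Illustrative note: the abstract describes a Monte Carlo model run for 10,000 iterations with Latin hypercube sampling. The sketch below shows only the sampling technique in plain Python; the study itself used Excel with @Risk. The two input distributions, the body weight and the variable names are hypothetical placeholders chosen for illustration, not the model's actual inputs, so the outputs are not comparable with the reported results.

```python
import random
from statistics import NormalDist, mean


def latin_hypercube(n_samples, n_dims, rng):
    """n_samples points in (0, 1)^n_dims with one point per stratum per dimension."""
    eps = 1e-12  # keep values strictly inside (0, 1) for inv_cdf
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata
        col = [
            min(max((k + rng.random()) / n_samples, eps), 1.0 - eps)
            for k in range(n_samples)
        ]
        rng.shuffle(col)  # random pairing of strata across dimensions
        columns.append(col)
    return list(zip(*columns))


rng = random.Random(0)
N = 10_000
# Hypothetical input distributions (placeholders, not taken from the study)
concentration = NormalDist(mu=300.0, sigma=100.0)  # acrylamide in fries, µg/kg
consumption = NormalDist(mu=0.05, sigma=0.02)      # fries eaten, kg/day
BODY_WEIGHT = 70.0                                 # kg, hypothetical

exposures = []
for u_conc, u_cons in latin_hypercube(N, 2, rng):
    conc = max(concentration.inv_cdf(u_conc), 0.0)  # truncate at zero
    cons = max(consumption.inv_cdf(u_cons), 0.0)
    exposures.append(conc * cons / BODY_WEIGHT)     # µg/kg bw/day

mean_exposure = mean(exposures)
p_exceed = sum(e > 1.0 for e in exposures) / N      # fraction above 1 µg/kg bw/day
```

    Compared with simple random sampling, the stratification guarantees that every 1/N slice of each input distribution is sampled exactly once, which usually stabilises the estimated mean with fewer iterations.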

  16. Eliciting, Identifying, Interpreting, and Responding to Students' Ideas: Teacher Candidates' Growth in Formative Assessment Practices

    Science.gov (United States)

    Gotwals, Amelia Wenk; Birmingham, Daniel

    2016-06-01

    With the goal of helping teacher candidates become well-started beginners, it is important that methods courses in teacher education programs focus on high-leverage practices. Responsive teaching practices, specifically eliciting, identifying, interpreting, and responding to students' science ideas (i.e., formative assessment), can be used to support all students in learning science successfully. This study follows seven secondary science teacher candidates in a yearlong practice-based methods course. Course assignments (i.e., plans for and reflections on teaching) as well as teaching videos were analyzed using a recursive qualitative approach. In this paper, we present themes and patterns in teacher candidates' abilities to elicit, identify, interpret, and respond to students' ideas. Specifically, we found that those teacher candidates who grew in the ways in which they elicited students' ideas from fall to spring were also those who were able to adopt a more balanced reflection approach (considering both teacher and student moves). However, we found that even the teacher candidates who grew in these practices did not move toward seeing students' ideas as nuanced; rather, they saw students' ideas in a dichotomous fashion: right or wrong. We discuss implications for teacher preparation, specifically for how to promote productive reflection and tools for better understanding students' ideas.

  17. Characterization of Disability in Canadians with Mental Disorders Using an Abbreviated Version of a DSM-5 Emerging Measure: The 12-Item WHO Disability Assessment Schedule (WHODAS) 2.0.

    Science.gov (United States)

    Sjonnesen, Kirsten; Bulloch, Andrew G M; Williams, Jeanne; Lavorato, Dina; B Patten, Scott

    2016-04-01

    The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) is a disability scale included in Section 3 of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) as a possible replacement for the Global Assessment of Functioning Scale (GAF). To assist Canadian psychiatrists with interpretation of the scale, we have conducted a descriptive analysis using data from the 2012 Canadian Community Health Survey-Mental Health component (CCHS-MH). The 2012 CCHS-MH was a cross-sectional survey of the Canadian community (n = 23,757). The survey included an abbreviated 12-item version of the WHODAS 2.0. Mental disorder diagnoses were assessed for schizophrenia, other psychosis, major depressive episode (MDE), generalized anxiety disorder (GAD), bipolar I disorder, substance abuse/dependence, and alcohol abuse/dependence. Mean scores ranged from 14.2 (95% CI, 14.1 to 14.3) for the overall community population to 23.1 (95% CI, 19.5 to 26.7) for those with schizophrenia, with higher scores indicating greater disability. Furthermore, the difference in scores between those with lifetime and past-month episodes suggests that the scale is sensitive to changes occurring during the course of these disorders; for example, scores varied from 23.6 (95% CI, 22.2 to 25.1) for past-month MDE to 14.4 (95% CI, 14.2 to 14.7) in the lifetime MDE group without a past-year episode. This analysis suggests that the WHODAS 2.0 may be a suitable replacement for the GAF. As a disability measure, even though it is not a mental health-specific instrument, the 12-item WHODAS 2.0 appears to be sensitive to the impact of mental disorders and to changes over the time course of a mental disorder. However, the clinical utility of this measure requires additional assessment. © The Author(s) 2016.

  18. Performance assessment of geological isolation systems for radioactive waste. Disposal in salt formations

    International Nuclear Information System (INIS)

    Storck, R.; Aschenbach, J.; Hirsekom, R.P.; Nies, A.; Stelte, N.

    1988-01-01

    In the framework of the PAGIS project of the CEC Research Programme on radioactive waste, a performance assessment of a repository of vitrified HLW in rock salt formations has been carried out. The first volume of the study is split into four tasks. Task 1 recalls the main steps that have led to the selection of the reference and the variant site. Task 2 condenses all information available on the rock formations which are planned to host the repository, the overlying geosphere and the geohistoric development of the sites. Task 3 states the technical details of repository planning, while in Task 4 conceivable release scenarios are discussed. Volume II (Tasks 5 to 10) is concerned with the modelling procedures. In Task 5 data for the waste inventory are collected and the selection of relevant nuclides for transport calculations is discussed. Task 6 gives the near-field modelling, i.e. the models for corrosion of the waste canisters, the degradation of the waste matrix and the models used for the HLW boreholes. Task 7 deals with the modelling of the repository. Its division into sections is discussed and models for physical and chemical effects taken into account in each section are presented. In Task 8 the modelling of the overburden is given. In Task 9 additional models for the subrosion scenario and a human intrusion scenario are given. Task 10 is concerned with the biosphere modelling. In Volume III results of deterministic and probabilistic calculations are presented. Task 11 gives the results for deterministic calculations with best estimate values for the parameters involved in the models. Task 12 presents the results of the uncertainty analysis, and Task 13 those of local and global sensitivity analyses followed by concluding remarks. This document is one of a set of 5 reports covering a relevant project of the European Community on a nuclear safety subject having very wide interest. The five volumes are: the summary (EUR 11775-EN), the clay (EUR 11776-EN), the

  19. Selecting Items for Criterion-Referenced Tests.

    Science.gov (United States)

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)
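
    Illustrative note: the first of the three methods named in the abstract relies on classical item difficulty and item-test correlation. The sketch below computes both from a 0/1 score matrix using their usual classical definitions (proportion correct, and the Pearson correlation between item score and total score). The function names and the response data are hypothetical.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Classical difficulty: proportion of test-takers who answered correctly."""
    return mean(row[item] for row in responses)


def item_test_correlation(responses, item):
    """Pearson correlation between a 0/1 item score and the total test score.

    Undefined (division by zero) if everyone gets the same item or total score.
    """
    x = [row[item] for row in responses]
    y = [sum(row) for row in responses]
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))


# Hypothetical scores: 6 test-takers x 4 items, 1 = correct (not real data)
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
for i in range(4):
    p = item_difficulty(responses, i)
    r = item_test_correlation(responses, i)
    print(f"item {i}: difficulty={p:.2f}, item-test r={r:.2f}")
```

    A selection rule built on these indices might keep items whose difficulty lies near the mastery cut-off and whose item-test correlation is high; the actual thresholds are a design decision not specified in the abstract.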

  20. Identifying Stages in a Learning Hierarchy for Use in Formative Assessment--The Example of Line Graphs

    Science.gov (United States)

    Stacey, Kaye; Price, Beth; Steinle, Vicki

    2012-01-01

    This paper discusses issues arising in the design of questions to use in an on-line computer-based formative assessment system, focussing on how best to identify the stages of a learning hierarchy for reporting to teachers. Data from several hundred students is used to illustrate how design decisions have been made for a test on interpreting line…

  1. Assessment of undiscovered oil and gas resources in the Spraberry Formation of the Midland Basin, Permian Basin Province, Texas, 2017

    Science.gov (United States)

    Marra, Kristen R.; Gaswirth, Stephanie B.; Schenk, Christopher J.; Leathers-Miller, Heidi M.; Klett, Timothy R.; Mercier, Tracey J.; Le, Phuong A.; Tennyson, Marilyn E.; Finn, Thomas M.; Hawkins, Sarah J.; Brownfield, Michael E.

    2017-05-15

    Using a geology-based assessment methodology, the U.S. Geological Survey estimated mean resources of 4.2 billion barrels of oil and 3.1 trillion cubic feet of gas in the Spraberry Formation of the Midland Basin, Permian Basin Province, Texas.

  2. Technology-Enhanced Formative Assessment: A Research-Based Pedagogy for Teaching Science with Classroom Response Technology

    Science.gov (United States)

    Beatty, Ian D.; Gerace, William J.

    2009-01-01

    "Classroom response systems" (CRSs) are a promising instructional technology, but most literature on CRS use fails to distinguish between technology and pedagogy, to define and justify a pedagogical perspective, or to discriminate between pedagogies. "Technology-enhanced formative assessment" (TEFA) is our pedagogy for CRS-based science…

  3. The Common Core State Standards: An Opportunity to Enhance Formative Assessment in History/Social Studies Classrooms

    Science.gov (United States)

    Ateh, Comfort M.; Wyngowski, Aaron J.

    2015-01-01

    This article discusses the opportunity that the Common Core State Standards (CCSS) present for enhancing formative assessment (FA) in history and social studies classrooms. There is evidence that FA can enhance learning for students if implemented well. Unfortunately, teachers continue to be challenged in implementing FA in their classrooms. We…

  4. Formative assessment in the development of an obesity prevention component for the Expanded Food and Nutrition Education Program in Texas

    Science.gov (United States)

    This study conducted formative research (surveys, focus groups) to assess the nutrition education needs of clients in the Texas Expanded Food and Nutrition Education Program prior to curriculum revision. Current participants in the Expanded Food and Nutrition Education Program from 3 Texas cities (...

  5. Formative Assessment and Elementary School Student Academic Achievement: A Review of the Evidence. REL 2017-259

    Science.gov (United States)

    Klute, Mary; Apthorp, Helen; Harlacher, Jason; Reale, Marianne

    2017-01-01

    Formative assessment is a process that engages teachers and students in gathering, interpreting, and using evidence about what and how students are learning in order to facilitate further student learning during a short period of time. The process offers the potential to guide educator decisions about midstream adjustments to instruction that…

  6. The Effects of Formative Assessment on Academic Achievement, Attitudes toward the Lesson, and Self-Regulation Skills

    Science.gov (United States)

    Ozan, Ceyhun; Kincal, Remzi Y.

    2018-01-01

    The purpose of this research is to examine the effects of formative assessment practices on students' academic achievement, attitudes toward lessons, and self-regulation skills in the fifth-grade social studies class. Mixed method research was used to conduct the study. The research group consisted of 45 students in the fifth grade of a secondary…

  7. How Many Formative Assessment Angels Can Dance on the Head of a Meta-Analytic Pin: 0.2

    Science.gov (United States)

    Kingston, Neal; Nash, Brooke

    2012-01-01

    In their critique of Kingston and Nash (2011), Briggs, Ruiz-Primo, Furtak, Shepard, and Yin (2012) make several major points. First, Kingston and Nash's conclusions about the state of research on the efficacy of formative assessment are similar to other researchers, "including some of the authors." Second, their research may be unique in that they…

  8. Examining Data Driven Decision Making via Formative Assessment: A Confluence of Technology, Data Interpretation Heuristics and Curricular Policy

    Science.gov (United States)

    Swan, Gerry; Mazur, Joan

    2011-01-01

    Although the term data-driven decision making (DDDM) is relatively new (Moss, 2007), the underlying concept of DDDM is not. For example, the practices of formative assessment and computer-managed instruction have historically involved the use of student performance data to guide what happens next in the instructional sequence (Morrison, Kemp, &…

  9. Semiparametric Item Response Functions in the Context of Guessing

    Science.gov (United States)

    Falk, Carl F.; Cai, Li

    2016-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…
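    As a rough illustration of the model family named in this abstract, the sketch below contrasts the standard three-parameter logistic (3PL) item response function with a variant whose linear predictor is replaced by a polynomial in the latent trait. The function names, the cubic example, and all coefficient values are assumptions made for illustration. The abstract does not give the authors' parameterization, and monotonicity is assumed here (by choosing suitable coefficients) rather than enforced.

```python
import math


def three_pl(theta, a, b, c):
    """Standard 3PL: lower asymptote c plus a scaled logistic in a * (theta - b)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def poly_logistic(theta, coeffs, c):
    """Logistic of a polynomial m(theta) with lower asymptote c.

    coeffs holds hypothetical polynomial coefficients in increasing power,
    so coeffs = [-a * b, a] reproduces the 3PL above.
    The caller is responsible for choosing a monotonic polynomial.
    """
    m = sum(k * theta ** i for i, k in enumerate(coeffs))
    return c + (1.0 - c) / (1.0 + math.exp(-m))


if __name__ == "__main__":
    # m(theta) = theta + 0.2 * theta**3 has derivative 1 + 0.6 * theta**2 > 0,
    # so it is monotonically increasing (illustrative values only).
    cubic = [0.0, 1.0, 0.0, 0.2]
    for theta in (-2.0, 0.0, 2.0):
        print(theta, three_pl(theta, 1.0, 0.0, 0.2), poly_logistic(theta, cubic, 0.2))
```

    With a first-degree polynomial the second function reduces to the 3PL, which is one way to see the polynomial version as a more flexible extension of it.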

  10. Assessment of activated porous granules on implant fixation and early bone formation in sheep

    Directory of Open Access Journals (Sweden)

    Ming Ding

    2016-04-01

    Conclusion: Despite good bone formation and implant fixation in all groups, bioreactor-activated graft material did not convincingly induce early implant fixation similar to allograft, and neither bioreactor activation nor the addition of BMA provided additional benefit for bone formation in this model.

  11. Spent fuel performance assessment (SPA) for a hypothetical repository in crystalline formations in Germany

    International Nuclear Information System (INIS)

    Luehrmann, L.; Noseck, U.; Storck, R.

    2000-07-01

    Within the framework of this project a first long-term safety assessment study for a generic German repository with spent nuclear fuel in granite host formations has been performed. Conceptual models have been developed and implemented into the numerical codes. These models describe the relevant processes in the near and far field of the repository. For the nuclide mobilisation and the diffusion-controlled transport through the bentonite barrier the computer code GRAPOS has been developed, and for far-field transport the code CHETMAD. Transport in the far field has been assumed to take place in a fracture network. As a retardation mechanism, matrix diffusion accompanied by linear equilibrium sorption on the rock matrix is considered. Both codes have been tested by intercomparison with codes of other countries. The dose rates have been calculated by the code EXCON considering the transport pathways into the biosphere. A reference scenario has been defined. It considers instantaneous saturation of the bentonite immediately after the operational phase of the repository, failure of all containers after 1000 years, diffusion through the bentonite, and transport through fractured dykes, which represent a fast transport pathway in the low permeability region of the granite. The nuclide mobilisation has been calculated according to a common source term which has been developed by all participants of the SPA project. It is assumed that 25% of the containers are connected to the considered transport pathway in the far field. The nuclides are transported to layers close to the surface. The contaminated water is pumped from a surface well and used for drinking, irrigation, cattle feed and fish ponds.

  12. Dilemmas of reform: An exploration of science teachers' collective sensemaking of formative assessment practices

    Science.gov (United States)

    Heredia, Sara Catherine

    Current reform efforts in science education call for significant shifts in how science is taught and learned. Teachers are important gatekeepers for reform, as they must enact these changes with students in their own classrooms. As such, professional development approaches need to be developed and studied to understand how teachers interpret and make instructional plans to implement these reforms. However, traditional approaches to studying implementation of reforms often draw on metrics such as time allotted to new activities, rather than exploring the ways in which teachers make sense of these reforms. In this dissertation I draw upon a body of work called sensemaking that has focused on locating learning in teachers' conversations in departmental work groups. I developed a conceptual and analytic framework to analyze how teachers make sense of reform given their local contexts and then used this framework to perform a case study of one group of teachers that participated in a larger professional development project that examined the impact of a learning progression on science teachers' formative assessment practices. I draw upon videotapes of three years of monthly professional development meetings as my primary source of data, and used an ethnographic approach to identify dilemmas surfaced by teachers, sources of ambiguity and uncertainty, and patterns of and resources for teacher sensemaking. The case study reveals relationships between the type of dilemma surfaced by the teachers and different patterns of sensemaking for modification of teaching practices. When teachers expressed concerns about district or administrative requirements, they aligned their work in the professional development to those external forces. In contrast, teachers were able to develop and try out new practices when they perceived coherence between the professional development and school or district initiatives. These results underscore the importance of coherence between various

  13. Assessment of scale formation and corrosion of drinking water supplies in Ilam city (Iran)

    Directory of Open Access Journals (Sweden)

    Zabihollah Yousefi

    2016-05-01

    Background: Scaling and corrosion are the two most important indexes in water quality evaluation. Pollutants are released in water due to corrosion of pipelines. The aim of this study is to assess the scale formation and corrosion of drinking water supplies in Ilam city (Iran). Methods: This research is a descriptive and cross-sectional study based on the 20 drinking water sources in Ilam city. Experiments were carried out in accordance with the Water and Wastewater Co. standard methods for water and wastewater experiments. The data were analyzed using Microsoft Excel and GraphPad Prism 5. The results were compared with national and international standards. Results: The mean and standard deviation (SD) values of the Ryznar, Langelier, Aggressive, Puckorius and Larson-Skold indices in 2009 were 7.833 (±0.28), -0.102 (±0.35), 11.88 (±0.34), 7.481 (±0.22) and 0.801 (±0.44), respectively, and in 2013 were 7.861 (±0.28), -0.175 (±0.34), 11.84 (±0.37), 7.298 (±0.32) and 0.633 (±0.47), respectively. The averages of the Langelier, Ryznar, Aggressive, and Puckorius indices indicate that potable water resources in Ilam city tend to be corrosive. Statistical analysis and figures were produced with GraphPad Prism version 5.04. Conclusion: The results of the different indices revealed that the water supplies of Ilam city were corrosive. Water quality control and replacement of distribution pipes during development of the water network should be carried out. Moreover, water pipelines should be protected with several modes of corrosion inhibition.
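    For readers unfamiliar with two of the indices named in this abstract, the sketch below uses their commonly cited textbook definitions: the Langelier saturation index as pH minus the saturation pH, and the Ryznar stability index as twice the saturation pH minus pH. The function names and the input values are illustrative assumptions, not data or methods taken from the study, and the sign-based reading of the Langelier index is a rule of thumb rather than the authors' classification scheme.

```python
def langelier_index(ph, ph_s):
    """Langelier saturation index: measured pH minus saturation pH (pHs)."""
    return ph - ph_s


def ryznar_index(ph, ph_s):
    """Ryznar stability index: twice the saturation pH minus the measured pH."""
    return 2.0 * ph_s - ph


def langelier_tendency(lsi):
    """Rule-of-thumb reading of the Langelier index sign (an assumption here)."""
    if lsi < 0:
        return "corrosive tendency"
    if lsi > 0:
        return "scale-forming tendency"
    return "balanced"


if __name__ == "__main__":
    # Hypothetical sample values chosen only for illustration.
    ph, ph_s = 7.63, 7.73
    lsi = langelier_index(ph, ph_s)
    rsi = ryznar_index(ph, ph_s)
    print(round(lsi, 2), round(rsi, 2), langelier_tendency(lsi))
```

    Under these definitions a slightly negative Langelier value goes together with a Ryznar value somewhat above 7, which is the general pattern of the means reported in the abstract.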

  14. Assessment and characterization of biofilm formation among human isolates of Streptococcus dysgalactiae subsp. equisimilis.

    Science.gov (United States)

    Genteluci, Gabrielle Limeira; Silva, Ligia Guedes; Souza, Maria Clara; Glatthardt, Thaís; de Mattos, Marcos Corrêa; Ejzemberg, Regina; Alviano, Celuta Sales; Figueiredo, Agnes Marie Sá; Ferreira-Carvalho, Bernadete Teixeira

    2015-12-01

    The capacity to form biofilm is considered a protective mechanism that allows the bacteria to survive and proliferate in hostile environments, facilitating the maintenance of the infectious process. Recently, biofilm has become a topic of interest in the study of the human pathogen group A Streptococcus (GAS). Although GAS has not been associated with infection on medical implants, the presence of microcolonies embedded in an extracellular matrix on infected tissues has been reported. Despite the similarity between GAS and Streptococcus dysgalactiae subspecies equisimilis (SDSE), there are no studies in the literature describing the production of biofilm by SDSE. In this work, we assessed and characterized biofilm development among SDSE human isolates of group C. The in vitro data showed that 59.3% of the 118 isolates tested were able to form acid-induced biofilm on glass, and 28% formed it on polystyrene surfaces. More importantly, biofilm was also formed in a foreign body model in mice. The biofilm structure was analyzed by confocal laser scanning microscopy, transmission electron microscopy, and scanning electron microscopy. Long fibrillar-like structures were observed by scanning electron microscopy. Additionally, the expression of a pilus-associated gene of SDSE was increased in in vitro sessile cells compared with planktonic cells, and in sessile cells collected from biofilms formed in the animal model compared with those from the in vitro model. Results obtained from immunofluorescence microscopy indicated that the biofilm was immunogenic. Our data also suggested a role for proteins, exopolysaccharide and extracellular DNA in the formation and accumulation of biofilm by SDSE.

  15. A systematic review of concept mapping-based formative assessment processes in primary and secondary science education

    DEFF Research Database (Denmark)

    Hartmeyer, Rikke; Stevenson, Matt P.; Bentsen, Peter

    2017-01-01

    In this paper, we present and discuss the results of a systematic review of concept mapping-based interventions in primary and secondary science education. We identified the following recommendations for science educators on how to successfully apply concept mapping as a method for formative assessment: firstly, concept mapping should be constructed in teaching, preferably on repeated occasions. Secondly, concept mapping should be carried out individually if personal understanding is to be elicited; however, collaborative concept mapping might foster discussions valuable for developing students’ understanding and for activating them as instructional resources and owners of their own learning. Thirdly, low-directed mapping seems most suitable for formative assessment. Fourthly, technology-based or peer assessments are useful strategies likely to reduce the load of interpretation for the educator.

  16. Evaluation of the Fecal Incontinence Quality of Life Scale (FIQL) using item response theory reveals limitations and suggests revisions.

    Science.gov (United States)

    Peterson, Alexander C; Sutherland, Jason M; Liu, Guiping; Crump, R Trafford; Karimuddin, Ahmer A

    2018-06-01

    The Fecal Incontinence Quality of Life Scale (FIQL) is a commonly used patient-reported outcome measure for fecal incontinence, often used in clinical trials, yet has not been validated in English since its initial development. This study uses modern methods to thoroughly evaluate the psychometric characteristics of the FIQL and its potential for differential functioning by gender. This study analyzed prospectively collected patient-reported outcome data from a sample of patients prior to colorectal surgery. Patients were recruited from 14 general and colorectal surgeons in Vancouver Coastal Health hospitals in Vancouver, Canada. Confirmatory factor analysis was used to assess construct validity. Item response theory was used to evaluate test reliability, describe item-level characteristics, identify local item dependence, and test for differential functioning by gender. 236 patients were included for analysis, with mean age 58 and approximately half female. Factor analysis failed to identify the lifestyle, coping, depression, and embarrassment domains, suggesting lack of construct validity. Items demonstrated low difficulty, indicating that the test has the highest reliability among individuals who have low quality of life. Five items are suggested for removal or replacement. Differential test functioning was minimal. This study has identified specific improvements that can be made to each domain of the Fecal Incontinence Quality of Life Scale and to the instrument overall. Formatting, scoring, and instructions may be simplified, and items with higher difficulty developed. The lifestyle domain can be used as is. The embarrassment domain should be significantly revised before use.

  17. Formative and summative assessment of science in English primary schools: evidence from the Primary Science Quality Mark

    Science.gov (United States)

    Earle, Sarah

    2014-05-01

    Background: Since the discontinuation of Standard Attainment Tests (SATs) in science at age 11 in England, pupil performance data in science reported to the UK government by each primary school has relied largely on teacher assessment undertaken in the classroom. Purpose: The process by which teachers are making these judgements has been unclear, so this study made use of the extensive Primary Science Quality Mark (PSQM) database to obtain a 'snapshot' (as of March 2013) of the approaches taken by 91 English primary schools to the formative and summative assessment of pupils' learning in science. PSQM is an award scheme for UK primary schools. It requires the science subject leader (co-ordinator) in each school to reflect upon and develop practice over the course of one year, then upload a set of reflections and supporting evidence to the database to support their application. One of the criteria requires the subject leader to explain how science is assessed within the school. Sample: The data set consists of the electronic text in the assessment section of all 91 PSQM primary schools which worked towards the Quality Mark in the year April 2012 to March 2013. Design and methods: Content analysis of a pre-existing qualitative data set. Text in the assessment section of each submission was first coded as describing formative or summative processes, then sub-coded into different strategies used. Results: A wide range of formative and summative approaches were reported, which tended to be described separately, with few links between them. Talk-based strategies are widely used for formative assessment, with some evidence of feedback to pupils. Whilst the use of tests or tracking grids for summative assessment is widespread, few schools rely on one system alone. Enquiry skills and conceptual knowledge were often assessed separately. Conclusions: There is little consistency in the approaches being used by teachers to assess science in English primary schools. Nevertheless

  18. Assessing the origin of unusual organic formations in lava caves from Canary Islands (Spain)

    Science.gov (United States)

    Miller, Ana Z.; de la Rosa, Jose M.; Garcia-Sanchez, Angela M.; Pereira, Manuel F. C.; Jurado, Valme; Fernández, Octavio; Knicker, Heike; Saiz-Jimenez, Cesareo

    2016-04-01

    Lava tubes, like other caves, contain a variety of speleothems formed in the initial stage of lava tube formation or due to leaching and subsequent precipitation of secondary minerals. Primary and secondary mineral formations in lava caves are mainly composed of silicate minerals, although secondary minerals common in limestone caves have also been reported in this type of cave. In addition, unusual colored deposits have been found on the walls and ceilings of lava tubes, some of them of unknown origin and composition. Brown to black-colored mud-like deposits were observed in "Llano de los Caños" Cave, La Palma Island, Canary Islands, Spain. These black deposits coat the wall and ceiling of the lava tube where sub-horizontal fractures occur. FESEM-EDS, X-ray micro-computed tomography and mineralogical analyses were conducted for morphological, 3D microstructural and compositional characterization of these unusual speleothem samples. These techniques revealed that they are mainly composed of amorphous materials, suggesting an organic carbon composition. Hence, analytical pyrolysis (Py-GC/MS), solid-state 13C Nuclear Magnetic Resonance (NMR) and stable isotope analysis were applied to assess the nature and origin of the black deposits. The combination of these analytical tools permits the identification of specific biomarkers (di- and triterpenoids) for tracing the potential sources of the organic compounds in the speleothems. For comparison purposes, samples from the topsoil and overlying vegetation were also analyzed. Chromatograms resulting from the Py-GC/MS showed an abundance of polysaccharides, lipids and terpenoids typically derived from the vegetation of the area (Erica arborea). In addition, levoglucosan, polycyclic aromatic hydrocarbons and N-containing heterocyclic compounds were detected. They probably derived from the leaching of charred vegetation resulting from a wildfire that occurred in the area in 2012. The lack of the typical pattern of odd

  19. Thermodynamic assessment of liquid composition change during solidification and its effect on freckle formation in superalloys

    International Nuclear Information System (INIS)

    Long Zhengdong; Liu Xingbo; Yang Wanhong; Chang, K.-M.; Barbero, Ever

    2004-01-01

    Solidification macrosegregation, i.e. freckle, is of growing concern given the ever-increasing demand for large superalloy ingots. The evaluation of freckle formation is very difficult because of the limited understanding of the freckle formation mechanism and the complex solidification behavior of multi-component superalloys. The macrostructures of typical Nb-bearing and Ti-bearing superalloys in horizontally directional solidification and vacuum arc remelting (VAR) ingots were investigated to clarify the freckle formation mechanism. A thermodynamic approach was proposed to simulate the solidification behavior. The relative Ra numbers, a reliable criterion of freckle formation, were obtained for some alloys based on the results of thermodynamic calculations. This thermodynamic approach was evaluated by comparison with calculations from semi-experimental results. The Ra numbers obtained by the thermodynamic approach are in good agreement with the ingot size capability of industrial melting shops, which is limited mainly by freckle defects.

  20. Current issues in dietary acrylamide: formation, mitigation and risk assessment

    DEFF Research Database (Denmark)

    Pedreschi, F.; Salome Mariotti, M.; Granby, Kit

    2014-01-01

    content of browned food, while still maintaining its attractive organoleptic properties. Reducing sugars such as glucose and fructose are the major contributors to AA in potato-based products. On the other hand, the limiting substrate of AA formation in cereals and coffee is the free amino acid asparagine. For some products the addition of glycine or asparaginase reduces AA formation during baking. Since, for potatoes, the limiting substrate is reducing sugars, increases in sugar content in potatoes during storage then introduce some difficulties and potentially quite large variations in the AA content of the final product. Sugars in potatoes may be reduced by blanching. Levels of AA in different foods show large variations and no general upper limit is easily applicable, since some formation will always occur. Current policy is that practical measures should be taken voluntarily to reduce AA formation...